Error handling with RRD files after conversion to the new format
Some users may encounter errors after converting Round Robin Database (RRD) files to the new format.
LAST TESTED ON CHECKMK 2.4.0p1
Follow this troubleshooting article at your own risk! Make sure you have a complete backup before attempting any of these steps.
We observed this rare behavior only on installations that have been updated or migrated across several versions over the years (e.g. 1.5 → 1.6 → 2.0).
If you are unsure whether this applies to you, or if you need help, do not hesitate to contact us.
Note (as of version 2.4):
Per Werk #17387: "cmk --convert-rrds" is deprecated.
The cmk --convert-rrds command has been moved to a new standalone executable: cmk-convert-rrds.
You can use cmk-convert-rrds ARGS as a direct drop-in replacement for cmk --convert-rrds ARGS.
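If the same procedure has to work on sites before and after version 2.4, the choice between the two commands can be wrapped in a small shell function. The following is only a sketch: the function name convert_rrds is hypothetical, and it relies solely on the statement above that cmk-convert-rrds ARGS is a drop-in replacement for cmk --convert-rrds ARGS.

```shell
# Hypothetical helper: prefer the standalone executable when it is on PATH,
# otherwise fall back to the older "cmk --convert-rrds" form.
convert_rrds() {
    if command -v cmk-convert-rrds >/dev/null 2>&1; then
        # Newer sites: standalone executable, same arguments.
        cmk-convert-rrds "$@"
    else
        # Older sites: the subcommand of cmk.
        cmk --convert-rrds "$@"
    fi
}

# Example call as site user (arguments are passed through unchanged):
# convert_rrds --delete-rrds myhost
```

All arguments are passed through unchanged, so any option accepted after cmk --convert-rrds can be given to the helper.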
Prerequisites
We expect that the following rules are set up in Setup > Services > Service monitoring rules > Configuration of RRD databases of services:
And Setup > Hosts > Host monitoring rules > Configuration of RRD databases of hosts:
Problem
After converting the RRD files to the new format (described in this manual), data may in rare cases still be written to $OMD_ROOT/var/pnp4nagios/perfdata/myhost/.
At the same time, you might see error messages in the cmc.log like:
2021-09-20 09:18:14 [4] [rrdcached thread] [rrdcached at "/omd/sites/mysite/tmp/run/rrdcached.sock"] [log] -1 No such file: /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/Memory_and_pagefile_pagefile_total.rrd
2021-09-20 09:18:14 [4] [rrdcached thread] [rrdcached at "/omd/sites/mysite/tmp/run/rrdcached.sock"] [log] -1 No such file: /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/Memory_and_pagefile_pagefile_avg.rrd
2021-09-20 09:18:14 [4] [rrdcached thread] [rrdcached at "/omd/sites/mysite/tmp/run/rrdcached.sock"] [log] -1 No such file: /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/Power_cpu0_Cores_w.rrd
2021-09-20 09:18:14 [4] [rrdcached thread] [rrdcached at "/omd/sites/mysite/tmp/run/rrdcached.sock"] [log] -1 No such file: /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/Power_cpu0_DRAM_w.rrd
2021-09-20 09:18:14 [4] [rrdcached thread] [rrdcached at "/omd/sites/mysite/tmp/run/rrdcached.sock"] [log] -1 No such file: /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/Power_cpu0_Graphics_w.rrd
2021-09-20 09:18:14 [4] [rrdcached thread] [rrdcached at "/omd/sites/mysite/tmp/run/rrdcached.sock"] [log] -1 No such file: /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/Power_cpu0_Package_w.rrd
You will notice that some RRD files in $OMD_ROOT/var/pnp4nagios/perfdata/myhost/ are still being updated. Most likely only a few hosts are affected, possibly fewer than 10%.
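To get an overview of which hosts are affected, both symptoms can be checked from the shell. The sketch below is a rough aid, not an official tool: the function names are hypothetical, the 10-minute window is an arbitrary default, and the log location $OMD_ROOT/var/log/cmc.log is assumed to be the site default. Because rrdcached buffers writes, file modification times may lag behind, so treat the first check as an indication only.

```shell
# Hypothetical helper: list RRD files below a perfdata directory that were
# modified within the last N minutes (default: 10).
recently_updated_rrds() {
    perfdata_dir="$1"
    minutes="${2:-10}"
    find "$perfdata_dir" -type f -name '*.rrd' -mmin "-$minutes"
}

# Hypothetical helper: extract the host names from "No such file" messages
# in a log file, using the path layout shown in the log lines above.
hosts_with_missing_rrds() {
    logfile="$1"
    grep 'No such file: ' "$logfile" \
        | sed -n 's|.*/var/pnp4nagios/perfdata/\([^/]*\)/[^/]*\.rrd.*|\1|p' \
        | sort -u
}

# Example calls as site user:
# recently_updated_rrds "$OMD_ROOT/var/pnp4nagios/perfdata" 10
# hosts_with_missing_rrds "$OMD_ROOT/var/log/cmc.log"
```

The second function prints each affected host name once, which gives a list to work through in the solution steps.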
Solution
Change to the site user and stop the site:
su - mysite
OMD[mysite]:~$ omd stop
Run the following command for one of the affected hosts:
OMD[mysite]:~$ cmk -vv --convert-rrds --delete-rrds myhost
If messages like the following appear:
HOST_ PNP -> CMC WARNING: XML /opt/omd/sites/mysite/var/pnp4nagios/perfdata/myhost/_HOST_.xml refers to not existing RRD /opt/omd/sites/mysite/var/pnp4nagios/perfdata/myhost/_HOST__rta.rrd. Nothing to convert. Cleanup the XML file manually in case this is OK.
you can delete the XML file named in the message.
Start the site. The RRD files should now be written correctly:
OMD[mysite]:~$ omd start
If this works, you can repeat the procedure for each affected host separately, as shown above, or convert all hosts at once with the following command:
OMD[mysite]:~$ cmk -vv --convert-rrds --delete-rrds
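If you prefer to convert only a known list of affected hosts rather than all hosts, the per-host command can be run in a loop. This is a minimal sketch under a few assumptions: the function name convert_each_host is hypothetical, host names are supplied one per line on standard input, and the site has been stopped as in the steps above. On version 2.4 and later, the cmk --convert-rrds call can be replaced by cmk-convert-rrds as described in the note at the top.

```shell
# Hypothetical helper: run the per-host conversion for every host name read
# from standard input (one name per line, empty lines are skipped).
convert_each_host() {
    failed=0
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        # </dev/null keeps the command from consuming the remaining host list.
        if ! cmk -vv --convert-rrds --delete-rrds "$host" </dev/null; then
            echo "Conversion failed for host: $host" >&2
            failed=1
        fi
    done
    # A non-zero return value signals that at least one host failed.
    return "$failed"
}

# Example call as site user:
# printf '%s\n' myhost otherhost | convert_each_host
```

Running the hosts one at a time makes it easier to see which host produces warnings, at the cost of a longer total run time.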