Debugging corrupted RRD files
This article describes how to resolve errors about Round Robin Database (RRD) files of filesystem services that appear after an upgrade to Checkmk 2.0.
LAST TESTED ON CHECKMK 2.0.0P1
Problem
During an update of Checkmk from 1.6.0 to 2.0.0, you may see these messages about some RRD files for filesystem services:
RRD files for host test1 and service Filesystem / stored in files:
- /opt/omd/sites/mysite/var/check_mk/rrd/test1/Filesystem__.info
- /opt/omd/sites/mysite/var/check_mk/rrd/test1/Filesystem__.rrd
are messed up. Please restore them both from backup.
RRD files for host eg016070 and service Filesystem / stored in files:
- /opt/omd/sites/mysite/var/check_mk/rrd/test2/Filesystem__.info
- /opt/omd/sites/mysite/var/check_mk/rrd/test2/Filesystem__.rrd
are messed up. Please restore them both from backup.
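If the update prints many of these messages, it can help to collect the affected hosts, services and files in one list. The following is a minimal sketch that assumes the messages have exactly the wording shown above; the function name parse_messed_up_rrds is made up for this example and is not part of Checkmk.

```python
import re

# Header line of each message, based on the wording shown above (assumption).
HEADER = re.compile(
    r"RRD files for host (?P<host>\S+) and service (?P<service>.+?) stored in files:"
)
# File paths listed under each header (.info and .rrd files).
PATH = re.compile(r"(/\S+\.(?:info|rrd))")


def parse_messed_up_rrds(log_text):
    """Return one dict per message with host, service and the affected files."""
    entries = []
    current = None
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            current = {
                "host": header["host"],
                "service": header["service"],
                "files": [],
            }
            entries.append(current)
            continue
        path = PATH.search(line)
        if path and current is not None:
            current["files"].append(path.group(1))
    return entries


sample = """\
RRD files for host test1 and service Filesystem / stored in files:
- /opt/omd/sites/mysite/var/check_mk/rrd/test1/Filesystem__.info
- /opt/omd/sites/mysite/var/check_mk/rrd/test1/Filesystem__.rrd
are messed up. Please restore them both from backup.
"""

for entry in parse_messed_up_rrds(sample):
    print(entry["host"], entry["service"], entry["files"])
```

The sketch only reads the text of the messages; it does not touch the RRD files themselves.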
Good news
Checkmk is still working, including the RRD files.
Bad news
In older Checkmk versions and in the Raw Edition, the metric name of a filesystem service was derived from its mount point, e.g., _ for / and _boot for /boot.
Werk #7444 (Rename metric name in Filesystem checks from mount point to fs_used) cleans up this behavior: as of Checkmk 2.0, filesystem services always use fs_used instead of the _ metrics.
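The naming difference can be illustrated with a short sketch. It assumes, as the two examples above suggest, that the old name is the mount point with each slash replaced by an underscore; both function names are hypothetical and only serve the illustration.

```python
def legacy_metric_name(mount_point):
    """Assumed legacy naming: every slash in the mount point becomes an underscore."""
    return mount_point.replace("/", "_")


def current_metric_name(mount_point):
    """As of Checkmk 2.0 the metric is always fs_used, whatever the mount point."""
    return "fs_used"


# Only the two mount points named in the article are used here.
for mount_point in ("/", "/boot"):
    print(
        f"{mount_point!r}: legacy={legacy_metric_name(mount_point)!r} "
        f"current={current_metric_name(mount_point)!r}"
    )
```

Because the old name differs per mount point while the new one is fixed, anything that looks up the metric by name has to know which scheme a given service still uses.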
To check if you're using the old metrics, you can execute this command:
OMD[mysite]:~$ lq "GET services\nColumns: host_name description perf_data\nFilter: host_name ~ mylinuxhost\nFilter: description ~ Filesystem"
mylinuxhost;Filesystem /;fs_used=382353.996094;381992.03125;429741.035156;0;477490.039062 fs_size=477490.039062;;;; fs_used_percent=80.075806;;;; growth=-12877.324568;;;; trend=-33528.269466;;;0;19895.418294 inodes_used=1563877;28001894.4;29557555.2;0;31113216
mylinuxhost;Filesystem /boot;fs_used=429.703125;563.5875;634.035938;0;704.484375 fs_size=704.484375;;;; fs_used_percent=60.995409;;;; growth=0;;;; trend=0;;;0;29.353516 inodes_used=320;42163.2;44505.6;0;46848
mylinuxhost;Filesystem /boot/efi;fs_used=76.792969;408.7875;459.885938;0;510.984375 fs_size=510.984375;;;; fs_used_percent=15.028438;;;; growth=0;;;; trend=0;;;0;21.291016
mylinuxhost;Filesystem /media/mysite/USB;fs_used=269688.625;370310.4625;416599.270313;0;462888.078125 fs_size=462888.078125;;;; fs_used_percent=58.262167;;;; growth=0;;;; trend=0;;;0;19287.003255 inodes_used=5784;27154022.4;28662579.2;0;30171136
mylinuxhost;Filesystem /media/mysite/SDCard;fs_used=24534.425781;384183.89375;432206.880469;0;480229.867188 fs_size=480229.867188;;;; fs_used_percent=5.108892;;;; growth=0;;;; trend=0;;;0;20009.577799 inodes_used=11;28164096;29728768;0;31293440
mylinuxhost;Filesystem /opt/omd/sites/workshop/tmp;fs_used=8.347656;12771.446875;14367.877734;0;15964.308594 fs_size=15964.308594;;;; fs_used_percent=0.052289;;;; growth=-282.095681;;;; trend=0.124696;;;0;665.179525 inodes_used=1533;3678176.7;3882519.85;0;4086863
mylinuxhost;Filesystem /opt/omd/sites/cme2/tmp;
mylinuxhost;Filesystem /opt/omd/sites/mysite/tmp;
mylinuxhost;Filesystem /opt/omd/sites/mysite/tmp;fs_used=6.191406;12771.446875;14367.877734;0;15964.308594 fs_size=15964.308594;;;; fs_used_percent=0.038783;;;; growth=0;;;; trend=-0.020642;;;0;665.179525 inodes_used=1439;3678176.7;3882519.85;0;4086863
mylinuxhost;Filesystem /opt/omd/sites/workshop/tmp;fs_used=7.699219;12771.446875;14367.877734;0;15964.308594 fs_size=15964.308594;;;; fs_used_percent=0.048228;;;; growth=-582.891098;;;; trend=-0.019961;;;0;665.179525 inodes_used=1534;3678176.7;3882519.85;0;4086863
mylinuxhost;Filesystem /media/mysite/USB;fs_used=269688.625;370310.4625;416599.270313;0;462888.078125 fs_size=462888.078125;;;; fs_used_percent=58.262167;;;; growth=0;;;; trend=0;;;0;19287.003255 inodes_used=5784;27154022.4;28662579.2;0;30171136
mylinuxhost;Filesystem /media/mysite/SDCard;fs_used=24534.425781;384183.89375;432206.880469;0;480229.867188 fs_size=480229.867188;;;; fs_used_percent=5.108892;;;; growth=0;;;; trend=0;;;0;20009.577799 inodes_used=11;28164096;29728768;0;31293440
mylinuxhost;Filesystem /boot/efi;fs_used=76.792969;408.7875;459.885938;0;510.984375 fs_size=510.984375;;;; fs_used_percent=15.028438;;;; growth=0;;;; trend=0;;;0;21.291016
mylinuxhost;Filesystem /;fs_used=369116.152344;381992.03125;429741.035156;0;477490.039062 fs_size=477490.039062;;;; fs_used_percent=77.303425;;;; growth=6496.604099;;;; trend=-40417.662717;;;0;19895.418294 inodes_used=1551084;28001894.4;29557555.2;0;31113216
mylinuxhost;Filesystem /opt/omd/sites/cme2/tmp;
mylinuxhost;Filesystem /boot;fs_used=429.703125;563.5875;634.035938;0;704.484375 fs_size=704.484375;;;; fs_used_percent=60.995409;;;; growth=0;;;; trend=0;;;0;29.353516 inodes_used=320;42163.2;44505.6;0;46848
In this example, every filesystem service that reports performance data uses fs_used. If your filesystem services do not use fs_used in Checkmk 2.0, please open a support case, and we will provide you with the commands to fix this.
The metrics should have been converted by the upgrade to Checkmk 2.0 at the latest.
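With many hosts, reading the lq output by eye gets tedious. The following minimal sketch sorts the output lines into services that report fs_used, services that report other metrics only, and services without performance data. It assumes the three-column output format shown above (host_name;description;perf_data); the function names and the last sample line are hypothetical and made up for illustration.

```python
def metric_names(perf_data):
    """Extract metric names from a string like 'fs_used=1;2;3;0;4 fs_size=5;;;;'.

    Simplification: metric names containing spaces (quoted in perf data)
    are not handled by this whitespace split.
    """
    return [token.split("=", 1)[0] for token in perf_data.split() if "=" in token]


def classify(lq_output):
    """Group (host, service) pairs by whether their perf data contains fs_used."""
    result = {"fs_used": [], "other": [], "no_perf_data": []}
    for line in lq_output.splitlines():
        if not line.strip():
            continue
        # Split only twice: perf_data itself contains semicolons.
        host, description, perf_data = (line.split(";", 2) + ["", ""])[:3]
        names = metric_names(perf_data)
        if not names:
            key = "no_perf_data"
        elif "fs_used" in names:
            key = "fs_used"
        else:
            key = "other"
        result[key].append((host, description))
    return result


sample = "\n".join([
    "mylinuxhost;Filesystem /boot;fs_used=429.703125;563.5875;634.035938;0;704.484375 fs_size=704.484375;;;;",
    "mylinuxhost;Filesystem /opt/omd/sites/cme2/tmp;",
    # Hypothetical line with a non-fs_used metric, made up for illustration only.
    "mylinuxhost;Filesystem /data;/data=1;2;3;0;4",
])

for group, services in classify(sample).items():
    print(group, services)
```

Services that end up in the "other" group are the candidates for a support case; services without performance data cannot be judged from this output alone.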
With Checkmk 2.0, we introduced a new Service Column:
Customize > Visualization > Views > Add view > All services > Continue > (Choose anything) > Continue
This column only works for filesystem services whose metric is fs_used. It does not work with the old _ metrics.
Solution