Debugging corrupted RRD files

This article explains the error messages about Round Robin Database (RRD) files of filesystem services that can appear after an upgrade to Checkmk 2.0, and how to handle them.

Last tested on Checkmk 2.0.0p1

Problem

During an update of Checkmk from 1.6.0 to 2.0.0, you may see these messages about some RRD files for filesystem services:

-| RRD files for host test1 and service Filesystem / stored in files:
-|   - /opt/omd/sites/mysite/var/check_mk/rrd/test1/Filesystem__.info
-|   - /opt/omd/sites/mysite/var/check_mk/rrd/test1/Filesystem__.rrd
-| are messed up. Please restore them both from backup.
-| RRD files for host test2 and service Filesystem / stored in files:
-|   - /opt/omd/sites/mysite/var/check_mk/rrd/test2/Filesystem__.info
-|   - /opt/omd/sites/mysite/var/check_mk/rrd/test2/Filesystem__.rrd
-| are messed up. Please restore them both from backup.
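
If the update output contains many of these blocks, a short script can collect the affected hosts, services, and files in one place. The following is a minimal sketch, not part of Checkmk: the function name parse_rrd_warnings and both regular expressions are assumptions derived only from the message format shown above.

```python
import re

# Header line of one warning block, e.g.
# "RRD files for host test1 and service Filesystem / stored in files:"
HEADER = re.compile(
    r"RRD files for host (?P<host>\S+) and service (?P<service>.+?) stored in files:"
)
# One listed file, e.g. "- /opt/omd/sites/mysite/var/check_mk/rrd/test1/Filesystem__.rrd"
FILE = re.compile(r"-\s+(?P<path>/\S+)\s*$")
# Leading "-|" or "|" marker in front of each line of the update output.
MARKER = re.compile(r"^\s*-?\|")


def parse_rrd_warnings(log_text):
    """Return one dict per warning block with host, service and file paths."""
    blocks = []
    for raw in log_text.splitlines():
        line = MARKER.sub("", raw).strip()
        header = HEADER.search(line)
        if header:
            blocks.append(
                {"host": header["host"], "service": header["service"], "files": []}
            )
            continue
        file_match = FILE.match(line)
        if file_match and blocks:
            # Attach the path to the most recently opened block.
            blocks[-1]["files"].append(file_match["path"])
    return blocks
```

Passing the saved update output to parse_rrd_warnings gives a list that can be reviewed or compared against a backup. Lines that match neither pattern are skipped.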


Good news

Checkmk is still working, including the RRD files.

Bad news

Older Checkmk versions and the Raw Edition derive the metric name of a filesystem service from its mount point, e.g., _ for / and _boot for /boot.

With Werk #7444 (Rename metric name in Filesystem checks from mount point to fs_used), we started to clean up this behavior. Checkmk 2.0 always uses fs_used instead of the mount-point-based metrics such as _.


To check if you're using the old metrics, you can execute this command:

OMD[mysite]:~$ lq "GET services\nColumns: host_name description perf_data\nFilter: host_name ~ mylinuxhost\nFilter: description ~ Filesystem"

mylinuxhost;Filesystem /;fs_used=382353.996094;381992.03125;429741.035156;0;477490.039062 fs_size=477490.039062;;;; fs_used_percent=80.075806;;;; growth=-12877.324568;;;; trend=-33528.269466;;;0;19895.418294 inodes_used=1563877;28001894.4;29557555.2;0;31113216
mylinuxhost;Filesystem /boot;fs_used=429.703125;563.5875;634.035938;0;704.484375 fs_size=704.484375;;;; fs_used_percent=60.995409;;;; growth=0;;;; trend=0;;;0;29.353516 inodes_used=320;42163.2;44505.6;0;46848
mylinuxhost;Filesystem /boot/efi;fs_used=76.792969;408.7875;459.885938;0;510.984375 fs_size=510.984375;;;; fs_used_percent=15.028438;;;; growth=0;;;; trend=0;;;0;21.291016
mylinuxhost;Filesystem /media/mysite/USB;fs_used=269688.625;370310.4625;416599.270313;0;462888.078125 fs_size=462888.078125;;;; fs_used_percent=58.262167;;;; growth=0;;;; trend=0;;;0;19287.003255 inodes_used=5784;27154022.4;28662579.2;0;30171136
mylinuxhost;Filesystem /media/mysite/SDCard;fs_used=24534.425781;384183.89375;432206.880469;0;480229.867188 fs_size=480229.867188;;;; fs_used_percent=5.108892;;;; growth=0;;;; trend=0;;;0;20009.577799 inodes_used=11;28164096;29728768;0;31293440
mylinuxhost;Filesystem /opt/omd/sites/workshop/tmp;fs_used=8.347656;12771.446875;14367.877734;0;15964.308594 fs_size=15964.308594;;;; fs_used_percent=0.052289;;;; growth=-282.095681;;;; trend=0.124696;;;0;665.179525 inodes_used=1533;3678176.7;3882519.85;0;4086863
mylinuxhost;Filesystem /opt/omd/sites/cme2/tmp;
mylinuxhost;Filesystem /opt/omd/sites/mysite/tmp;
mylinuxhost;Filesystem /opt/omd/sites/mysite/tmp;fs_used=6.191406;12771.446875;14367.877734;0;15964.308594 fs_size=15964.308594;;;; fs_used_percent=0.038783;;;; growth=0;;;; trend=-0.020642;;;0;665.179525 inodes_used=1439;3678176.7;3882519.85;0;4086863
mylinuxhost;Filesystem /opt/omd/sites/workshop/tmp;fs_used=7.699219;12771.446875;14367.877734;0;15964.308594 fs_size=15964.308594;;;; fs_used_percent=0.048228;;;; growth=-582.891098;;;; trend=-0.019961;;;0;665.179525 inodes_used=1534;3678176.7;3882519.85;0;4086863
mylinuxhost;Filesystem /media/mysite/USB;fs_used=269688.625;370310.4625;416599.270313;0;462888.078125 fs_size=462888.078125;;;; fs_used_percent=58.262167;;;; growth=0;;;; trend=0;;;0;19287.003255 inodes_used=5784;27154022.4;28662579.2;0;30171136
mylinuxhost;Filesystem /media/mysite/SDCard;fs_used=24534.425781;384183.89375;432206.880469;0;480229.867188 fs_size=480229.867188;;;; fs_used_percent=5.108892;;;; growth=0;;;; trend=0;;;0;20009.577799 inodes_used=11;28164096;29728768;0;31293440
mylinuxhost;Filesystem /boot/efi;fs_used=76.792969;408.7875;459.885938;0;510.984375 fs_size=510.984375;;;; fs_used_percent=15.028438;;;; growth=0;;;; trend=0;;;0;21.291016
mylinuxhost;Filesystem /;fs_used=369116.152344;381992.03125;429741.035156;0;477490.039062 fs_size=477490.039062;;;; fs_used_percent=77.303425;;;; growth=6496.604099;;;; trend=-40417.662717;;;0;19895.418294 inodes_used=1551084;28001894.4;29557555.2;0;31113216
mylinuxhost;Filesystem /opt/omd/sites/cme2/tmp;
mylinuxhost;Filesystem /boot;fs_used=429.703125;563.5875;634.035938;0;704.484375 fs_size=704.484375;;;; fs_used_percent=60.995409;;;; growth=0;;;; trend=0;;;0;29.353516 inodes_used=320;42163.2;44505.6;0;46848

In this example, all filesystem services that return performance data use fs_used. If your filesystem services do not use fs_used on Checkmk 2.0, please open a support case, and we will provide you with the commands to fix this.
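
For a large number of hosts, the output can be checked with a script instead of by eye. The following is a minimal sketch under stated assumptions: it expects the semicolon-separated host;description;perf_data lines shown above, the function names metric_names and classify_services are hypothetical, and it treats any service whose performance data lacks fs_used as using the old naming. Metric names containing spaces are not handled.

```python
def metric_names(perf_data):
    """Return the metric names from a perf_data string like 'fs_used=1;2;3 fs_size=4'."""
    names = []
    for token in perf_data.split():
        name, sep, _values = token.partition("=")
        if sep:
            names.append(name)
    return names


def classify_services(livestatus_output):
    """Return (host, description, status) for each line of the lq output.

    status is 'fs_used' if the new metric is present, 'no data' if the
    service has no performance data, and 'legacy' otherwise.
    """
    rows = []
    for line in livestatus_output.splitlines():
        # Split only twice: perf_data itself contains semicolons.
        parts = line.split(";", 2)
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        host, description, perf_data = parts
        names = metric_names(perf_data)
        if not names:
            status = "no data"
        elif "fs_used" in names:
            status = "fs_used"
        else:
            status = "legacy"
        rows.append((host, description, status))
    return rows
```

Save the output of the lq command to a file and pass its contents to classify_services. Any row reported as legacy is a candidate for the support case mentioned above.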

The metrics should be converted during the upgrade to Checkmk 2.0 at the latest.


With Checkmk 2.0, we introduced a new service column, which you can select when you create a view:

Customize > Visualization > Views > Add view > All services > Continue > (Choose anything) > Continue

Screenshot showing creation of a new view. The Column section is highlighted.

This column only works for filesystem services that use the fs_used metric. It does not work with the old mount-point-based metrics such as _.

Solution

There is no standard solution at the moment. You can ignore these messages: Checkmk keeps working and now uses the new metrics.