Troubleshooting "no space left on device" or filesystem full errors

Example error message for "No space left on device"

LAST TESTED ON CHECKMK 2.2.0P1

Problem

/etc/cron.daily/dpkg:
cp: error writing 'dpkg.status': No space left on device
gzip: .//dpkg.status.0.gz: No space left on device
mv: cannot stat './/dpkg.status.0.gz': No such file or directory
/etc/cron.daily/logrotate:
error: Compressing program wrote following message to stderr when compressing log /var/log/apache2/ssl_access.log.1:
gzip: stdout: No space left on device
error: failed to compress log /var/log/apache2/ssl_access.log.1
run-parts: /etc/cron.daily/logrotate exited with return code 1
/etc/cron.daily/man-db:
gdbm fatal: read error
run-parts: /etc/cron.daily/man-db exited with return code 1


Identify what consumes most of the disk space. To get an overview of the allocated disk space:

root@mylinuxhost ~$ df -h

Filesystem		Size	Used	Avail	Use%	Mounted on
udev			7.0G	   0	7.9G	  0%	/dev
tmpfs			1.6G	169M	1.4G	 11%	/run
/dev/sda1		3.7G	1.8G	1.9G	 49%	/ro
/dev/md0p1		772M	772M	   0	100%	/rw
aufs			772M	772M	   0	100%	/

Here we notice that / and /rw are at 100% usage.
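On a host with many mounts, it can help to filter the df output down to the filesystems that are nearly full. The following is a minimal sketch, not part of the original procedure: the function name full_filesystems and the 90% default threshold are assumptions chosen for illustration. It reads df -P style output on stdin and assumes mount points without spaces.

```shell
#!/bin/sh
# Hypothetical helper: print "mountpoint usage%" for every filesystem whose
# usage is at or above a threshold (default 90, an arbitrary choice).
# It reads `df -P` output on stdin, so it can also be fed saved output.
full_filesystems() {
    threshold="${1:-90}"
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "100%" -> "100"
            if (use + 0 >= limit + 0) print $6, $5
        }'
}

# Usage on a live system:
#   df -P | full_filesystems 90
```

The -P option asks df for the POSIX output format, which keeps each filesystem on a single line and makes the columns safe to parse.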

Now we need to find the files that are filling up the filesystem. For this, we can use the command: du -sh


We start at / and search for the big files:

root@mylinuxhost ~$ du -sh /*
root@mylinuxhost ~$ du -sh /rw/*


These commands list every file and directory directly below / and /rw, together with its size.

Check the biggest directories and keep running du -sh inside them until you find the files that take up the space.
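The drill-down is quicker when the du output is sorted by size. The following is a minimal sketch under some assumptions: the function name largest_entries is hypothetical, and the --max-depth option requires GNU du. The first line of the output is the total for the directory itself.

```shell
#!/bin/sh
# Hypothetical helper: show the N largest entries directly below a directory.
#   -x  stay on one filesystem (do not descend into other mounts)
#   -k  report sizes in KiB so that a plain numeric sort works
largest_entries() {
    dir="${1:-/}"
    count="${2:-10}"
    du -xk --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Usage: start at the full mount point, then repeat on the largest subdirectory:
#   largest_entries /rw 10
```

Repeating the call on the largest entry of each result narrows the search down to the directory that holds the big files.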

Possible solutions

If you have found a big file, please check:

  • Do you need all entries in the file? You could remove entries older than a year.

  • How old are the files? Do you really need files older than one or two years?

  • What is filling the file?
    • Is a debug log enabled?
    • Is a script filling the file with log messages?
    • Is an application crashing and writing a big log?

  • If you have no idea why the filesystem is filling up, you could enable auditing with auditd.
    • We provide guidance only on request!

  • Deleting the log could be one way to solve the problem. But if the log keeps growing, you need to follow the steps above to find the cause.
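The questions about file age and size can be checked with find. The following is a minimal sketch: the function names old_files and large_files and their default values are assumptions for illustration, and both functions only list files and delete nothing.

```shell
#!/bin/sh
# Hypothetical helpers; names and defaults are illustrative only.

# List regular files below DIR that were last modified more than DAYS days ago.
old_files() {
    dir="$1"
    days="${2:-365}"
    find "$dir" -xdev -type f -mtime +"$days" 2>/dev/null
}

# List regular files below DIR that are larger than MIN_MB mebibytes.
large_files() {
    dir="$1"
    min_mb="${2:-100}"
    # Convert to KiB so that find does not round small files up to 1 MiB.
    find "$dir" -xdev -type f -size +"$((min_mb * 1024))k" 2>/dev/null
}

# Usage:
#   old_files /var/log 365
#   large_files /rw 100
```

Review the listed files before removing anything, since some of them may still be in use by a running application.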


Increasing the filesystem size

If your filesystem is too small, you might want to consider increasing its size.

Workaround / Solution if the filesystem is sized to 800MB:

  • Workaround: In this case, the filesystem (rw volume) is sized to 800MB by default. To get more space, you need to delete the big files.

  • Solution: With Werk #9295, we increased the rw volume to 4GB. To use this size, you need to:

    1. Update to the latest 1.4 firmware

    2. Activate IPMI to access the Appliance

    3. Make a backup of the Appliance as described here: https://docs.checkmk.com/latest/en/appliance_usage.html#cma_backup

    4. Reset the Appliance to the factory settings

    5. Restore the backup

→ Now you're able to use the new 4GB size for the rw volume