Info |
---|
A device backup on Checkmk may fail. This article shows some of the error messages you may see and how to resolve them. It is an extension of our general Checkmk backup article: Checkmk Backups |
Problem
Basic information about mkbackup
This article goes into more detail than our general Checkmk backup documentation: https://docs.checkmk.com/master/en/backup.html
After you configure a backup job in Webconf, a cron job is created for it. You can inspect this job on the command line after logging in via SSH as the site user, and run its command manually:
Code Block | ||||
---|---|---|---|---|
| ||||
OMD[mysite]:~$ cat etc/cron.d/mkbackup
# Written by mkbackup configuration
0 0 * * * mkbackup backup mybackup >/dev/null
OMD[mysite]:~$ mkbackup backup mybackup
2022-05-17 16:02:22 --- Starting backup (Check_MK-cma-mysite-mybackup to mytarget) ---
2022-05-17 16:02:24 Verifying backup consistency
2022-05-17 16:02:24 Cleaning up previously completed backup
2022-05-17 16:02:24 --- Backup completed (Duration: 0:00:01, Size: 7942.6000 MB, IO: 0.0042 B/s) ---
OMD[mysite]:~$ |
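If the cron file contains several entries, the command part of each entry can be extracted so that it can be run by hand. The following is only a minimal sketch: the function name `cron_commands` is hypothetical and not part of Checkmk, and it assumes the five-field schedule format shown in the example above.

```shell
# cron_commands CRONFILE
# Prints the command part of every cron entry in CRONFILE.
# Comment lines are skipped; the five schedule fields are removed.
cron_commands() {
    grep -v '^[[:space:]]*#' "$1" |
        awk 'NF > 5 { $1=$2=$3=$4=$5=""; sub(/^ +/, ""); print }'
}

# Example call (as site user):
#   cron_commands etc/cron.d/mkbackup
```

When running a printed command manually, leaving out the `>/dev/null` redirection keeps the output visible.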
If you need more debugging output, add --verbose and --debug to the mkbackup command:
Code Block | ||||
---|---|---|---|---|
| ||||
OMD[mysite]:~$ mkbackup --verbose --debug backup mybackup |
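For later analysis, it can be useful to keep the debug output in a file together with the exit status of the command. The following is a minimal sketch using only standard shell redirection: the helper name `run_and_log` and the log path in the example call are assumptions for illustration, not part of Checkmk.

```shell
# run_and_log LOGFILE COMMAND [ARGS...]
# Runs COMMAND, writes its stdout and stderr to LOGFILE, prints the log
# afterwards, and returns the exit status of COMMAND.
run_and_log() {
    _ral_logfile=$1
    shift
    "$@" >"$_ral_logfile" 2>&1
    _ral_status=$?
    cat "$_ral_logfile"
    return "$_ral_status"
}

# Example call (as site user):
#   run_and_log ~/mkbackup_debug.log mkbackup --verbose --debug backup mybackup
```

Because the output is written to the file first and printed afterwards, nothing appears on the terminal until the command has finished.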
Collection of error messages
Failed to perform backup: [Errno 104] Connection reset by peer
Code Block | ||||
---|---|---|---|---|
| ||||
2021-03-17 11:10:20 --- Starting backup (Check_MK_Appliance-test+stage+106-nfs+backup+appliance to nfs-backup-appliance) ---
2021-03-17 11:10:20 Performing system backup (system.tar)
2021-03-17 11:10:25 Performing system data backup (system-data.tar)
2021-03-17 11:10:48 Performing site backup: test
Site backup failed: Failed to perform backup: [Errno 104] Connection reset by peer |
Solution
Find the correct backup job
Code Block | ||||
---|---|---|---|---|
| ||||
OMD[mysite]:~$ mkbackup jobs
Job-ID               Title
------------------------------------------------------------
myid                 mytitle
OMD[mysite]:~$ |
Please run the backup directly on the command line and redirect the output to a log file.
Code Block | ||||
---|---|---|---|---|
| ||||
OMD[mysite]:~$ omd -v backup --no-compression mybackup - >~/path/to/my_backup.txt |
...
Code Block | ||||
---|---|---|---|---|
| ||||
Pausing RRD updates for /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/my_disk_read.rrd
rrdcached command: SUSPEND /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/my_disk_read.rrd
rrdcached response: '-1 /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/my_disk_read.rrd - No such file or directory\n'
Resuming RRD updates for /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/my_disk_read.rrd
rrdcached command: RESUME /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/my_disk_read.rrd
skipping rrdcached command (broken pipe)
Pausing RRD updates for /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/my_disk_write.rrd
rrdcached command: SUSPEND /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/my_disk_write.rrd
rrdcached response: '-1 /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/my_disk_write.rrd - No such file or directory\n'
Resuming RRD updates for /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/my_disk_write.rrd
rrdcached command: RESUME /omd/sites/mysite/var/pnp4nagios/perfdata/myhost/my_disk_write.rrd
Failed to perform backup: [Errno 104] Connection reset by peer |
...
This output suggests that Checkmk is using pnp4nagios instead of Round Robin Database (RRD). We recommend converting the performance data to the RRD format. Please follow the steps described in Customizing the RRD structure: https://docs.checkmk.com/latest/en/graphing.html#customise_rrds
Don't forget to stop the site before converting the files!
...
Now the backup should run without any errors.
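To confirm this, the log of a new backup run can be searched for the rrdcached error responses shown above. The following is only a sketch: the function name `list_missing_rrds` is hypothetical, and the pattern assumes the response lines look exactly like those in the log excerpt in this section.

```shell
# list_missing_rrds LOGFILE
# Prints each RRD path for which rrdcached answered
# "No such file or directory", one per line, without duplicates.
list_missing_rrds() {
    sed -n "s/^rrdcached response: '-1 \(.*\) - No such file or directory.*/\1/p" "$1" |
        sort -u
}

# Example call, using the log file written by the backup run above:
#   list_missing_rrds ~/path/to/my_backup.txt
```

If the function prints nothing, the log contains no such responses.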
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 122: surrogates not allowed
Code Block | ||||
---|---|---|---|---|
| ||||
Job state: Site mysite Backup
#############################################
Site backup
State Failed
Runtime Started at 2022-06-21 03:00:02, Finished at 2022-06-21 03:00:02 (Duration: 0:16:36)
Output
2022-06-21 03:00:02 --- Starting backup (Check_MK-mysite+cmk2-mysite-mysite+bak to Reload) ---
2022-06-21 03:00:02 Found previous incomplete backup. Cleaning up those files.
Site backup failed: Traceback (most recent call last):
  File "/omd/sites/mysite/bin/omd", line 60, in <module>
    omdlib.main.main()
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/main.py", line 4022, in main
    command.handler(version_info, site, global_opts, args, command_options)
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/main.py", line 2753, in main_backup
    omdlib.backup.backup_site_to_tarfile(site, fh, tar_mode, options, global_opts.verbose)
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 54, in backup_site_to_tarfile
    _backup_site_files_to_tarfile(site, tar, options)
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 112, in _backup_site_files_to_tarfile
    tar.add(site.dir, site.name, filter=filter_files)
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 134, in add
    super(BackupTarFile, self).add(name, arcname, recursive, filter=filter)
  File "/omd/versions/2.0.0p23.cee/lib/python3.8/tarfile.py", line 1977, in add
    self.add(os.path.join(name, f), os.path.join(arcname, f),
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 134, in add
    super(BackupTarFile, self).add(name, arcname, recursive, filter=filter)
  File "/omd/versions/2.0.0p23.cee/lib/python3.8/tarfile.py", line 1977, in add
    self.add(os.path.join(name, f), os.path.join(arcname, f),
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 134, in add
    super(BackupTarFile, self).add(name, arcname, recursive, filter=filter)
  File "/omd/versions/2.0.0p23.cee/lib/python3.8/tarfile.py", line 1977, in add
    self.add(os.path.join(name, f), os.path.join(arcname, f),
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 134, in add
    super(BackupTarFile, self).add(name, arcname, recursive, filter=filter)
  File "/omd/versions/2.0.0p23.cee/lib/python3.8/tarfile.py", line 1977, in add
    self.add(os.path.join(name, f), os.path.join(arcname, f),
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 134, in add
    super(BackupTarFile, self).add(name, arcname, recursive, filter=filter)
  File "/omd/versions/2.0.0p23.cee/lib/python3.8/tarfile.py", line 1977, in add
    self.add(os.path.join(name, f), os.path.join(arcname, f),
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 134, in add
    super(BackupTarFile, self).add(name, arcname, recursive, filter=filter)
  File "/omd/versions/2.0.0p23.cee/lib/python3.8/tarfile.py", line 1971, in add
    self.addfile(tarinfo, f)
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 158, in addfile
    self._suspend_rrd_update(rrd_file_path)
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 169, in _suspend_rrd_update
    self._send_rrdcached_command("SUSPEND %s" % path)
  File "/omd/versions/2.0.0p23.cee/lib/python3/omdlib/backup.py", line 199, in _send_rrdcached_command
    self._sock.sendall(("%s\n" % cmd).encode("utf-8"))
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 122: surrogates not allowed |
Solution
Please run the backup directly on the command line and redirect the output to a log file.
...
The issue is that the name of this file ends with a non-ASCII character: "AUTORIT�.rrd"
To correct this, the file must be deleted or renamed. Renaming it is the safer option.
Code Block | ||||
---|---|---|---|---|
| ||||
OMD[mysite]:~$ mv oldfilename newfilename |
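If it is not obvious which file is affected, the site directory can be searched for names that are not valid UTF-8. The following is a minimal sketch under stated assumptions: the function name `find_invalid_utf8_names` is hypothetical, `find` and `iconv` are assumed to be available on the system, and file names containing newlines are not handled.

```shell
# find_invalid_utf8_names DIR
# Prints every path below DIR whose name contains non-ASCII bytes and whose
# full path is not valid UTF-8.
find_invalid_utf8_names() {
    # Step 1: in the C locale, '[! -~]' matches any byte outside printable ASCII.
    LC_ALL=C find "$1" -name '*[! -~]*' | while IFS= read -r path; do
        # Step 2: iconv exits with an error if the path is not valid UTF-8.
        if ! printf '%s' "$path" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            printf '%s\n' "$path"
        fi
    done
}

# Example call (as site user, from the site directory):
#   find_invalid_utf8_names var
```

Names that contain valid UTF-8 characters, such as umlauts, are not reported by this check. Each reported file can then be renamed with mv as shown above.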