
Last tested on Checkmk 2.2.0p1



Problem

You find messages in your Checkmk log files (~/var/log/*.log) and cannot tell at first sight what they mean.

Solution

The following sections list common messages from the log files mentioned above and explain what can be learned from them.

Preface

All log entries tagged with [log], like the following, come from the Python application.

Code Block
2021-09-15 00:00:45 [4] [main] [RRD helper 2205] [log] Error creating RRD for cmc_single;$HOST;$SERVICE;;count;0: /opt/omd/sites/$SITE/var/check_mk/rrd/$HOST/_HOST_.rrd: illegal attempt to update using time 1631656844 when last update time is 1631656844

The main thread of the CMC sends a command to the Round Robin Database (RRD) helper process, and the helper answers with an error line. The error is therefore related to the RRD daemon. Keep this in mind when troubleshooting such log messages.
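To make the structure of such a line easier to see, here is a minimal Python sketch that splits a log line into its timestamp, the numeric field, the bracketed tags, and the message. The layout is inferred only from the example in this article, not from a documented format. The function name `parse_cmc_line`, the field name `level`, and its reading as a log level are assumptions made for illustration.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the example line in this article:
#   <date> <time> [<number>] [<tag>] [<tag>] ... <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>\d+)\] "
    r"(?P<tags>(?:\[[^\]]*\] )*)"
    r"(?P<message>.*)$"
)


def parse_cmc_line(line):
    """Return the parts of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    tags = re.findall(r"\[([^\]]*)\]", match.group("tags"))
    return {
        "timestamp": datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S"),
        "level": int(match.group("level")),  # assumption: numeric log level
        "tags": tags,
        "from_python": "log" in tags,  # entries tagged [log] come from Python
        "message": match.group("message"),
    }


# Example line shaped like the one in this article (placeholders left as-is).
sample = (
    "2021-09-15 00:00:45 [4] [main] [RRD helper 2205] [log] "
    "Error creating RRD for cmc_single;$HOST;$SERVICE;;count;0: "
    "/opt/omd/sites/$SITE/var/check_mk/rrd/$HOST/_HOST_.rrd: "
    "illegal attempt to update using time 1631656844 "
    "when last update time is 1631656844"
)
parsed = parse_cmc_line(sample)
```

For the sample line, `parsed["tags"]` contains `main`, `RRD helper 2205`, and `log`. Reading the tags from left to right shows the path described above: the main thread, then the RRD helper, then the Python application that produced the message.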

...

Code Block
OMD[mysite]:~$ pstree -paltu $SITE

cmc.log

Message: [client $NUMBER] Polling failed: Bad file descriptor
Description: When using a Livestatus proxy, these messages are expected and can be disregarded.

Message: [client $NUMBER] Polling failed: Connection timed out
Message: [client $NUMBER] error: client connection terminated: timeout
Description: These messages are generated when the connection to another Livestatus daemon or proxy times out. If a message occurs sporadically, you can disregard it. If it occurs frequently, you will see errors in the web interface that indicate an issue with a particular site.

Message: [generic pool] [helper $NUMBER] killed by signal 1
Description: The generic helper did not finish in a timely manner or misbehaved in another way, so the core killed it.

Message: [helper $NUMBER] [log] Error in PIGGYBACK fetcher: MKTimeout('Fetcher for host "$HOST" timed out after 60 seconds')
Description: The fetcher did not receive an answer from the monitored system and timed out. If this happens repeatedly, it becomes visible in the web interface, because the host will complain about not receiving agent data.

Message: [helper $NUMBER] [log] Error in SNMP fetcher: ValueError("invalid literal for int() with base 10: ''")
Description: The monitored device probably returned invalid SNMP output. You should have received a crash report. Please upload this crash report.

Message: [generic pool] [helper $NUMBER] killed by signal 11
Description: The helper crashed with a segmentation fault (signal 11). This is mostly caused by a memory problem in Inline SNMP. Please deactivate Inline SNMP in Setup -> Agents -> SNMP rules -> Hosts using a specific SNMP Backend -> Classic Backend.
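Whether a message is sporadic or frequent is easier to judge with a count. The following Python sketch counts how often each message from the list above appears in a set of log lines. The patterns are derived from the message texts in this article, with `$NUMBER` replaced by `\d+`. The dictionary keys, the function name `count_known_messages`, and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Patterns derived from the messages listed in this article.
# The keys are illustrative names, not Checkmk identifiers.
KNOWN_MESSAGES = {
    "polling_bad_fd": re.compile(r"\[client \d+\] Polling failed: Bad file descriptor"),
    "polling_timeout": re.compile(r"\[client \d+\] Polling failed: Connection timed out"),
    "client_timeout": re.compile(r"\[client \d+\] error: client connection terminated: timeout"),
    "helper_signal_1": re.compile(r"\[helper \d+\] killed by signal 1$"),
    "helper_signal_11": re.compile(r"\[helper \d+\] killed by signal 11$"),
    "fetcher_timeout": re.compile(r"Error in \w+ fetcher: MKTimeout"),
    "snmp_value_error": re.compile(r"Error in SNMP fetcher: ValueError"),
}


def count_known_messages(lines):
    """Count how many lines match each known message pattern."""
    counts = Counter()
    for line in lines:
        line = line.rstrip()  # drop the newline so the "$" anchors work
        for name, pattern in KNOWN_MESSAGES.items():
            if pattern.search(line):
                counts[name] += 1
                break  # one category per line is enough
    return counts


# Hypothetical sample lines; timestamps and the number in brackets are made up.
sample_lines = [
    "2024-01-01 00:00:00 [4] [client 3] Polling failed: Bad file descriptor\n",
    "2024-01-01 00:00:01 [4] [generic pool] [helper 2] killed by signal 11\n",
    "2024-01-01 00:00:02 [4] [generic pool] [helper 2] killed by signal 1\n",
    "2024-01-01 00:00:03 [5] some unrelated message\n",
]
counts = count_known_messages(sample_lines)
```

To run this against a real file, pass an open file object instead of the sample list, for example `open(os.path.expanduser("~/var/log/cmc.log"))` as the site user (this needs `import os`). The anchors on the two signal patterns keep `signal 1` from also matching `signal 11`.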
