Troubleshooting missing messages in Event Console

This article explains why messages may be missing in the Event Console.

LAST TESTED ON CHECKMK 2.0.0P1

Problem

Log lines are missing from the Event Console but present in the monitored log file, even though the configured patterns and filters should allow them to be forwarded to the Event Console.

Additionally, this behavior is not easily reproducible and happens at random times.

Reason

The cause is a particular combination of configuration options.

  • By default, the periodic service discovery uses cached agent output stored on the Checkmk server as long as it is less than 120 seconds old; after that, it fetches new data from the host.

  • If the normal check interval for the Checkmk agent is 120 seconds or less (the default is 60 seconds), this error cannot occur.

  • If the normal check interval for the Checkmk agent is greater than 120 seconds, the following situation can arise:

    • The agent is queried normally, and the agent output is cached on the Checkmk server

    • The periodic service discovery runs at its configured interval, 120 seconds or more after the agent was queried

    • The periodic service discovery considers the cached data too old and fetches a fresh agent output

    • This output contains new log lines, but it is never processed by the logwatch check plugin and is not forwarded to the Event Console.

    • When the agent is queried again at the regular interval, it cannot tell that these log messages were fetched but never processed, so they are not delivered again.
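The timing condition described above can be sketched as a small Python model. This is a simplified illustration only, not Checkmk code: the function names are hypothetical, and the only value taken from this article is the 120-second default cache age.

```python
from typing import Optional, Tuple

# Default maximum age (seconds) of cached agent output that the periodic
# service discovery still accepts, as described in this article.
DEFAULT_MAX_CACHEFILE_AGE = 120


def discovery_fetches_fresh_output(
    seconds_since_agent_query: float,
    max_cachefile_age: float = DEFAULT_MAX_CACHEFILE_AGE,
) -> bool:
    """Return True if discovery would treat the cache as too old (hypothetical helper)."""
    return seconds_since_agent_query >= max_cachefile_age


def risky_window(
    check_interval: float,
    max_cachefile_age: float = DEFAULT_MAX_CACHEFILE_AGE,
) -> Optional[Tuple[float, float]]:
    """Return the (start, end) offsets in seconds after an agent query during
    which a discovery run would fetch fresh output, or None if there is no
    such window (hypothetical helper)."""
    if check_interval <= max_cachefile_age:
        return None
    return (max_cachefile_age, check_interval)


if __name__ == "__main__":
    for interval in (60, 120, 300):
        print(interval, risky_window(interval))
```

With a check interval of 300 seconds, any discovery run that starts between 120 and 300 seconds after the last agent query falls into the window in which log lines can be lost. With an interval of 60 or 120 seconds, no such window exists.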

Solution


There are two main solutions:

  • Decrease the normal check interval for the Checkmk agent to 120 seconds or less

  • Manually edit the affected site's main.mk and add the following lines. Set $SECONDS to at least the normal check interval of the Checkmk agent!

    Checkmk does not support this! Do this at your own risk!

    max_cachefile_age = $SECONDS
    cluster_max_cachefile_age = $SECONDS
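
As an illustration, assuming a normal check interval of 5 minutes (300 seconds; this value is only an example and must be replaced with the interval actually configured for the agent), the added lines could look like this:

```python
# Example values only: assumes the Checkmk agent is checked every 300 seconds.
# Use a value at least as large as your own normal check interval.
max_cachefile_age = 300
cluster_max_cachefile_age = 300
```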