Troubleshooting the notification path
Background
In some cases it might be necessary to troubleshoot the notification path due to:
Missing alerts
Delayed alert delivery
The root cause can be a misconfiguration, e.g., a misconfigured host check interval.
For background on how a notification travels through the system, see: The path of a notification from beginning to end
Step-by-step guide
Enable Debug Logging
To gather more insight into what's happening during notification handling, you can enable increased log levels.
Increased log levels make the log files grow faster and can make later troubleshooting harder. Make sure to reset the log levels to their defaults as soon as your immediate troubleshooting concludes.
CMC Logging
Navigate to Setup → Global Settings → Monitoring core → Logging of the core → Notification system
Set to "Debug"
This setting increases the verbosity of the cmc.log
Notification Log Level
Navigate to Setup → Global Settings → Notifications → Notification log level
Set to "Full dump of all variables and command"
This setting increases the verbosity of the notify.log
Simulate a Notification
Note: Starting with Checkmk 2.3.0, the built-in "Test notifications" feature should be used instead of the steps below!
To reproduce the issue under controlled conditions:
Open the Events of Service view for the affected service
Disable flap detection via Master Control to avoid suppression
Ensure no delay rules or check attempt rules are active
Fake a check result (e.g., manually set a known service to CRIT)
Monitor Logs
To observe notification processing in real time inside the cmc.log:
OMD[mysite]:~/var/log$ tail -f cmc.log | grep "HOSTNAME;SERVICENAME"
Typical output might show lines like the following:
[notification helper] ... hard state change to CRITICAL
[notification helper] ... postponing, last host check not recent enough
[notification helper] ... sending PROBLEM notification to its contacts
To observe notification processing in real time inside the notify.log:
OMD[mysite]:~/var/log$ tail -f notify.log | grep "HOSTNAME;SERVICENAME"
If notify.log stays empty for the affected service, the notification most likely never left the core, and the cause should be looked for in cmc.log.
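The two tail commands above pass the pattern to grep as a regular expression, so a dot in a host name matches any character. A minimal sketch of a stricter filter follows, assuming GNU grep is available on the site; notif_filter is a hypothetical helper name and not part of Checkmk.

```shell
# Hypothetical helper: filter a log stream for one host/service pair.
#   $1 = host name, $2 = service description
# -F matches the pattern as a fixed string (no regex metacharacters).
# --line-buffered flushes each matching line immediately, which matters
# when the output is piped further instead of printed to a terminal.
notif_filter() {
    grep -F --line-buffered -- "$1;$2"
}
```

Possible usage, with placeholder names: tail -f ~/var/log/cmc.log | notif_filter myhost "CPU load"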
Analysis
In this case, logs revealed repeated postponement of notifications due to an outdated host check timestamp. Notifications were only sent once the host check became recent.
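To gauge how often a notification was postponed, the matching lines in a saved log can be counted. The sketch below assumes that each relevant line contains both the HOSTNAME;SERVICENAME pair and the word "postponing", as in the sample output above; the exact line layout may differ between versions, and count_postponed is a hypothetical helper name.

```shell
# Hypothetical helper: count postponement messages for one service.
#   $1 = log file, $2 = host name, $3 = service description
# Assumption: the host;service pair and the "postponing" message
# appear on the same log line.
count_postponed() {
    grep -F -- "$2;$3" "$1" | grep -c -F "postponing"
}
```

Possible usage, with placeholder names: count_postponed ~/var/log/cmc.log myhost "CPU load"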
Key Finding
The affected site was running Checkmk 2.2.0p24.
The site was not directly affected by Werk #16509 ("Fix notifications postponed forever with 'Use the status of the service'"), though related symptoms were observed.
Configuration details:
Host check command: PING
Host check interval: 5 minutes
Service check interval: 60 seconds
This interval mismatch caused notification delays of up to 5 minutes.
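The configured intervals can also be cross-checked on the command line. The following sketch only builds a Livestatus query. It assumes the hosts table exposes the columns name, check_interval and last_check, and that the site user has the lq helper available to send the query. host_interval_query is a hypothetical function name, and the unit of check_interval (multiples of the core's interval length, commonly 60 seconds) should be verified on your site.

```shell
# Hypothetical helper: print a Livestatus query that returns the check
# interval and the time of the last check for a single host.
#   $1 = host name
# Assumption: the columns check_interval and last_check exist in the
# hosts table of the site's Livestatus interface.
host_interval_query() {
    printf 'GET hosts\nColumns: name check_interval last_check\nFilter: name = %s\n' "$1"
}
```

Possible usage as site user, with a placeholder host name: host_interval_query myhost | lq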
Recommended Fix
Update the host check interval to match the service check interval:
Suggested setting: 60 seconds
Result: Notification delay resolved under observation