Info

This article explains how to monitor and tune performance in Checkmk.

...


Table of Contents

Overview


Info

Checkmk polls all monitored hosts within the configured normal check interval (by default once a minute).
If a service then enters a non-OK state, Checkmk re-checks it at the retry interval (also once a minute by default).
If an endpoint is DOWN, Checkmk still tries to poll its agent regularly and has to wait for the timeout (10 seconds by default) before it aborts each attempt.
Each attempt ties up a fetcher process for that amount of time and therefore reduces your monitoring performance.
This problem multiplies with the number of endpoints monitored.
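
As a rough illustration of how this adds up, the following back-of-the-envelope sketch estimates how many fetcher processes are kept busy on average just waiting for timeouts. It assumes that every unreachable host is polled once per check interval and waits for the full timeout; the function name and the host counts are hypothetical and only serve the example.

```python
def fetchers_bound_by_timeouts(
    down_hosts: int,
    timeout_s: float = 10.0,         # default agent timeout mentioned above
    check_interval_s: float = 60.0,  # default check interval mentioned above
) -> float:
    """Average number of fetcher processes occupied by waiting for timeouts.

    Simplifying assumption: each unreachable host is polled once per
    check interval and blocks one fetcher for the full timeout.
    """
    if check_interval_s <= 0:
        raise ValueError("check_interval_s must be positive")
    return down_hosts * timeout_s / check_interval_s


# Hypothetical numbers of unreachable hosts, for illustration only.
for down_hosts in (6, 60, 600):
    busy = fetchers_bound_by_timeouts(down_hosts)
    print(f"{down_hosts} unreachable hosts keep about {busy:.0f} fetcher(s) busy")
```

Under these assumptions, 6 unreachable hosts occupy roughly one fetcher permanently, while 600 occupy roughly 100. Real behavior depends on your configured intervals, timeouts, and retry settings, so treat the result as an order-of-magnitude estimate only.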


Monitoring performance can be independent of hardware utilization. Even if your Checkmk server is not at 100% CPU or memory usage, for example, monitoring performance can still be poor.
The resources you need also depend on the number of services, the number of active checks, and the types of hosts. If you have a lot of SNMP hosts, for example, you will need more CPU performance for the protocol overhead than you would for hosts monitored through the Checkmk agent.
Let's dive into the possible reasons and how to understand potential issues.

...