How to increase the WMI timeout
This article explains how to increase the Windows Management Instrumentation (WMI) timeout to prevent Checkmk services from going stale.
LAST TESTED ON CHECKMK 2.3.0P1
Problem
Windows Management Instrumentation (WMI) tends to time out regularly, regardless of the version used or how the machine is sized. Since WMI is responsible for collecting all performance data within Windows (which can also be viewed with perfmon on the affected machines), these timeouts are likely to cause Checkmk services such as "Memory & Pagefile" and "Processor Queue" to go stale.
- The easiest fix: reboot. If you reboot your systems regularly, e.g., once a month for patching, this problem should occur only rarely.
- Another approach is to increase the WMI timeout.
In the agent log file C:\ProgramData\checkmk\agent\log\check_mk.log, you should see an error like this:
2021-08-03 17:15:35.063 [Err ] Timeout [3] seconds broken when query WMI
The default WMI Timeout value is 3 seconds, but this can be increased up to 12 seconds.
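To judge how often the timeout is actually hit before changing it, the agent log can be scanned for the error shown above. The following is a minimal sketch, not part of Checkmk: the function names are hypothetical, and the pattern is modeled only on the single log line quoted in this article, so other agent versions may word the message differently.

```python
import re
from collections import Counter

# Pattern derived from the one example line in this article (assumption:
# the agent always uses this wording and timestamp layout).
WMI_TIMEOUT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+\[Err\s*\]\s+"
    r"Timeout \[(?P<seconds>\d+)\] seconds broken when query WMI"
)


def find_wmi_timeouts(lines):
    """Return (timestamp, configured_timeout_seconds) for each matching line."""
    hits = []
    for line in lines:
        match = WMI_TIMEOUT_RE.match(line.strip())
        if match:
            hits.append((match.group("ts"), int(match.group("seconds"))))
    return hits


def count_per_day(hits):
    """Group the hits by calendar day to show how frequent the timeouts are."""
    return dict(Counter(ts.split(" ")[0] for ts, _ in hits))


def summarize_log(path):
    """Read a log file and return the number of WMI timeouts per day."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_per_day(find_wmi_timeouts(handle))
```

On the monitored host, it could be called as summarize_log(r"C:\ProgramData\checkmk\agent\log\check_mk.log"). If timeouts show up on most days, raising the value is probably worthwhile.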
Solution for Checkmk 2.1 and newer
As per werk #12328, the WMI timeout can be configured via the agent bakery, by using the rule: Setup > Agents > Windows, Linux, Solaris, AIX > Agent rules > Windows WMI Timeout > Add rule
Since this is an agent rule, don't forget to bake a new agent and install it on the monitored host.
Useful for troubleshooting:
On the Windows host, the wmi_timeout configuration will be written inside the C:\ProgramData\checkmk\agent\bakery\check_mk.bakery file, like this:
global:
    ...
    wmi_timeout: 7
    ...
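To confirm that the baked value has arrived on the host, the file can be checked for an active wmi_timeout entry. The sketch below is an illustration with a hypothetical helper name. It uses a simple line match instead of a full YAML parser, and it assumes the key appears on its own line as shown above.

```python
import re

# Matches an active (uncommented) "wmi_timeout: <int>" line, optionally
# followed by a trailing comment.
WMI_TIMEOUT_LINE = re.compile(r"^\s*wmi_timeout:\s*(\d+)\s*(?:#.*)?$")


def read_wmi_timeout(config_text):
    """Return the configured timeout in seconds, or None if it is not set."""
    for line in config_text.splitlines():
        match = WMI_TIMEOUT_LINE.match(line)
        if match:
            return int(match.group(1))
    return None


def read_wmi_timeout_from_file(path):
    """Convenience wrapper that reads the file from disk first."""
    with open(path, encoding="utf-8") as handle:
        return read_wmi_timeout(handle.read())
```

For example, read_wmi_timeout_from_file(r"C:\ProgramData\checkmk\agent\bakery\check_mk.bakery") returning None would suggest that the newly baked agent has not been installed yet.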
Solution for Checkmk versions older than 2.1
Please follow this guidance to increase the timeout.
Go to the agent directory C:\ProgramData\checkmk\agent.
Open the check_mk.user.yml file and find the wmi_timeout entry in the global section. Remove the leading '#' to uncomment the line, then set the desired value.
global:
    ...
    wmi_timeout: 7 # <- 7 sec, default is 3
    ...
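If the same change has to be made on several hosts, the manual edit can be scripted. This is a rough sketch under stated assumptions, not an official tool: the function name is hypothetical, the commented placeholder is assumed to look like "# wmi_timeout: 3" with the intended indentation in front of the '#', and the upper bound of 12 seconds is taken from this article.

```python
import re

MAX_WMI_TIMEOUT = 12  # upper limit stated in this article

# Assumed shape of the placeholder, e.g. "    # wmi_timeout: 3"; an already
# active "    wmi_timeout: 3" line is matched as well.
PLACEHOLDER = re.compile(r"^(?P<indent>[ \t]*)#?[ \t]*wmi_timeout:[ \t]*\d+.*$")


def set_wmi_timeout(config_text, seconds):
    """Return config_text with the first wmi_timeout line uncommented and set."""
    if not 1 <= seconds <= MAX_WMI_TIMEOUT:
        raise ValueError(f"wmi_timeout must be between 1 and {MAX_WMI_TIMEOUT}")
    lines = config_text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        match = PLACEHOLDER.match(line)
        if match:
            # Preserve the original line ending (Windows files often use CRLF).
            if line.endswith("\r\n"):
                ending = "\r\n"
            elif line.endswith("\n"):
                ending = "\n"
            else:
                ending = ""
            lines[index] = f"{match.group('indent')}wmi_timeout: {seconds}{ending}"
            return "".join(lines)
    raise LookupError("no wmi_timeout line found in the configuration")
```

The function only transforms text, so the result can be reviewed before it is written back to check_mk.user.yml. Keeping a backup copy of the original file is advisable.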