...
Getting Started
Background information on this subject is available in our official documentation.
Basics
Check the OMD <SITENAME> performance graphs of the affected site.
- The following graphs are important:
- Livestatus Connects and Requests - localhost - OMD nagnis_central performance.
- Livestatus Requests per Connection - localhost - OMD nagnis_central performance.
- Livestatus usage - localhost - OMD nagnis_central performance.
- Check_MK helper usage - localhost - OMD nagnis_central performance.
- Do you see peaks in these graphs? If yes, please check the liveproxyd.log inside the site user context.
- Please check the Livestatus Proxy settings.
- "Maximum concurrent Livestatus connections": inside the global and site-specific global settings.
- "Livestatus Proxy default connection parameters": inside the global and site-specific global settings.
Note: We recommend using the default values. In some cases, it makes sense to configure higher values. Please ask support for guidance.
- Clean up your map:
- Do you have objects in your map that are no longer available in Checkmk?
- Do you have a map with nested maps? Please check whether the nested maps contain objects that are no longer available in Checkmk.
- How often does your NagVis map refresh? You can modify this value.
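If the graphs show peaks, the log check mentioned above can be sketched as a small helper. This is a minimal sketch, not an official tool: the function name `proxy_log_findings` is hypothetical, the path assumes the log is at `var/log/liveproxyd.log` below the site directory (`$OMD_ROOT`), and the search pattern is only an assumption about how problems are worded in the log. Adjust it to what your log actually contains.

```bash
# Print the most recent suspicious lines from the Livestatus proxy log.
# $1: log file (assumed default: $OMD_ROOT/var/log/liveproxyd.log)
# $2: number of matching lines to show (default: 20)
# The pattern below is an assumption, not an official list of log keywords.
proxy_log_findings() {
    grep -iE 'error|warn|timeout|exception' \
        "${1:-$OMD_ROOT/var/log/liveproxyd.log}" | tail -n "${2:-20}"
}

# Usage as site user:
# proxy_log_findings
```

Comparing the timestamps of the matching lines with the peaks in the graphs can show whether the proxy was struggling at the same time.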
Info |
---|
If the map takes a lot of time to open, you might need to debug further. In this case, we recommend checking the Livestatus queries while reloading the map. |
Network analysis
To see how long the map really takes to load, we recommend using the network analyzer of your internet browser. See Enable Checkmk profiling, section "Network Analyze with the internet browser".
Debugging with Livestatus
Enable the debug log
See How to collect troubleshooting data for various issue types, section "Livestatus Proxy".
Debug with the lq queries
Info |
---|
The best way to debug with the lq queries is:
|
Detect long-running lq queries.
Do you see any of the following?
- a larger lq query
- a log query
- a periodic message
You can try to execute such a query via the network and see how long it takes. See Livestatus Queries, section "Livestatus queries over network".
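As a rough illustration of timing a query over the network, the sketch below sends a request to a Livestatus TCP port and reports the number of response lines and the elapsed time. It rests on several assumptions: the function names are hypothetical, the remote site exposes Livestatus over plain TCP (6557 is commonly used, but check your site), bash is available for `/dev/tcp`, and GNU `date` is available for `%N`. If the Livestatus port is encrypted, this plain TCP approach will not work as written.

```bash
# Build a Livestatus request: one header per line, terminated by an empty line.
lq_request() {
    printf '%s\n' "$@"
    printf '\n'
}

# Send a request to a remote Livestatus TCP port; print row count and duration.
# $1: host, $2: port, remaining arguments: request lines.
# Requires bash (/dev/tcp) and GNU date (%N); assumes an unencrypted port.
time_lq_over_network() {
    host="$1"; port="$2"; shift 2
    start=$(date +%s%N)
    exec 3<>"/dev/tcp/$host/$port" || return 1
    lq_request "$@" >&3
    rows=$(wc -l <&3)    # reads until the server closes the connection
    exec 3<&-
    end=$(date +%s%N)
    echo "rows=$rows elapsed_ms=$(( (end - start) / 1000000 ))"
}

# Usage (host name is a placeholder):
# time_lq_over_network remote.example.com 6557 'GET downtimes' 'Columns: id'
```

Running the same query locally on the remote site and comparing the two timings can help separate network latency from query cost.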
One example
Infrastructure
Info |
---|
OS: Ubuntu 20.04; Version: Checkmk 1.6.0p24; Sites: one central and one remote |
The map
This is a dynamic map with my remote site as a backend. I created and accessed the map via the central site.
The Debugging
This approach applies only if you are running a distributed setup. In that case, you can run the following command on the central site.
...
The whole logfile: lq_nagvis.txt
What I noticed in the logfile
- A significant number of lq "GET downtimes" commands during the map reload.
  Counting the "GET downtimes" lines gives 4836:
  OMD[mysite]:~$ cat /tmp/lq_nagvis.txt | grep "GET downtimes" | wc -l
  4836
- All the other commands look small and reasonable.
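The same counting idea can be extended to all tables at once, which makes an outlier such as "GET downtimes" stand out without knowing in advance what to look for. This is a minimal sketch: the function name `count_lq_tables` is hypothetical, and it assumes each query appears in the captured log as `GET <table>`, as the lines counted above do.

```bash
# Summarize how often each Livestatus table is queried in a captured log.
# $1: path to the captured log, e.g. /tmp/lq_nagvis.txt
# Output: "<count> <table>", most frequent table first.
count_lq_tables() {
    grep -oE 'GET [a-z]+' "$1" | sort | uniq -c | sort -rn | awk '{print $1, $3}'
}

# Usage as site user:
# count_lq_tables /tmp/lq_nagvis.txt
```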
Further debugging
I noticed a lot of "GET downtimes" queries in the log. Whenever I reload the map, my central site sends thousands of queries via Livestatus.
When I checked my Checkmk site, I found that I had set several host downtimes. This could explain why my central site collects all downtimes before NagVis shows the map.
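To get a feeling for how many downtimes are involved, they can be listed per host with a Livestatus query. This is a sketch under assumptions: the function name `downtimes_per_host` is hypothetical, the `lq` command is assumed to be available in the site user context and to read the query from standard input, and host names are assumed to contain no spaces.

```bash
# Count the currently set downtimes per host, highest count first.
# Assumes the site's lq command reads the query from standard input.
downtimes_per_host() {
    printf 'GET downtimes\nColumns: host_name\n' | lq \
        | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Usage as site user:
# downtimes_per_host
```

A large total here, combined with the number of "GET downtimes" queries in the log, would support the explanation above.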
The Workaround
- Remove all downtimes. The map will then open faster.
- Access the map directly via the remote site/local site.
Info |
---|
We fixed this behavior with Checkmk 2.0. Downtimes no longer affect the reload time of the map. |
...