...

Getting Started

Background information on this subject is available in our official documentation.

Basics

Check the OMD <SITENAME> performance graphs of the affected

...

Info

If the map takes a lot of time to open, you might need to debug further. In this case, we recommend checking the Livestatus queries while reloading the map.

Network analysis

To see how long the map really takes to load, we recommend using the network analyzer of your internet browser. See "Network Analyze with the internet browser" in Enable Checkmk profiling.

Debugging with Livestatus

Enable the debug log

See "Livestatus Proxy" in How to collect troubleshooting data for various issue types.

Debug with the lq queries

Info

The best way to debug with the lq queries is:

  1. tail -f ~/var/log/liveproxyd.log >/path/to/file.txt
  2. Reload the NagVis map
  3. Stop tail with Ctrl+C and analyze the file
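Because tail -f runs until it is interrupted, a time-limited capture can be convenient. The following is only a sketch: the function name capture_log and the 60-second window in the usage line are illustrative assumptions, and it relies on the timeout command from GNU coreutils being available.

```shell
# Hypothetical helper: follow a log file for a fixed number of seconds
# and write only the lines that arrive during that window to a second file.
capture_log() {
    src=$1      # log file to follow, e.g. ~/var/log/liveproxyd.log
    dest=$2     # file that receives the captured lines
    seconds=$3  # length of the capture window in seconds

    status=0
    # -n 0 skips the existing content, so only new lines are captured
    timeout "$seconds" tail -n 0 -f "$src" > "$dest" || status=$?

    # timeout exits with 124 when the time limit ended the capture,
    # which is the expected outcome here
    if [ "$status" -eq 124 ]; then
        return 0
    fi
    return "$status"
}
```

For example, start capture_log ~/var/log/liveproxyd.log /path/to/file.txt 60 and reload the map while the capture is running.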

Detect long-running lq queries

Do you see any of the following?

  • An unusually large lq query
  • A log query
  • A message that repeats periodically
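To get a quick overview of the captured file, it can help to tally the requests per Livestatus table. This is a minimal sketch, assuming only that each query appears in the log as "GET <table>" somewhere on a line. The function name tally_lq_tables is hypothetical.

```shell
# Hypothetical helper: count Livestatus requests per table in a captured log.
# Assumes queries show up as "GET <table>" somewhere in a log line.
tally_lq_tables() {
    # -o prints only the matched "GET <table>" part of each line;
    # sort + uniq -c counts identical matches, sort -rn puts the
    # most frequent table first
    grep -o 'GET [a-z][a-z]*' "$1" | sort | uniq -c | sort -rn
}
```

Running tally_lq_tables /path/to/file.txt prints one line per table, most frequent first. A table that dominates the list is a good candidate for a closer look.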

...

See "Livestatus queries over network" in Livestatus queries.

One example

Infrastructure

Info

OS: Ubuntu 20.04

Version: Checkmk 1.6.0p24

Sites: one central and one remote


The map

This is a dynamic map with my remote site as a backend. I created and accessed the map via the central site.

Screenshot showing the web inspector


The Debugging

This approach applies only if you are running a distributed setup. In that case, run the command on the central site.

...

The whole logfile: lq_nagvis.txt


What I noticed in the logfile

  • A large number of lq "GET downtimes" queries during the map reload
  • Counting the "GET downtimes" lines gives 4836:

    OMD[mysite]:~$ cat /tmp/lq_nagvis.txt |grep "GET downtimes" |wc -l
    4836


  • All the other commands look small and reasonable.


Further debugging

I noticed a lot of "GET downtimes" queries in the log. Whenever I reload the map, my central site sends thousands of queries via Livestatus.

When I checked my Checkmk site, I saw that I had set several host downtimes. This could explain why my central site collects all downtimes before NagVis shows the map.

Screenshot showing the "All hosts" page


The Workaround

  1. Remove all downtimes. The map will open faster.

  2. Access the map directly via the remote site, where the backend is local.

Info

We fixed this behavior with Checkmk 2.0. Downtimes no longer affect the reload time of the map.


...