Info

This might be caused by a bug that was fixed in 2.0.0p9: Werk #12663: Fixed steadily rising CPU load due to misconfigured dashboard dashlets

Status: Last tested on Checkmk 2.0.0p1


Problem

In a distributed setup, you encounter the following problem on the central site:

The GUI on the central site randomly becomes very slow, and "liveproxyd.log" fills up with hundreds of entries like this:

2021-09-30 16:24:45,502 [40] [cmk.liveproxyd.(5252).Site(<remote_site_name>).Client(81)] Reading request from client took longer than 5s
Traceback (most recent call last):
  File "/omd/sites/mysite/lib/python3/cmk/cee/liveproxy/Client.py", line 150, in receive_new_request
    request = self._receive_request()
  File "/omd/sites/mysite/lib/python3/cmk/cee/liveproxy/Client.py", line 339, in _receive_request
    raise ClientRequestTimeoutException(
cmk.cee.liveproxy.Client.ClientRequestTimeoutException: Reading request from client took longer than 5s
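To get a feeling for how many of these entries there are and which remote sites they belong to, a small script can help. The following is only a minimal sketch, not an official Checkmk tool: it assumes the log line layout shown above (a "Site(...)" tag before the message), and the function name count_timeouts is made up for this example.

```python
import re
import sys
from collections import Counter

# Message text taken from the log excerpt above.
TIMEOUT_MARKER = "Reading request from client took longer than"
# Assumption: the site name appears as ".Site(<name>)" in the logger tag.
SITE_PATTERN = re.compile(r"\.Site\((?P<site>[^)]*)\)")


def count_timeouts(lines):
    """Count client request timeouts per remote site (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        if TIMEOUT_MARKER not in line:
            continue
        match = SITE_PATTERN.search(line)
        if match is None:
            # The exception line at the end of the traceback repeats the
            # message without a site tag; skip it to avoid double counting.
            continue
        counts[match.group("site")] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path to liveproxyd.log as the first argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for site, number in count_timeouts(log_file).most_common():
            print(f"{site}: {number}")
```

Run it as the site user and pass the path to the log file as the first argument (the file is usually found below ~/var/log/ in the site, but check your installation). A site with a much higher count than the others is a good place to start looking.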

Solution

...

...

Livestatus appears to be bombarded with hundreds of requests per second.

You will probably find this message in the log:

Reading request from client took longer than 5s


This is caused by the GUI, which establishes a connection with Liveproxyd but then does not send any query.

...

This could indicate an overload of either the GUI or Liveproxyd.



If the werk mentioned above is the cause, this workaround might also help before installing 2.0.0p9:

  1. Remove all Host/Service Statistics Snapins from all dashboards.
  2. Save the dashboards.
  3. Re-add the Snapins.
  4. Reload all dashboards in the browser.

If this does not work, please open a ticket and attach the ~/var/log/apache/access_log of some of the affected sites, which might show many entries like "ajax_figure_dashlet_data".
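Before opening a ticket, you can check yourself whether the access log is dominated by such requests. The sketch below counts matching lines per minute. It assumes the usual Apache timestamp layout ("[30/Sep/2021:16:24:45 +0200]"), which may differ if the log format has been customized; the function name dashlet_requests_per_minute is hypothetical.

```python
import re
import sys
from collections import Counter

# Assumption: Apache common/combined log format timestamps,
# e.g. "[30/Sep/2021:16:24:45 +0200]". Captures everything up to the minute.
MINUTE_PATTERN = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}):\d{2}")


def dashlet_requests_per_minute(lines, needle="ajax_figure_dashlet_data"):
    """Count access_log lines containing `needle`, grouped by minute."""
    per_minute = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = MINUTE_PATTERN.search(line)
        # Lines without a recognizable timestamp are still counted.
        key = match.group(1) if match else "unknown"
        per_minute[key] += 1
    return per_minute


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path to the access_log as the first argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        # Show the ten busiest minutes first.
        for minute, number in dashlet_requests_per_minute(log_file).most_common(10):
            print(f"{minute}  {number}")
```

Minutes with hundreds or thousands of matching requests would fit the behavior described in the werk mentioned above. Lower numbers do not rule it out, so attach the log to the ticket in any case.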
