Info

The problem described on this page might be caused by a bug that was fixed in Checkmk 2.0.0p9: https://checkmk.com/werk/12663

Status: Last tested on Checkmk 2.0.0p1

Problem

In a distributed setup, you encounter the following problems on the central site:

The GUI on the central site randomly becomes very slow, and the "liveproxyd.log" gets filled with hundreds of these entries:

2021-09-30 16:24:45,502 [40] [cmk.liveproxyd.(5252).Site(<remote_site_name>).Client(81)] Reading request from client took longer than 5s
Traceback (most recent call last):
  File "/omd/sites/mysite/lib/python3/cmk/cee/liveproxy/Client.py", line 150, in receive_new_request
    request = self._receive_request()
  File "/omd/sites/mysite/lib/python3/cmk/cee/liveproxy/Client.py", line 339, in _receive_request
    raise ClientRequestTimeoutException(
cmk.cee.liveproxy.Client.ClientRequestTimeoutException: Reading request from client took longer than 5s

Solution

Livestatus appears to be flooded with hundreds of requests per second.

You will probably find this message in the log:

Reading request from client took longer than 5s


This is caused by the GUI, which establishes a connection with the Liveproxyd but then does not send any query. This could indicate an overload of either the GUI or Liveproxyd.
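
To gauge how often the timeout occurs and which remote sites are affected, the log entries can be counted per site. The following is a minimal sketch, not an official Checkmk tool: the function name count_timeouts is hypothetical, and the regular expression assumes that the log lines look exactly like the example shown in the Problem section.

```python
import re
from collections import Counter

# Assumption: the log line format matches the example in this article, e.g.
# "... [cmk.liveproxyd.(5252).Site(<name>).Client(81)] Reading request from
# client took longer than 5s". Other Checkmk versions may log differently.
TIMEOUT_RE = re.compile(
    r"Site\((?P<site>[^)]*)\)\.Client\(\d+\)\] "
    r"Reading request from client took longer than \d+s"
)


def count_timeouts(lines):
    """Count client request timeout entries per remote site.

    Only the timestamped log line is matched. The traceback line that
    repeats the message has no "Site(...)" part and is therefore skipped,
    so each incident is counted once.
    """
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("site")] += 1
    return counts
```

The function accepts any iterable of lines, so an open file handle for the site's liveproxyd.log can be passed directly. A site with a much higher count than the others is a reasonable place to start looking.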


If the werk mentioned above is the cause, this workaround might also help until you can update to 2.0.0p9:

  1. Remove all Host/Service Statistics Snapins from all dashboards.
  2. Save the dashboards.
  3. Re-add the Snapins.
  4. Reload all dashboards in the browser.

If this does not work, please open a support ticket and attach the ~/var/log/apache/access_log from some of the affected sites. The log might show many entries containing "ajax_figure_dashlet_data".
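
Before opening a ticket, it can help to check how many of these requests actually appear in the access log. The following is a minimal sketch under stated assumptions: the function name dashlet_requests_per_minute is hypothetical, and the timestamp pattern assumes the usual Apache access log format with a bracketed timestamp such as [30/Sep/2021:16:24:45 +0200].

```python
import re
from collections import Counter

# String to look for, as named in this article.
MARKER = "ajax_figure_dashlet_data"

# Assumption: timestamps use the common Apache format
# "[dd/Mon/yyyy:HH:MM:SS +zzzz]". The capture group keeps everything up to
# the minute so that requests can be grouped per minute.
MINUTE_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} ")


def dashlet_requests_per_minute(lines):
    """Count access log entries containing MARKER, grouped by minute."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        match = MINUTE_RE.search(line)
        # Lines without a recognizable timestamp are still counted,
        # under the key "unknown".
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

Passing an open file handle for the access_log works, since the function only iterates over lines. Calling most_common() on the result lists the busiest minutes first.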
