This behavior might be caused by a bug that was fixed in Checkmk 2.0.0p9: https://checkmk.com/werk/12663
LAST TESTED ON CHECKMK 2.0.0P1
Problem
In a distributed setup, you encounter the following problem on the central site:
The GUI on the central site randomly becomes very slow, and "liveproxyd.log" fills up with hundreds of entries like this one:
2021-09-30 16:24:45,502 [40] [cmk.liveproxyd.(5252).Site(<remote_site_name>).Client(81)] Reading request from client took longer than 5s
Traceback (most recent call last):
  File "/omd/sites/mysite/lib/python3/cmk/cee/liveproxy/Client.py", line 150, in receive_new_request
    request = self._receive_request()
  File "/omd/sites/mysite/lib/python3/cmk/cee/liveproxy/Client.py", line 339, in _receive_request
    raise ClientRequestTimeoutException(
cmk.cee.liveproxy.Client.ClientRequestTimeoutException: Reading request from client took longer than 5s
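To get a rough idea of how often the timeout occurs and which remote sites are affected, the entries can be counted per site. The following is a minimal sketch, not an official Checkmk tool. The function names are hypothetical, the regular expression is derived only from the sample entry above, and the log location ~/var/log/liveproxyd.log is an assumption about the site layout.

```python
import re
from collections import Counter

# Matches only the header line of an entry, e.g.
# "[cmk.liveproxyd.(5252).Site(remote1).Client(81)] Reading request from client took longer than 5s".
# The pattern is derived from the sample entry in this article (assumption).
TIMEOUT_RE = re.compile(
    r"Site\((?P<site>[^)]*)\)\.Client\(\d+\)\] "
    r"Reading request from client took longer than"
)


def count_timeouts(lines):
    """Return a Counter mapping remote site name -> number of timeout entries."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("site")] += 1
    return counts


def count_timeouts_in_file(path):
    """Read a liveproxyd.log file and count timeout entries per site."""
    with open(path, errors="replace") as handle:
        return count_timeouts(handle)
```

As the site user, this could be called as count_timeouts_in_file(os.path.expanduser("~/var/log/liveproxyd.log")) after importing os. The exception line at the end of each traceback is deliberately not counted, so every entry is counted once.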
Solution
Livestatus appears to be flooded with hundreds of requests per second.
In the log, you will probably find this message:
Reading request from client took longer than 5s
This is caused by the GUI, which establishes a connection with the Liveproxyd but then does not send any query. This could indicate an overload of either the GUI or Liveproxyd.
If the bug described in the Werk mentioned above is the cause, the following workaround might help until you can update to 2.0.0p9:
- Remove all Host/Service Statistics Snapins from all dashboards
- Save the dashboards
- Re-add the Snapins
- Then reload all dashboards in the browser.
If this does not help, please open a support ticket and attach the ~/var/log/apache/access_log from some of the affected sites. The log might show many entries like
"ajax_figure_dashlet_data"
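Before opening a ticket, it can help to check how frequently these requests appear. The sketch below counts access log lines containing "ajax_figure_dashlet_data" per minute. It assumes the Apache common/combined log format with a timestamp such as [30/Sep/2021:16:24:45 +0200]. Lines without a recognizable timestamp are grouped under "unknown". The function name is hypothetical.

```python
import re
from collections import Counter

NEEDLE = "ajax_figure_dashlet_data"

# Captures the timestamp up to the minute, e.g. "30/Sep/2021:16:24".
# Assumes the Apache common/combined log timestamp layout.
MINUTE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}):\d{2} ")


def dashlet_requests_per_minute(lines):
    """Return a Counter mapping minute -> number of matching dashlet requests."""
    counts = Counter()
    for line in lines:
        if NEEDLE not in line:
            continue
        match = MINUTE_RE.search(line)
        minute = match.group(1) if match else "unknown"
        counts[minute] += 1
    return counts
```

Passing an open file handle for ~/var/log/apache/access_log to dashlet_requests_per_minute and printing the result of most_common(10) would list the busiest minutes. A consistently high count per minute would match the request flood described above.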