Troubleshooting "cmk.cee.liveproxy.Client.ClientRequestTimeoutException" error

This error might be caused by a bug that was fixed in Checkmk 2.0.0p9 (Werk #12663: Fixed steadily rising CPU load due to misconfigured dashboard dashlets).

Last tested on Checkmk 2.0.0p1

Problem

In a distributed setup, you encounter the following problem on the central site:

The GUI on the central site randomly becomes very slow, and the "liveproxyd.log" gets filled with hundreds of these entries:

2021-09-30 16:24:45,502 [40] [cmk.liveproxyd.(5252).Site(<remote_site_name>).Client(81)] Reading request from client took longer than 5s
Traceback (most recent call last):
  File "/omd/sites/mysite/lib/python3/cmk/cee/liveproxy/Client.py", line 150, in receive_new_request
    request = self._receive_request()
  File "/omd/sites/mysite/lib/python3/cmk/cee/liveproxy/Client.py", line 339, in _receive_request
    raise ClientRequestTimeoutException(
cmk.cee.liveproxy.Client.ClientRequestTimeoutException: Reading request from client took longer than 5s
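
To gauge how often these entries occur and which remote site they refer to, the log can be summarized per minute. The following is a minimal sketch, not an official Checkmk tool: the function name count_timeouts is hypothetical, and the pattern assumes that every entry header looks like the one shown above.

```python
import re
from collections import Counter

# Matches only the header line of a timeout entry (the line that starts
# with a timestamp), so the traceback lines below it are not counted twice.
TIMEOUT_RE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d+ .*?"
    r"Site\((?P<site>[^)]*)\).*Reading request from client took longer than"
)


def count_timeouts(lines):
    """Count timeout entries per (minute, remote site name)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[(match["minute"], match["site"])] += 1
    return counts
```

The function accepts any iterable of lines, so an open file object for liveproxyd.log can be passed to it directly. The location of that file (for example var/log/liveproxyd.log below the site directory) is an assumption and should be checked on your installation.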

Solution

Livestatus appears to be flooded with hundreds of requests per second.

The log will most likely contain this message:

Reading request from client took longer than 5s


This message appears when the GUI establishes a connection to Liveproxyd but then does not send a query within 5 seconds. This could indicate an overload of either the GUI or Liveproxyd.
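
The behaviour described above can be illustrated with a small, self-contained sketch. It is not Liveproxyd's actual implementation: read_request and demo are hypothetical names, a local socket pair stands in for the GUI and the proxy, and the timeout is shortened so the example finishes quickly.

```python
import socket


def read_request(conn, timeout_s):
    """Return the received bytes, or None if nothing arrives within timeout_s."""
    conn.settimeout(timeout_s)
    try:
        return conn.recv(4096)
    except socket.timeout:
        # The peer is connected but sent nothing in time.
        return None


def demo(timeout_s=0.2):
    # socketpair() returns two already-connected endpoints; here they stand
    # in for the GUI (client) and the proxy (server).
    client, server = socket.socketpair()
    try:
        # Case 1: the client is connected but silent, so the read times out.
        idle = read_request(server, timeout_s)
        # Case 2: the client sends a Livestatus query, so the read returns it.
        client.sendall(b"GET hosts\n\n")
        answered = read_request(server, timeout_s)
        return idle, answered
    finally:
        client.close()
        server.close()
```

In this sketch, the first read corresponds to the situation that produces the log entry: the connection exists, but no query arrives before the timeout expires.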


If the bug fixed by Werk #12663 is the cause, the following workaround might help until you can update to 2.0.0p9:

  1. Remove all Host/Service Statistics Snapins from all dashboards.
  2. Save the dashboards.
  3. Re-add the Snapins.
  4. Reload all dashboards in the browser.

If this does not help, please open a ticket and attach the ~/var/log/apache/access_log from some of the affected sites, which might show many entries like

"ajax_figure_dashlet_data"
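
Before opening a ticket, it can be useful to check how frequent these entries are. The following is a minimal sketch with a hypothetical function name. It assumes the access log uses the bracketed timestamp of Apache's common or combined log format; lines without such a timestamp are counted under "unknown".

```python
import re
from collections import Counter

MARKER = "ajax_figure_dashlet_data"  # the string named in the text above

# Assumption: timestamps look like [30/Sep/2021:16:24:45 +0200].
# The group captures the timestamp up to the minute.
MINUTE_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2})")


def dashlet_requests_per_minute(lines, marker=MARKER):
    """Count log lines containing the marker, grouped by minute."""
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue
        match = MINUTE_RE.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

Passing an open file object for ~/var/log/apache/access_log to the function returns a Counter. Calling most_common() on it lists the busiest minutes first.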