Troubleshooting the agent controller

Checkmk 2.1 introduced the agent controller, which secures agent communication with TLS. Detailed docs can be found here:

LAST TESTED ON CHECKMK 2.2.0P1

Agent controller - Connection refused

Problem

This section shows how to debug the following error:

Screenshot showing an error of Communication failed. Connection refused.


Run the following commands to check the state of the agent controller and of port 6556:


For Linux

root@mylinuxhost:~# ss -tulpn | grep 6556

root@mylinuxhost:~# ps waux | grep cmk-agent-ctl

root@mylinuxhost:~# cmk-agent-ctl status

root@mylinuxhost:~# systemctl status check-mk-agent.socket

root@mylinuxhost:~# systemctl status cmk-agent-ctl-daemon.service
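
The listener check above can be narrowed down to a single answer. The following is a minimal sketch, not part of Checkmk: port_listener is a hypothetical helper that reads ss output on standard input and prints the name of the process bound to the given port. It assumes the users:(("name",pid=...)) process column that ss prints with -p when run as root.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Checkmk): print the name of the
# process that listens on a given port, based on `ss -tlpn` or `ss -tulpn`
# output read from standard input.
port_listener() {
    port=$1
    awk -v port="$port" '
        {
            # A listening socket has a local address field ending in ":<port>".
            hit = 0
            for (i = 1; i <= NF; i++)
                if ($i ~ (":" port "$")) hit = 1
            # Extract the first process name from users:(("name",pid=...)).
            if (hit && match($0, /users:\(\("[^"]+"/))
                print substr($0, RSTART + 9, RLENGTH - 10)
        }' | sort -u
}

# Usage (as root, so that ss can show process names):
#   ss -tlpn | port_listener 6556
```

If the helper prints nothing, this suggests that no process is listening on the port, which would be consistent with the Connection refused error.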


For Windows

Check whether the agent port 6556 falls within an excluded port range:

netsh interface ipv4 show excludedportrange protocol=tcp

Check the open and listening ports with netstat:

netstat -anb > output.txt

Solution

For Linux

One possible solution is to follow the steps in the article "Troubleshooting Checkmk agent systemd service repeatedly failing on CentOS 7".

Registration with cmk-agent-ctl is not working

Problem

After installing the agent, registration fails with the following error:


root@mylinuxhost:~# cmk-agent-ctl register
ERROR [cmk_agent_ctl] Something seems wrong with the agent socket (/run/check-mk-agent.socket), aborting


Solution

Verify that the agent controller is running:

root@mylinuxhost:~# ss -tulpn | grep 6556

root@mylinuxhost:~# ps waux | grep cmk-agent-ctl

root@mylinuxhost:~# cmk-agent-ctl status

root@mylinuxhost:~# systemctl status check-mk-agent.socket

root@mylinuxhost:~# systemctl status cmk-agent-ctl-daemon.service


In this situation, the agent controller is not running because the agent is served by xinetd. The agent controller requires systemd: https://checkmk.com/werk/13865

If you're using the bakery, you need to create the following rule:

Screenshot of adding a new rule for Checkmk agent network service. Host tags are set to OS type is Linux.

Without the bakery, please follow these steps: https://docs.checkmk.com/latest/en/agent_linux_legacy.html#_systemd
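
To see whether the socket named in the error message exists at all, a quick file test can help. This is a minimal sketch: socket_state is a hypothetical helper, not a Checkmk command, and the only path it is meant for here is the one quoted in the error message above.

```shell
#!/bin/sh
# Hypothetical helper: classify a path as a Unix domain socket ("socket"),
# something else that exists ("other"), or nothing at all ("missing").
socket_state() {
    if [ -S "$1" ]; then
        echo socket
    elif [ -e "$1" ]; then
        echo other
    else
        echo missing
    fi
}

# Usage with the socket path from the error message:
#   socket_state /run/check-mk-agent.socket
```

A result other than socket would match the "Something seems wrong with the agent socket" message and points back to the systemd units checked above.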


ERROR [cmk_agent_ctl] Failed to discover agent receiver port from Checkmk REST API, both with http and https.

Problem

You encounter this error when registering the agent controller:

root@mylinuxhost:~# cmk-agent-ctl register -H <host> -s <checkmk-server> -i <site> -U <username>
ERROR [cmk_agent_ctl] Failed to discover agent receiver port from Checkmk REST API, both with http and https.

Error with http:
Failed to discover agent receiver port from http://<checkmk-server>/<site>/check_mk/api/1.0/domain-types/internal/actions/discover-receiver/invoke
error sending request for url (http://<checkmk-server>/<site>/check_mk/api/1.0/domain-types/internal/actions/discover-receiver/invoke): error trying to connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914: (unable to get local issuer certificate)
error trying to connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914: (unable to get local issuer certificate)
error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914: (unable to get local issuer certificate)
error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914:

Error with https:
Failed to discover agent receiver port from https://<checkmk-server>/<site>/check_mk/api/1.0/domain-types/internal/actions/discover-receiver/invoke
error sending request for url (https://<checkmk-server>/<site>/check_mk/api/1.0/domain-types/internal/actions/discover-receiver/invoke): error trying to connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914: (unable to get local issuer certificate)
error trying to connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914: (unable to get local issuer certificate)
error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914: (unable to get local issuer certificate)
error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914:

Solution #1

Specify the agent receiver port explicitly when registering. The default port is 8000; additional sites count up from there (8001, ...), similar to how the site Apache ports count up from 5000:

root@mylinuxhost:~# cmk-agent-ctl register -H <host> -s <checkmk-server>:8000 -i <site> -U <username>

Solution #2

Add the server's self-signed certificate to the operating system's certificate store. Example for Ubuntu: https://ubuntu.com/server/docs/security-trust-store
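
After changing the certificate store, the failing lookup can be reproduced outside the agent controller. The following is a sketch under assumptions: discovery_url is a hypothetical helper that only assembles the URL shown in the error output above, and myserver and mysite are placeholders. What the endpoint returns may depend on the Checkmk version.

```shell
#!/bin/sh
# Hypothetical helper: build the REST API URL that appears in the error
# output of cmk-agent-ctl for the receiver port discovery.
discovery_url() {
    scheme=$1
    server=$2
    site=$3
    printf '%s://%s/%s/check_mk/api/1.0/domain-types/internal/actions/discover-receiver/invoke\n' \
        "$scheme" "$server" "$site"
}

# Usage (myserver and mysite are placeholders):
#   curl --silent --show-error "$(discovery_url https myserver mysite)"
```

By default, curl verifies the server certificate, so a remaining trust problem would typically show up as a certificate error (curl exit code 60) rather than as a response from the site.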


ERROR [cmk_agent_ctl] Error while loading registered connections.

Detailed error message:

ERROR [cmk_agent_ctl] Error while loading registered connections.

Caused by:
    Failed to split into server and port at ':' at line 4 column 24


If you see this error message when trying to work with any subcommand of cmk-agent-ctl, there is probably something wrong with the file /var/lib/cmk-agent/registered_connections.json.

Solution

First, move the file registered_connections.json to registered_connections.json.bak and re-run the command. If that works, you can start checking the content of the file. If you still need the registration data stored in the file, check the line and column in the error message and try to repair it. If you don't need it anymore, delete the file.
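
The first step above can be sketched as a small shell function. backup_aside is a hypothetical helper, not part of Checkmk; it moves a file to <file>.bak and refuses to overwrite a backup that already exists, so an earlier copy of the registration data is not lost by accident.

```shell
#!/bin/sh
# Hypothetical helper: move a file aside as <file>.bak.
# Returns 1 if the file does not exist, 2 if a backup is already present.
backup_aside() {
    src=$1
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi
    if [ -e "$src.bak" ]; then
        echo "backup already exists: $src.bak" >&2
        return 2
    fi
    mv -- "$src" "$src.bak"
}

# Usage (as root), with the path named in the text above:
#   backup_aside /var/lib/cmk-agent/registered_connections.json && cmk-agent-ctl status
```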

Request failed with code 403 Forbidden: Unauthorized - Details: Unauthorized to read the global settings.

Detailed error message:

cmk-agent-ctl.exe register --server myserver --site mysite --user cmkadmin --password mypwd --hostname myhost
Attempting to register at myserver, port 8000. Server certificate details:

PEM-encoded certificate:
-----BEGIN CERTIFICATE-----
…
-----END CERTIFICATE-----

Issued by:
Site 'mysite' local CA
Issued to:
mysite
Validity:
From Mon, 12 Jun 2023 16:20:18 +0000
To Wed, 12 Jun 3022 16:20:18 +0000

Do you want to establish this connection? [Y/n]

Y
[2023-07-21 15:04:06.714043 +02:00] ERROR [cmk_agent_ctl] src\main.rs:29: Error registering existing host at https://myserver:8000/mysite

Caused by:
Request failed with code 403 Forbidden: Unauthorized - Details: Unauthorized to read the global settings


Solution

This error occurs when the automation user, which is used internally by the agent controller to gather more information, does not have the admin role. Assign the admin role to the automation user and repeat the registration.