Info
This article helps debug issues with various Checkmk special agents.

Status

colour	Green
title	LAST TESTED ON CHECKMK 2.3.0P1

Panel

borderColor	black
bgColor	#f8f8f8
title	Table of Contents

Table of Contents

...

Code Block

language	bash
theme	RDark
title	For Checkmk 1.6 and below, 2.0 & 2.1

echo '{"access_key_id": "xxxxxxxxxxxxxxxx", "secret_access_key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}' | ~/share/check_mk/agents/special/agent_aws '--regions' 'eu-central-1' '--services' 'cloudwatch_alarms' 'dynamodb' 'ebs' 'ec2' 'elb' 'elbv2' 'glacier' 'rds' 's3' 'wafv2' '--ec2-limits' '--ebs-limits' '--s3-limits' '--glacier-limits' '--elb-limits' '--elbv2-limits' '--rds-limits' '--cloudwatch_alarms-limits' '--dynamodb-limits' '--wafv2-limits' '--cloudwatch-alarms' '--wafv2-cloudfront' '--hostname' 'aws' --debug --v

...

Code Block

language	bash
theme	RDark
title	For Checkmk 2.2 and above

/omd/sites/mysite/share/check_mk/agents/special/agent_aws --access-key-id MYACCESSKEYID --secret-access-key MYSECRETACCESSKEY --regions MYAWSREGION --global-services ce cloudfront route53 --services cloudwatch_alarms dynamodb ebs ec2 ecs elasticache elb elbv2 glacier lambda rds s3 sns wafv2 --ec2-limits --ebs-limits --s3-limits --glacier-limits --elb-limits --elbv2-limits --rds-limits --cloudwatch_alarms-limits --dynamodb-limits --wafv2-limits --lambda-limits --sns-limits --ecs-limits --elasticache-limits --s3-requests --cloudwatch-alarms --wafv2-cloudfront --cloudfront-host-assignment aws_host --hostname aws --piggyback-naming-convention ip_region_instance

Additional debugging

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ /omd/sites/mysite/share/check_mk/agents/special/agent_aws --access-key-id MYACCESSKEYID --secret-access-key MYSECRETACCESSKEY --regions MYAWSREGION --global-services ce cloudfront route53 --ecs-limits --hostname MYHOSTNAME --services cloudwatch_alarms dynamodb ebs ec2 ecs elasticache elb elbv2 glacier lambda rds s3 sns wafv2 --ec2-limits --ebs-limits --s3-limits --glacier-limits --elb-limits --elbv2-limits --rds-limits --cloudwatch_alarms-limits --dynamodb-limits --wafv2-limits --piggyback-naming-convention ip_region_instance --debug --verbose --vcrtrace trace.txt &> debug.log

Two files should be created in the current working directory: "trace.txt" and "debug.log". Please verify that both files were populated and send them to us.
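
The check that both files were populated can be scripted. The following is a minimal sketch and not part of Checkmk: check_populated is a hypothetical helper name, and it only tests that each file exists and is not empty.

```shell
# Hypothetical helper (not shipped with Checkmk): report whether each
# given file exists and contains at least one byte.
check_populated() {
    rc=0
    for file in "$@"; do
        if [ -s "$file" ]; then
            printf 'ok: %s\n' "$file"
        else
            printf 'missing or empty: %s\n' "$file" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Calling check_populated trace.txt debug.log returns a non-zero exit status if either file is missing or empty.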

Viewing AWS Options

To view more AWS options, use cmk -D aws combined with grep:

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ cmk -D aws |grep -A2  "Type of agent"

Type of agent:          
  Program: /omd/sites/mysite/share/check_mk/agents/special/agent_aws --access-key-id MYACCESSKEY --secret-access-key MYSECRETKEY --regions MYAWSREGION --global-services ce cloudfront route53 --services cloudwatch_alarms dynamodb ebs ec2 ecs elasticache elb elbv2 glacier lambda rds s3 sns wafv2 --ec2-limits --ebs-limits --s3-limits --glacier-limits --elb-limits --elbv2-limits --rds-limits --cloudwatch_alarms-limits --dynamodb-limits --wafv2-limits --lambda-limits --sns-limits --ecs-limits --elasticache-limits --s3-requests --cloudwatch-alarms --wafv2-cloudfront --cloudfront-host-assignment aws_host --hostname aws --piggyback-naming-convention ip_region_instance

...

Info

title	More information

Monitoring Microsoft Azure: https://docs.checkmk.com/latest/en/monitoring_azure.html

Troubleshooting Microsoft Azure - "Graph client: Insufficient privileges to complete the operation" error

...

The first step would be to find the complete command of the Kubernetes special agent.

The command can be found under "Type of agent >> Program." It will consist of multiple parameters depending on how the datasource program rule has been configured.

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ cmk -D k8s | more

k8s 
Addresses: No IP
Tags: [address_family:no-ip], [agent:special-agents], [criticality:prod], [networking:lan],
[piggyback:auto-piggyback], [site:a21], [snmp_ds:no-snmp], [tcp:tcp]
Labels: [cmk/kubernetes/cluster:at], [cmk/kubernetes/object:cluster], [cmk/site:k8s]
Host groups: check_mk
Contact groups: all
Agent mode: No Checkmk agent, all configured special agents
Type of agent: 
Program: /omd/sites/mysite/share/check_mk/agents/special/agent_kube '--cluster' 'k8s' '--token' 'xyz' '--monitored-objects' 'deployments' 'daemonsets' 'statefulsets' 'nodes' 'pods' '--api-server-endpoint' 'https://<YOUR-IP>:6443' '--api-server-proxy' 'FROM_ENVIRONMENT' '--cluster-collector-endpoint' 'https://<YOUR-ENDPOINT>:30035' '--cluster-collector-proxy' 'FROM_ENVIRONMENT'
Process piggyback data from /omd/sites/mysite/tmp/check_mk/piggyback/k8s
Services:
...

Note

An easier way would be this command: /bin/sh -c "$(cmk -D k8s | grep -A1 "^Type of agent:" | grep "Program:" | cut -f2- -d':')"

Please note that if a line matching "^Type of agent:" followed by a line matching "^ Program:" exists more than once, the output might be messed up.
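
The caveat above can be guarded against with a small filter. This is a sketch under assumptions, not a Checkmk tool: extract_agent_command is a hypothetical name, it reads "cmk -D" output on standard input, and, like grep -A1, it only looks at the line directly after "Type of agent:". It prints the command only when exactly one match exists and fails otherwise.

```shell
# Hypothetical helper: read "cmk -D <host>" output on stdin and print the
# command from the "Program:" line that directly follows "Type of agent:".
# Exits non-zero unless exactly one such line is found.
extract_agent_command() {
    awk '
        /^Type of agent:/ { after_header = 1; next }
        after_header && /^[ \t]*Program:/ {
            line = $0
            # Strip only the "Program:" label so colons inside the
            # command (for example in URLs) are preserved.
            sub(/^[ \t]*Program:[ \t]*/, "", line)
            found[++n] = line
        }
        { after_header = 0 }
        END {
            if (n != 1) {
                print "expected exactly one Program line, found " n+0 > "/dev/stderr"
                exit 1
            }
            print found[1]
        }
    '
}
```

A possible invocation is cmk -D k8s | extract_agent_command, which prints the command for inspection instead of executing it immediately.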

The special agent has the below options available for debugging purposes:

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ /omd/sites/mysite/share/check_mk/agents/special/agent_kube -h
...
--debug                     Debug mode: raise Python exceptions
-v / --verbose              Verbose mode (for even more output use -vvv)
--vcrtrace FILENAME         Enables VCR tracing for the API calls
...

Now, you can modify the above command of the Kubernetes special agent like this:

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ /omd/sites/mysite/share/check_mk/agents/special/agent_kube  \
'--cluster' 'at' \
'--token' 'xyz' \
'--monitored-objects' 'deployments' 'daemonsets' 'statefulsets' 'nodes' 'pods' \
'--api-server-endpoint' 'https://<YOUR-IP>:6443' \
'--api-server-proxy' 'FROM_ENVIRONMENT' \
'--cluster-collector-endpoint' 'https://<YOUR-ENDPOINT>:30035' \
'--cluster-collector-proxy' 'FROM_ENVIRONMENT' \
--debug -vvv --vcrtrace ~/tmp/vcrtrace.txt > ~/tmp/k8s_with_debug.txt 2>&1

Here, you can also reduce the number of '--monitored-objects' to a few resources to get less output.

Run the special agent with no debug options to create an agent output, or you could download it from the cluster host via the Checkmk web interface.

Code Block

language	bash
theme	RDark

/omd/sites/mysite/share/check_mk/agents/special/agent_kube '--cluster' 'at' '--token' 'xyz' '--monitored-objects' 'deployments' 'daemonsets' 'statefulsets' 'nodes' 'pods' '--api-server-endpoint' 'https://<YOUR-IP>:6443' '--api-server-proxy' 'FROM_ENVIRONMENT' '--cluster-collector-endpoint' 'https://<YOUR-ENDPOINT>:30035' '--cluster-collector-proxy' 'FROM_ENVIRONMENT' > ~/tmp/k8s_agent_output.txt 2>&1

Please upload the following files to the support ticket.

...

Context: The Kubernetes special agent is slightly unconventional relative to other special agents, as it handles up to three different data sources (the API, the cluster collector container metrics, and the cluster collector node metrics).
- The connection to the Kubernetes API server is mandatory, while the connections to the other two are optional (and determined by the configured datasource rule).
  - Failure to connect to the Kubernetes API server is shown by the Checkmk service (as usual) → the agent crashes.
  - Failure to connect to the cluster collector is highlighted in the Cluster Collector service → the error is not raised by the agent in production.
    - The error is only raised when executing the agent with the --debug flag.
Version: We only support the latest three Kubernetes versions. See Kubernetes Release History: https://kubernetes.io/releases/#:~:text=The%20Kubernetes%20project%20maintains%20release,9%20months%20of%20patch%20support.
- If a customer has the latest release and the release itself is quite new (less than one month old), ask one of the devs if we already have support.
Kubernetes API connection error: If the agent fails to connect to the Kubernetes API (e.g., 401 Unauthorized to query api/v1/core/pods), the output produced with the --debug flag should be sufficient.
- Common causes:
  - The service account was not configured correctly in the Kubernetes cluster.
  - The wrong token is configured.
  - The ca.crt was not uploaded under Global settings >> Trusted certificate authorities for SSL, but --verify-cert-api is enabled.
  - Wrong IP or port.
  - The proxy is not configured in the datasource rule.
Checkmk Cluster Collector connection error:
- Common causes:
  - The cluster collector is not exposed via either NodePort or Ingress.
  - Essential resources such as pods, deployments, daemon sets, or replicas are not running or are restarting frequently.
  - A firewall or a security group blocks the cluster collector IP.
  - Wrong IP or port.
  - The ca.crt was not uploaded under Global settings >> Trusted certificate authorities for SSL, but --verify-cert-api is enabled.
  - The proxy is not configured in the datasource rule.
API processing error: If the agent reports a bug similar to "value ... was not set", the user should be asked for the vcrtrace file.
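
For the Kubernetes API connection errors described above, a manual request against the API server can help narrow down the cause. The sketch below is an illustration only: explain_api_status is a hypothetical helper, and its mapping from HTTP status to likely cause is an assumption based on standard HTTP semantics and the causes listed above, not logic taken from the special agent.

```shell
# Hypothetical helper: translate the HTTP status of a manual Kubernetes
# API request into the most likely cause from the list of common causes.
explain_api_status() {
    case "$1" in
        200) echo "OK: endpoint, certificate, and token are accepted" ;;
        401) echo "Unauthorized: the token is wrong" ;;
        403) echo "Forbidden: the service account lacks the required permissions" ;;
        000) echo "No HTTP response: check IP, port, proxy, firewall, and CA certificate" ;;
        *)   echo "Unexpected status $1: inspect the --debug output" ;;
    esac
}
```

One way to obtain the status code, assuming curl is available on the Checkmk server, is curl -s -o /dev/null -w '%{http_code}' --cacert ca.crt -H "Authorization: Bearer <TOKEN>" https://<YOUR-IP>:6443/api/v1/nodes. curl prints 000 when it receives no HTTP response at all, for example on a TLS or connection failure.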

...

Example with Special Agent of storeonce4x

Find out the detailed special agent command (Type of agent column)

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ cmk -D hostname

Note

An easier way would be this command: /bin/sh -c "$(cmk -D hostname | grep -A1 "^Type of agent:" | grep "Program:" | cut -f2- -d':')"

Please note that if a line matching "^Type of agent:" followed by a line matching "^ Program:" exists more than once, then the output might be messed up.

Check if there are some options for debugging

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ ~/share/check_mk/agents/special/agent_storeonce4x -h

There are three options for debugging the request:

Code Block

language	bash
theme	RDark

--debug, -d           Enable debug mode (keep some exceptions unhandled)
--verbose, -v
--vcrtrace TRACEFILE, --tracefile TRACEFILE
                            If this flag is set to a TRACEFILE that does not exist yet, it will be created and
                            all requests the program sends and their corresponding answers will be recorded in said file.
                            If the file already exists, no requests are sent to the server, but the responses will be
                            replayed from the tracefile.

Modify the special agent command by adding these three options

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ ~/share/check_mk/agents/special/agent_storeonce4x <OTHER ARGUMENTS> --debug -v --vcrtrace ~/tmp/vcrtrace.txt > ~/tmp/storeonce4x_with_debug.txt 2>&1
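
Because --vcrtrace replays from a trace file that already exists instead of recording new requests, it can help to pick a file name that is guaranteed to be unused. The following is a minimal sketch: fresh_trace_path is a hypothetical helper, and the naming scheme (vcrtrace.txt, vcrtrace.1.txt, ...) is an assumption chosen for illustration.

```shell
# Hypothetical helper: print a path inside the given directory that does
# not exist yet, so that --vcrtrace records instead of replaying.
fresh_trace_path() {
    dir=$1
    base=${2:-vcrtrace}
    candidate="$dir/$base.txt"
    n=1
    while [ -e "$candidate" ]; do
        candidate="$dir/$base.$n.txt"
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}
```

It could be used as --vcrtrace "$(fresh_trace_path ~/tmp)" in the command above.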

Run the special agent with no debug options to create an agent output. With this file, we can reproduce your issue.

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ ~/share/check_mk/agents/special/agent_storeonce4x <OTHER ARGUMENTS> > ~/tmp/storeonce4x_agent_output.txt

Rename the token file

The storeonce4x special agent uses a username and password for authentication. After a successful login, it obtains an access token, which is used for subsequent REST requests.

If you want to read more, see: https://hewlettpackard.github.io/storeonce-rest/#Authentication

The token file is saved inside the site at:

Code Block

language	bash
theme	RDark

~/tmp/check_mk/special_agents/agent_storeonce4x/<hostname>_oAuthToken.json

Rename the file to <hostname>_oAuthToken.json.back

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ mv ~/tmp/check_mk/special_agents/agent_storeonce4x/<hostname>_oAuthToken.json ~/tmp/check_mk/special_agents/agent_storeonce4x/<hostname>_oAuthToken.json.back
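
The rename can be wrapped in a small guard so that an earlier backup is not overwritten by accident. This is only a sketch: backup_token_file is a hypothetical helper name, and refusing to overwrite an existing backup is a design choice made for this example.

```shell
# Hypothetical helper: move a token file aside by appending ".back".
# Fails if the token file is missing or a backup already exists.
backup_token_file() {
    token_file=$1
    if [ ! -f "$token_file" ]; then
        printf 'no token file at %s\n' "$token_file" >&2
        return 1
    fi
    if [ -e "$token_file.back" ]; then
        printf 'backup already exists: %s.back\n' "$token_file" >&2
        return 1
    fi
    mv -- "$token_file" "$token_file.back"
}
```

The helper takes the full path of the token file shown above as its only argument.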

Run the special agent again

...

Info

Although containers and their management with Kubernetes took the IT industry by storm, virtualization still has its "right to exist" in on-prem environments and wherever containerization does not fit.

This is an extension to Monitoring VMware ESXi: https://docs.checkmk.com/latest/en/monitoring_vmware.html

Getting Started

Background information regarding this subject is available in our official documentation.

...

One of them, "ESX Snapshots," allows you to monitor all given snapshots of the VM and alert you if they get too old. This is very useful to remind POs to delete their manually created snapshots in a timely fashion.

Basic debugging

Example with Special Agent of vSphere

Find out the detailed special agent command

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ cmk -D <vcenter-host> | more

vcenter 
Addresses: x.x.x.x
Tags: [add_ip_addresses:add_ip_addresses_1], [address_family:ip-v4-only], [agent:special-agents], [criticality:prod], 
[ip-v4:ip-v4], [networking:lan], [piggyback:auto-piggyback], [site:nagnis_master], [snmp_ds:no-snmp], [tcp:tcp]
Labels: [cmk/vsphere_object:vm]
Host groups: check_mk
Contact groups: all
Agent mode: No Checkmk agent, all configured special agents
Type of agent: 
Program: /omd/sites/mysite/share/check_mk/agents/special/agent_vsphere -u 'user' -s 'password' -i hostsystem,virtualmachine,datastore,counters,licenses -P --spaces cut --snapshot_display vCenter --no-cert-check 'x.x.x.x'
Process piggyback data from /omd/sites/mysite/tmp/check_mk/piggyback/vcenter
Services:
checktype item params

Note

An easier way would be this command: /bin/sh -c "$(cmk -D vcenter | grep -A1 "^Type of agent:" | grep "^ Program:" | cut -f2- -d':')"

Please note that if a line matching "^Type of agent:" followed by a line matching "^ Program:" exists more than once, the output might be messed up.

Check if there are options for debugging.

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ /omd/sites/mysite/share/check_mk/agents/special/agent_vsphere -h

There are two options for debugging the request.

Code Block

language	bash
theme	RDark

--debug                       Debug mode: let Python exceptions come through

--tracefile FILENAME          Log all outgoing and incoming data into the given tracefile

Modify the special agent command by adding these two options

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ /omd/sites/mysite/share/check_mk/agents/special/agent_vsphere -u 'user' -s 'password' --debug --tracefile $OMD_ROOT/tmp/vcenter.out -i hostsystem,virtualmachine,datastore,counters,licenses -P --spaces cut --no-cert-check 'x.x.x.x' > $OMD_ROOT/tmp/vcenter.debug 2>&1

In Checkmk 1.6.0, you might find the option "--snapshot_display vCenter" in your cmk -D output. If that's the case, include this parameter as well.

Run the special agent with no debug options to create an agent output. With this file, we can reproduce your issue.

Code Block

language	bash
theme	RDark

OMD[mysite]:~$ /omd/sites/mysite/share/check_mk/agents/special/agent_vsphere -u 'user' -s 'password' -i hostsystem,virtualmachine,datastore,counters,licenses -P --spaces cut --no-cert-check 'x.x.x.x' > ~/tmp/agent.output

Please send us all three files so that we can investigate further:
~/tmp/vcenter.debug      # Debug Output
~/tmp/vcenter.out        # Tracefile
~/tmp/agent.output       # Agent Output
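
The three files can be packed into a single archive before attaching them to the ticket. This is a sketch only: bundle_for_support is a hypothetical helper name, and using a gzip-compressed tar archive is an assumption, as no particular upload format is prescribed here.

```shell
# Hypothetical helper: pack the given files into one gzip-compressed tar
# archive, refusing to continue if any file is missing or empty.
bundle_for_support() {
    archive=$1
    shift
    for file in "$@"; do
        if [ ! -s "$file" ]; then
            printf 'missing or empty: %s\n' "$file" >&2
            return 1
        fi
    done
    tar -czf "$archive" "$@"
}
```

For example, running bundle_for_support vsphere_debug.tar.gz vcenter.debug vcenter.out agent.output from within ~/tmp would create one archive containing all three files.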

Advanced Debugging Examples

Collect several agent outputs over a period of time:

Code Block

language	bash
theme	RDark

export t=60; export s=0; while [ $s -le 600 ]; do echo $s; cmk -d $VSPHERE_HOST > /tmp/agent_vsphere_output.$s; let s=$s+$t; sleep $t; done

Collect several trace files over a period of time:

Code Block

language	bash
theme	RDark

export t=60; export s=0; while [ $s -le 600 ]; do echo $s; ./agent_vsphere --tracefile /tmp/agent_vsphere_trace.$s $OTHER_COMMAND_PARAMS; let s=$s+$t; sleep $t; done
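
Both loops above follow the same pattern, which can be captured in one reusable function. The following is a sketch under assumptions: collect_periodically is a hypothetical helper, and, like the one-liners above, it counts nominal seconds and does not account for the runtime of the command itself.

```shell
# Hypothetical helper: run a command every INTERVAL seconds until TOTAL
# seconds have elapsed, writing each run's output (stdout and stderr)
# to PREFIX.<elapsed seconds>.
collect_periodically() {
    interval=$1
    total=$2
    prefix=$3
    shift 3
    if [ "$interval" -le 0 ]; then
        echo "interval must be a positive number of seconds" >&2
        return 1
    fi
    elapsed=0
    while [ "$elapsed" -le "$total" ]; do
        "$@" > "$prefix.$elapsed" 2>&1
        elapsed=$((elapsed + interval))
        # Skip the final sleep once the last sample has been taken.
        if [ "$elapsed" -le "$total" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

With this helper, the first loop could be written as collect_periodically 60 600 /tmp/agent_vsphere_output cmk -d "$VSPHERE_HOST".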

Version	Old Version 2	New Version Current
Changes made by	Matthew Hierholzer	Matthew Hierholzer
Saved on	May 23, 2024	Jan 08, 2025
