Info
This step-by-step guide explains how to troubleshoot high CPU usage of the Checkmk Micro Core (CMC).

LAST TESTED ON CHECKMK 2.3.0P1



Context

A process monitor such as htop shows the CMC process using 100% of one CPU core. The command line of the process should look similar to the one below.

Code Block
languagebash
themeRDark
/omd/sites/mysite/bin/cmc /omd/sites/mysite/var/check_mk/core/config.pb
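
On a server that hosts several sites, it can help to map a busy cmc process back to its site. The following is a minimal sketch that assumes the binary path follows the /omd/sites/<site>/bin/cmc layout shown above. The function name site_from_cmdline is hypothetical and only used for illustration.

```bash
#!/usr/bin/env bash
# Sketch: derive the site name from a cmc command line.
# Assumption: the binary path has the form /omd/sites/<site>/bin/cmc.

site_from_cmdline() {
    local cmdline="$1"
    local binary="${cmdline%% *}"            # first word: path of the binary
    case "$binary" in
        /omd/sites/*/bin/cmc)
            binary="${binary#/omd/sites/}"   # drop the leading directories
            echo "${binary%%/*}"             # keep only the site name
            ;;
        *)
            return 1                         # not a cmc command line
            ;;
    esac
}

# Usage (on the monitoring server): list every cmc process with its site.
# ps -C cmc -o pid=,args= | while read -r pid args; do
#     echo "$pid $(site_from_cmdline "$args")"
# done
```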

Step-by-step guide

  1. Verify that the CMC is consuming 100% of one or more CPU cores

    1. Install a process monitor like htop
    2. Run the process monitor as a site user
    3. Filter for the CMC process: in htop, press F4 and enter the string cmc as the filter

Screenshot of htop with all four CPUs at 100 percent utilization.
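
If no interactive process monitor is available, the CPU usage can also be estimated from /proc. The sketch below samples the accumulated CPU time of a process twice and converts the difference into a percentage of one core. The function names are hypothetical; the PID file path in the usage comment is the one used for strace later in this guide.

```bash
#!/usr/bin/env bash
# Sketch: measure how much CPU one process used during a sampling interval.

# Print utime + stime (in clock ticks) from one line of /proc/<pid>/stat.
cpu_ticks() {
    local stat="$1"
    # The command name in parentheses may contain spaces, so cut everything
    # up to the last ") " before splitting the remaining fields.
    local rest="${stat##*) }"
    local fields
    read -r -a fields <<< "$rest"
    # Fields now start at "state"; utime and stime are at index 11 and 12.
    echo $(( fields[11] + fields[12] ))
}

# Convert a tick difference into percent of one core.
cpu_percent() {
    local before="$1" after="$2" seconds="$3"
    local hz="${4:-$(getconf CLK_TCK)}"
    echo $(( (after - before) * 100 / (hz * seconds) ))
}

# Sample a PID twice and print its CPU usage over the interval.
sample_pid() {
    local pid="$1" seconds="${2:-5}" before after
    [ -r "/proc/$pid/stat" ] || return 1
    before=$(cpu_ticks "$(cat "/proc/$pid/stat")")
    sleep "$seconds"
    [ -r "/proc/$pid/stat" ] || return 1
    after=$(cpu_ticks "$(cat "/proc/$pid/stat")")
    cpu_percent "$before" "$after" "$seconds"
}

# Usage as the site user:
# sample_pid "$(cat ~/tmp/run/cmc.pid)" 5
```

A value close to 100 corresponds to one fully used core. Because the ticks of all threads are summed, a process that keeps several cores busy reports more than 100.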


Debugging

  1. Go to 'Master Control' within your sidebar.
  2. Disable both Host Checks and Service Checks and restart the CMC.
    Screenshot of the right-hand side Master control with service and host checks set to off.

    Code Block
    languagebash
    themeRDark
    OMD[mysite]:~$ omd restart cmc


  3. Re-enable Host Checks and wait for at least 5 minutes
    Screenshot of the right-hand side Master control with service checks set to off.

    If the behavior reoccurs, disable Host Checks and restart CMC.

  4. Re-enable Service Checks and wait for at least 5 minutes
    Screenshot of the right-hand side Master control with host checks set to off.

    If the behavior reoccurs, disable Service Checks and restart CMC.

  5. Re-enable both Host Checks and Service Checks
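
The Master Control switches used in the steps above can, in principle, also be set from the command line through Livestatus. The sketch below only builds the command strings. It assumes that the core accepts the classic external command names START/STOP_EXECUTING_HOST_CHECKS and START/STOP_EXECUTING_SVC_CHECKS and that the lq tool is available to the site user; the function name check_switch_command is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: build Livestatus external commands for the Master Control switches.
# Assumption: the core accepts the classic external command names used below.

check_switch_command() {
    local kind="$1" state="$2" now="${3:-$(date +%s)}"
    local target verb
    case "$kind" in
        host)    target="HOST" ;;
        service) target="SVC" ;;
        *)       return 1 ;;
    esac
    case "$state" in
        on)  verb="START" ;;
        off) verb="STOP" ;;
        *)   return 1 ;;
    esac
    echo "COMMAND [$now] ${verb}_EXECUTING_${target}_CHECKS"
}

# Usage as the site user (assumption: lq sends its argument to Livestatus):
# lq "$(check_switch_command host off)"
# lq "$(check_switch_command service off)"
# omd restart cmc
# printf 'GET status\nColumns: execute_host_checks execute_service_checks\n' | lq
```

The last usage line queries the status table to confirm the current state of both switches before waiting the 5 minutes.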

Now, we need to understand which hosts might be causing this behavior.

  1. Start with the top-level folder of the affected site in Setup > Hosts and set the "Criticality" of the folder to "Do not monitor this host".
    Screenshot of a host folder properties. Criticality is enabled and set to Do not monitor this host.


    The subfolders will inherit this property.
    Screenshot of a host folder properties. Criticality is enabled and set to Do not monitor this host.


  2. Activate changes and run omd restart on that site as the site user.

    Code Block
    languagebash
    themeRDark
    OMD[mysite]:~$ omd restart


  3. Now enable one of the subfolders and activate changes.
    Screenshot of a host folder properties. Criticality is enabled and set to Productive System.


  4. Run omd restart again and wait at least 5 minutes before checking htop

    Code Block
    languagebash
    themeRDark
    OMD[mysite]:~$ omd restart




  5. If the CPU usage does not go back to 100%, repeat steps 3 and 4 with the next subfolder until it does. Make sure to wait at least 5 minutes after each omd restart. Once the CPU usage is back at 100%, the subfolder you enabled last contains the culprit.

  6. Now, we can move forward to see what is causing the issue. What kind of host is it? Agent, SNMP, or Special Agent?

    • If it is an agent-based host:
      • Any local plugins?
      • Any special configuration?

  7. Run strace as root to trace the system calls of the cmc process while the issue occurs.

    Code Block
    languagebash
    themeRDark
    root@mylinuxhost:~# strace -o cmc-strace.log -p $(cat ~mysite/tmp/run/cmc.pid)


    Tip
    Further information can be found here: Debugging the Checkmk Micro Core (CMC) old#strace


  8. With gdb, you can analyze a core dump if Checkmk creates one. Note: Checkmk only creates a core dump if you enable this in the global settings.

    Code Block
    languagebash
    themeRDark
    gdb /omd/sites/mysite/bin/cmc --core=/home/mylinuxuser/Downloads/core.python3.989.4b7ee3adffd14e31a0188aac0c215161.804036.1640164046000000
    GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
    Copyright (C) 2020 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
        <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from /omd/sites/mysite/bin/cmc...
    warning: core file may not match specified executable file.
    [New LWP 804036]
    Core was generated by `python3 /omd/sites/mysite/bin/cmk --discover-marked-hosts'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  0x00007f2b661be1fd in ?? ()
    (gdb) where
    #0  0x00007f2b661be1fd in ?? ()
    #1  0x00007ffed8a75060 in ?? ()
    #2  0x0000000000000000 in ?? ()
     
     
    # Run it (if it's still crashing, you'll see it crash)
    r
    # View the backtrace (call stack)
    bt 
    # Quit when done
    q
    # Memory mappings
    i proc m
     
    # Listing all threads. This is really useful!
    thread apply all bt


    Tip
    Further information can be found here: Debugging the Checkmk Micro Core (CMC) old#gdb


  9. If your investigation is not successful, please open a ticket and send us the following data to help us reproduce the issue:

    Code Block
    languagebash
    themeRDark
    # Log in as the site user
    su - $MYSITE
    # Create an archive of the core files and the log files
    tar czf ~/corefiles.tgz ~/var/check_mk/core/ ~/var/log/
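
The strace log captured earlier in this guide (cmc-strace.log) can become large. A short summary of the most frequent system calls often shows where the process is spinning. The following is a minimal sketch that assumes the log was written with strace -o and contains one call per line; the function name top_syscalls is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: count system calls in an strace log written with "strace -o".
# Assumption: one call per line in the form "name(args) = result". If the
# log was written with "strace -f", each line starts with a PID, which is
# skipped here. Signal and exit lines ("---", "+++") are ignored.

top_syscalls() {
    local log="$1" limit="${2:-10}"
    sed -n -E 's/^([0-9]+ +)?([a-z_][a-z0-9_]*)\(.*/\2/p' "$log" \
        | sort | uniq -c | sort -k1,1nr -k2 \
        | awk '{ print $1, $2 }' \
        | head -n "$limit"
}

# Usage:
# top_syscalls cmc-strace.log 5
```

Each output line contains the number of calls followed by the system call name, most frequent first. As an alternative, strace itself can print a summary when it is started with the -c option.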

