
A customer cannot clear a persistent downtime on a specific host. The downtime was started without an associated end time.

Troubleshooting


  1. Check the downtimes for both services and hosts.
  2. If nothing relevant is found, run a Livestatus query to retrieve all existing downtimes.

    lq "GET downtimes\nColumns: downtime_author downtime_comment downtime_duration downtime_end_time downtime_entry_time downtime_fixed downtime_id downtime_is_service downtime_origin downtime_recurring downtime_start_time host_has_been_checked host_labels host_name host_scheduled_downtime_depth host_state service_description service_has_been_checked service_state"


  3. If still nothing is found, search the history file in ~/var/check_mk/core for the summary (comment) of the downtime. In this example, the summary to search for is 'DT2'.

    OMD[mysite]~$ grep -rl <DOWNTIMESUMMARY> ~/var/check_mk/core/history 


    OMD[mysite]:~/var/check_mk/core$ grep -r DT2
    history:[1684243614] EXTERNAL COMMAND: SCHEDULE_HOST_DOWNTIME;localhost2;1684243614;1684243734;1;0;0;cmkadmin;DT2
    history:[1684243614] HOST DOWNTIME ALERT: localhost2;STARTED;DT2
    OMD[mysite]:~/var/check_mk/core$


  4. If the history file is large, also search the archived history files in ~/var/check_mk/core/archive. Each entry in these files starts with a Unix timestamp, which helps to narrow down when the downtime was set.

    OMD[mysite]~$ grep -rl <DOWNTIMESUMMARY> ~/var/check_mk/core/archive/*
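
Steps 2 to 4 can be narrowed down with a few small shell helpers. The following is a minimal sketch, not an official Checkmk tool. The function names build_downtime_query, parse_downtime_command and epoch_to_utc are hypothetical, the host restriction assumes the standard Livestatus "Filter:" header, the field order assumes the Nagios-style SCHEDULE_HOST_DOWNTIME command layout, and epoch_to_utc assumes GNU date.

```shell
# Hypothetical helper: build a Livestatus query that lists only the
# downtimes of one host. Column names are taken from the query in step 2.
build_downtime_query() {
    printf 'GET downtimes\n'
    printf 'Columns: downtime_id host_name service_description downtime_start_time downtime_end_time downtime_fixed downtime_comment\n'
    printf 'Filter: host_name = %s\n' "$1"
}

# Hypothetical helper: read history lines on stdin and print the key fields
# of every SCHEDULE_HOST_DOWNTIME command.
# Assumed field order after the command name:
#   host;start;end;fixed;trigger_id;duration;author;comment
parse_downtime_command() {
    awk -F';' '/SCHEDULE_HOST_DOWNTIME/ {
        printf "host=%s start=%s end=%s fixed=%s author=%s comment=%s\n", $2, $3, $4, $5, $8, $9
    }'
}

# Hypothetical helper: convert a Unix timestamp to a readable UTC date.
# Uses the GNU date "-d @SECONDS" syntax.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}
```

On a site, the query could then be run with lq "$(build_downtime_query localhost2)", and the history could be inspected with grep -h DT2 ~/var/check_mk/core/history | parse_downtime_command. Comparing the converted start and end timestamps shows quickly whether the downtime ever had a valid end time.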


Solution

Warning

The following steps are unsupported. Create a backup of the Checkmk site before proceeding.


If the event is found in the history file but nowhere else:

  1. Stop the site.

    OMD[mysite]~$ omd stop


  2. Open a CLI text editor and remove the entry from the history file. 

    OMD[mysite]~$ vi ~/var/check_mk/core/history


  3. Start the site again.

    OMD[mysite]~$ omd start


These steps clear the lingering downtime.
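
For a large history file, the manual edit in step 2 could be replaced by a scripted filter. This is only a sketch under assumptions, and it is just as unsupported as the manual edit: the function name remove_history_entries is hypothetical, and it assumes that every line containing the given string belongs to the stuck downtime and should be removed. It keeps a timestamped copy of the file before changing it.

```shell
# Hypothetical helper: remove every line that contains a fixed string from a
# history file. A copy named <file>.bak.<unix time> is created first.
# Run it only while the site is stopped.
remove_history_entries() {
    rhe_file=$1
    rhe_needle=$2
    rhe_backup="$rhe_file.bak.$(date +%s)"

    # Keep an untouched copy, preserving mode and timestamps.
    cp -p "$rhe_file" "$rhe_backup" || return 1

    # -F matches the string literally, -v keeps only non-matching lines.
    # grep exits with 1 when no line is left over; that is not an error here.
    grep -vF -- "$rhe_needle" "$rhe_backup" > "$rhe_file" || [ $? -eq 1 ]
}
```

Between omd stop and omd start, this could be called as remove_history_entries ~/var/check_mk/core/history ';DT2'. The string ';DT2' would also match a comment such as DT20, so list the affected lines with grep -F first and compare them with the backup afterwards.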
