Getting Started

Background information on this subject is available in our official documentation.

Problem

The customer is unable to clear a persistent downtime on a specific host. The downtime was set without an associated end time.

Screenshot of Host last check was 48 minutes ago

Troubleshooting

  1. Check the downtimes for both services and hosts:
    Screenshot of Display rules for recurring downtimes for services
    Screenshot of Display rules for recurring downtimes for hosts
  2. If that reveals nothing relevant, run a Livestatus query to list all existing downtimes.

    Code Block
    languagebash
    themeRDark
    lq "GET downtimes\nColumns: downtime_author downtime_comment downtime_duration downtime_end_time downtime_entry_time downtime_fixed downtime_id downtime_is_service downtime_origin downtime_recurring downtime_start_time host_has_been_checked host_labels host_name host_scheduled_downtime_depth host_state service_description service_has_been_checked service_state"


  3. If still nothing is found, search the history file in ~/var/check_mk/core for the downtime summary, that is, the comment entered when the downtime was scheduled. In this scenario, the summary to search for is 'DT2'.

    Code Block
    languagebash
    themeRDark
    OMD[mysite]~$ grep -rl <DOWNTIMESUMMARY> ~/var/check_mk/core/history 


    Code Block
    languagebash
    themeRDark
    OMD[mysite]:~/var/check_mk/core$ grep -r DT2
    history:[1684243614] EXTERNAL COMMAND: SCHEDULE_HOST_DOWNTIME;localhost2;1684243614;1684243734;1;0;0;cmkadmin;DT2
    history:[1684243614] HOST DOWNTIME ALERT: localhost2;STARTED;DT2
    OMD[mysite]:~/var/check_mk/core$


  4. If the history file is large, it can also help to search the files in ~/var/check_mk/core/archive. Every entry in these history files starts with a Unix timestamp, which helps to pin down when the downtime was set.

    Code Block
    languagebash
    themeRDark
    OMD[mysite]~$ grep -rl <DOWNTIMESUMMARY> ~/var/check_mk/core/archive/*
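
The history search in steps 3 and 4 can be combined with a timestamp conversion so that the matches are easier to read. The following is only a sketch: the function name `show_downtime_entries` is made up for this example, it assumes GNU `date` (for `-d @<seconds>`), and it matches the summary anywhere after a semicolon, so 'DT2' would also match 'DT20'.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Checkmk.
# Usage: show_downtime_entries <summary> <history-file>...
show_downtime_entries() {
    local summary=$1 line ts
    shift
    # -h: print no file names, -F: treat the summary as a fixed string.
    grep -hF -- ";${summary}" "$@" | while IFS= read -r line; do
        # A history line starts with "[<unix timestamp>] ".
        ts=${line#\[}
        ts=${ts%%\]*}
        # Print the time in UTC, followed by the rest of the line.
        printf '%s %s\n' \
            "$(date -u -d "@${ts}" '+%Y-%m-%d %H:%M:%S')" \
            "${line#*\] }"
    done
}
```

Inside the site it could be called as `show_downtime_entries DT2 ~/var/check_mk/core/history ~/var/check_mk/core/archive/*`. The times are printed in UTC.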

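On a site with many downtimes, the Livestatus query from step 2 can be narrowed to a single host. A minimal sketch, assuming the affected host is called `localhost2` as in the history example; the function name `downtime_query` is invented for illustration, while `Filter` and `ColumnHeaders` are standard Livestatus headers and the columns are a subset of those in step 2.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a Livestatus query for the downtimes of one host.
# Usage: downtime_query <host-name>
downtime_query() {
    local host=$1
    printf '%s\n' \
        'GET downtimes' \
        'Columns: downtime_id host_name service_description downtime_author downtime_comment downtime_start_time downtime_end_time downtime_fixed' \
        "Filter: host_name = ${host}" \
        'ColumnHeaders: on'
}

# Example call inside the site (lq is the same wrapper used in step 2):
#   lq "$(downtime_query localhost2)"
```

The `downtime_id` column in the result identifies the individual downtime, and `downtime_comment` shows the summary to search for in the history file.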

...

  1. Stop the site.

    Code Block
    languagebash
    themeRDark
    OMD[mysite]~$ omd stop


  2. Open the history file in a command-line text editor and remove the entry for the lingering downtime.

    Code Block
    languagebash
    themeRDark
    OMD[mysite]~$ vi ~/var/check_mk/core/history


  3. Start the site again.

    Code Block
    languagebash
    themeRDark
    OMD[mysite]~$ omd start


    These steps clear the lingering downtime.
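
The manual edit in step 2 can also be scripted, which makes it easier to keep a backup. This is a sketch under assumptions, not an official procedure: the function name `remove_downtime_entries` is hypothetical, it assumes that every line ending in `;<summary>` belongs to the lingering downtime, and it must only be run while the site is stopped.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Checkmk. Run only while the site is stopped.
# Usage: remove_downtime_entries <summary> <history-file>
remove_downtime_entries() {
    local summary=$1 file=$2
    local backup="${file}.bak"
    # Keep an untouched copy so the edit can be reverted.
    cp -p -- "$file" "$backup" || return 1
    # Keep every line that does NOT end in ";<summary>".
    # ENVIRON is used so awk does not process escapes in the summary.
    SUFFIX=";${summary}" awk '
        BEGIN { s = ENVIRON["SUFFIX"] }
        substr($0, length($0) - length(s) + 1) != s
    ' "$backup" > "$file" || return 1
    echo "backup written to ${backup}"
}
```

Between `omd stop` and `omd start` it could be called as `remove_downtime_entries DT2 ~/var/check_mk/core/history`. Comparing the file with the `.bak` copy before starting the site confirms that only the intended lines were removed.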
