Monitoring SQL Server backups in Always On Availability groups

Monitoring SQL Server backups in Always On Availability groups

Enables Checkmk to detect SQL Server Always On backups running on secondary replicas using a custom mk-sql configuration.

LAST TESTED ON CHECKMK 2.3.0P34

Table of Contents

Overview

In environments using Microsoft SQL Server Always On Availability Groups (AG), it is common practice to perform backups on a secondary replica to reduce load on the primary.

However, the default Checkmk mk-sql query filters for primary replicas, which can prevent the backup check from detecting these jobs.
This article provides a supported solution that enables Checkmk to properly recognize backups from secondary replicas using a deployable SQL configuration file.


Problem

By default, mk-sql only considers databases where rep.is_primary_replica is True.
This condition excludes backup results executed on secondary replicas, even though such configurations follow Microsoft’s best practices:
What is an Always On availability group?


As a result:

  • Backup checks may report missing or outdated backups on the secondary replica.

  • The monitoring output differs from the real backup state.

 

Solution

To ensure all backup activity is reported, including those on secondary replicas, Checkmk supports the use of a custom SQL file that adjusts the mk-sql query.
This SQL safely removes the restrictive filter and enables proper detection of secondary-side backups, without impacting primary replica behavior.

This solution can be rolled out using the Agent Bakery or manually deployed to the secondary SQL node.

Step-by-step instructions

  1. Create the folder structure

    On your Checkmk site or in your bakery configuration tree, create the following directory:

    local/share/check_mk/agents/custom/ └── backup2ndreplicaquery └── lib └── config └── mssql └── custombackup.sql


    This is what it will look like within Windows Explorer:

    image-20260123-084151.png



  2. Use the following SQL content

    Save the below as custombackup.sql inside the path shown in step 1.

    DECLARE @HADRStatus sql_variant; DECLARE @SQLCommand NVARCHAR(MAX); SET @HADRStatus = (SELECT SERVERPROPERTY ('IsHadrEnabled')); IF (@HADRStatus IS NULL or @HADRStatus <> 1) BEGIN SET @SQLCommand = ' SELECT CONVERT(NVARCHAR, DATEADD(s, MAX(DATEDIFF(s, ''19700101'', backup_finish_date) - (CASE WHEN time_zone IS NOT NULL AND time_zone <> 127 THEN 60 * 15 * time_zone ELSE 0 END)), ''19700101''), 120) AS last_backup_date, CAST(type AS NVARCHAR(128)) AS type, CAST(machine_name AS NVARCHAR(128)) AS machine_name, CAST(''True'' AS NVARCHAR(12))as is_primary_replica, CAST(''1'' AS NVARCHAR(12)) AS is_local, CAST('''' AS NVARCHAR(12)) AS replica_id, CAST(sys.databases.name AS NVARCHAR(MAX)) AS database_name FROM msdb.dbo.backupset LEFT OUTER JOIN sys.databases ON CAST(sys.databases.name AS NVARCHAR(MAX)) = CAST(msdb.dbo.backupset.database_name AS NVARCHAR(MAX)) WHERE UPPER(machine_name) = UPPER(CAST(SERVERPROPERTY(''Machinename'') AS NVARCHAR(MAX))) GROUP BY type, machine_name, CAST(sys.databases.name AS NVARCHAR(MAX)) ' END ELSE BEGIN SET @SQLCommand = ' SELECT CONVERT(NVARCHAR, DATEADD(s, MAX(DATEDIFF(s, ''19700101'', b.backup_finish_date) - (CASE WHEN time_zone IS NOT NULL AND time_zone <> 127 THEN 60 * 15 * time_zone ELSE 0 END)), ''19700101''), 120) AS last_backup_date, CAST(b.type AS NVARCHAR(MAX)) AS type, CAST(b.machine_name AS NVARCHAR(MAX)) AS machine_name, -- ISNULL(CONVERT(NVARCHAR(40), rep.is_primary_replica), '''') AS is_primary_replica, CAST(''1'' AS NVARCHAR(12)) AS is_primary_replica, -- patch rep.is_local, -- ISNULL(CONVERT(NVARCHAR(40), rep.replica_id), '''') AS replica_id, CAST('''' AS NVARCHAR(12)) AS replica_id, -- patch CAST(b.machine_name AS NVARCHAR(MAX)) AS machine_name, CAST(db.name AS NVARCHAR(MAX)) AS database_name FROM msdb.dbo.backupset b LEFT OUTER JOIN sys.databases db ON CAST(b.database_name AS NVARCHAR(MAX)) = CAST(db.name AS NVARCHAR(MAX)) LEFT OUTER JOIN sys.dm_hadr_database_replica_states rep ON db.database_id = rep.database_id WHERE (rep.is_local is null or rep.is_local = 1) -- AND (rep.is_primary_replica is null or rep.is_primary_replica = ''True'') AND UPPER(machine_name) = UPPER(CAST(SERVERPROPERTY(''Machinename'') AS NVARCHAR(120))) GROUP BY type, rep.replica_id, rep.is_primary_replica, rep.is_local, CAST(db.name AS NVARCHAR(MAX)), CAST(b.machine_name AS NVARCHAR(MAX)), rep.synchronization_state, rep.synchronization_health ' END EXEC (@SQLCommand)

     

  3. Deploy via the Agent Bakery

    1. Navigate to Setup → Agents → Windows, Linux, Solaris → Agent rules → Deploy custom files.

    2. Create a rule called MSSQL backup on secondary replica.

    3. Upload the custombackup.sql file.
      Windows destination path:
      C:\ProgramData\checkmk\agent\config\mssql\custombackup.sql

    4. Restrict the rule to the secondary AG node.

    5. Bake and deploy the agent.


  4. Update mk-sql.yml configuration
    To ensure the custom SQL file is loaded correctly, update the mk-sql.yml configuration and explicitly set custombackup to null:

    - custombackup: null


    mk.sql.yml example

    --- mssql: main: authentication: username: '' type: integrated connection: hostname: localhost port: 1433 sections: - instance: null - databases: null - counters: null - blocked_sessions: null - transactionlogs: null - clusters: null - mirroring: null - availability_groups: null - connections: null - tablespaces: is_async: true - datafiles: is_async: true - backup: is_async: true - jobs: is_async: true - custombackup: null

    This setting is required even if no value is assigned. Omitting it can prevent the custom backup query from being processed as expected.

    Warning! This entry must match the name of your custombackup.sql file. If the file has a different name, for example abc.sql, then this entry must be abc: null.

    e.g.,

    - abc: null



  5. Validate

    Run the following test on the SQL host:

    PS C:\ProgramData\checkmk\agent\config> ..\plugins\mk-sql.exe -c .\mk-sql.yml

    You should now see the expected backup job data from the secondary replica.

    After confirming, perform a service discovery in Checkmk to verify that MSSQL backup services now report correctly.

 

Notes

  • This solution affects only the configured secondary node.

  • Do not deploy the modified SQL globally, as it intentionally changes replica filtering.

  • Attach agent outputs and SQL versions when opening support tickets for traceability.

 

Related articles