How-to monitor SQL AG backups on secondary replicas with mk-sql
Enables Checkmk to detect SQL Server Always On backups running on secondary replicas using a custom mk-sql configuration.
LAST TESTED ON CHECKMK 2.3.0P34
Overview
In environments using Microsoft SQL Server Always On Availability Groups (AG), it is common practice to perform backups on a secondary replica to reduce load on the primary.
However, the default Checkmk mk-sql query filters for primary replicas, which can prevent the backup check from detecting these jobs.
This article provides a supported solution that enables Checkmk to properly recognize backups from secondary replicas using a deployable SQL configuration file.
Problem
By default, mk-sql only considers databases where rep.is_primary_replica is True.
This condition excludes backup results executed on secondary replicas, even though such configurations follow Microsoft’s best practices:
What is an Always On availability group?
As a result:
Backup checks may report missing or outdated backups on the secondary replica.
The monitoring output differs from the real backup state.
Solution
To ensure all backup activity is reported, including those on secondary replicas, Checkmk supports the use of a custom SQL file that adjusts the mk-sql query.
This SQL safely removes the restrictive filter and enables proper detection of secondary-side backups, without impacting primary replica behavior.
This solution can be rolled out using the Agent Bakery or manually deployed to the secondary SQL node.
Step-by-step instructions
Create the folder structure
On your Checkmk site or in your bakery configuration tree, create the following directory:local/share/check_mk/agents/custom/ └── backup2ndreplicaquery └── lib └── config └── mssql └── backup.sql
This is what it will look like within Windows Explorer:Use the following SQL content
Save the below as
backup.sqlinside the path shown in step 1.DECLARE @HADRStatus sql_variant; DECLARE @SQLCommand NVARCHAR(MAX); SET @HADRStatus = (SELECT SERVERPROPERTY ('IsHadrEnabled')); IF (@HADRStatus IS NULL or @HADRStatus <> 1) BEGIN SET @SQLCommand = ' SELECT CONVERT(NVARCHAR, DATEADD(s, MAX(DATEDIFF(s, ''19700101'', backup_finish_date) - (CASE WHEN time_zone IS NOT NULL AND time_zone <> 127 THEN 60 * 15 * time_zone ELSE 0 END)), ''19700101''), 120) AS last_backup_date, CAST(type AS NVARCHAR(128)) AS type, CAST(machine_name AS NVARCHAR(128)) AS machine_name, CAST(''True'' AS NVARCHAR(12))as is_primary_replica, CAST(''1'' AS NVARCHAR(12)) AS is_local, CAST('''' AS NVARCHAR(12)) AS replica_id, CAST(sys.databases.name AS NVARCHAR(MAX)) AS database_name FROM msdb.dbo.backupset LEFT OUTER JOIN sys.databases ON CAST(sys.databases.name AS NVARCHAR(MAX)) = CAST(msdb.dbo.backupset.database_name AS NVARCHAR(MAX)) WHERE UPPER(machine_name) = UPPER(CAST(SERVERPROPERTY(''Machinename'') AS NVARCHAR(MAX))) GROUP BY type, machine_name, CAST(sys.databases.name AS NVARCHAR(MAX)) ' END ELSE BEGIN SET @SQLCommand = ' SELECT CONVERT(NVARCHAR, DATEADD(s, MAX(DATEDIFF(s, ''19700101'', b.backup_finish_date) - (CASE WHEN time_zone IS NOT NULL AND time_zone <> 127 THEN 60 * 15 * time_zone ELSE 0 END)), ''19700101''), 120) AS last_backup_date, CAST(b.type AS NVARCHAR(MAX)) AS type, CAST(b.machine_name AS NVARCHAR(MAX)) AS machine_name, -- ISNULL(CONVERT(NVARCHAR(40), rep.is_primary_replica), '''') AS is_primary_replica, CAST(''1'' AS NVARCHAR(12)) AS is_primary_replica, -- patch rep.is_local, -- ISNULL(CONVERT(NVARCHAR(40), rep.replica_id), '''') AS replica_id, CAST('''' AS NVARCHAR(12)) AS replica_id, -- patch CAST(b.machine_name AS NVARCHAR(MAX)) AS machine_name, CAST(db.name AS NVARCHAR(MAX)) AS database_name FROM msdb.dbo.backupset b LEFT OUTER JOIN sys.databases db ON CAST(b.database_name AS NVARCHAR(MAX)) = CAST(db.name AS NVARCHAR(MAX)) LEFT OUTER JOIN sys.dm_hadr_database_replica_states rep ON db.database_id = rep.database_id WHERE (rep.is_local is null or rep.is_local = 1) -- AND (rep.is_primary_replica is null or rep.is_primary_replica = ''True'') AND UPPER(machine_name) = UPPER(CAST(SERVERPROPERTY(''Machinename'') AS NVARCHAR(120))) GROUP BY type, rep.replica_id, rep.is_primary_replica, rep.is_local, CAST(db.name AS NVARCHAR(MAX)), CAST(b.machine_name AS NVARCHAR(MAX)), rep.synchronization_state, rep.synchronization_health ' END EXEC (@SQLCommand)Deploy via the Agent Bakery
Navigate to Setup → Agents → Windows, Linux, Solaris → Agent rules → Deploy custom files.
Create a rule called
MSSQL backup on secondary replica.Upload the
backup.sqlfile.
Windows destination path:C:\ProgramData\checkmk\agent\config\mssql\backup.sqlRestrict the rule to the secondary AG node.
Bake and deploy the agent.
Validate
Run the following test on the SQL host:
PS C:\ProgramData\checkmk\agent\config> ..\plugins\mk-sql.exe -c .\mk-sql.ymlYou should now see the expected backup job data from the secondary replica.
After confirming, perform a service discovery in Checkmk to verify that MSSQL backup services now report correctly.
Notes
This solution affects only the configured secondary node.
Do not deploy the modified SQL globally, as it intentionally changes replica filtering.
Attach agent outputs and SQL versions when opening support tickets for traceability.
Related articles