Why Checkmk uses memory_working_set for K8s container monitoring
This article explains why Checkmk uses working set memory and why its values may differ from those shown in GKE dashboards.
APPLICABLE TO ALL CHECKMK VERSIONS
Overview
Checkmk's Kubernetes monitoring alerts on containers' non-evictable memory (container_memory_working_set_bytes). This metric excludes evictable memory (e.g., reclaimable cache), which makes it the primary indicator of memory pressure and OOM risk.
In Google Kubernetes Engine (GKE), many dashboards include both evictable and non-evictable memory by default, so they can show higher values than Checkmk.
Background: Usage vs. Cache vs. Working Set
Kubernetes collects container memory data through cAdvisor, which reports three main metrics:
container_memory_usage_bytes
The total memory used by the container, including filesystem cache.
container_memory_cache
The amount of memory used for page cache (both active and inactive).
container_memory_working_set_bytes
The working set, meaning the non-evictable memory your app actually requires. It is calculated as usage minus inactive file pages.
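The calculation described above can be sketched in a few lines of Python. This is an illustration only, not Checkmk's or cAdvisor's code: the function name and the sample values are hypothetical, and flooring the result at zero is an assumption about how negative results should be handled.

```python
MIB = 1024 * 1024


def working_set_bytes(usage_bytes: int, inactive_file_bytes: int) -> int:
    """Return usage minus inactive file pages, never below zero."""
    # Assumption: a negative difference is clamped to zero.
    return max(usage_bytes - inactive_file_bytes, 0)


# Hypothetical sample: 800 MiB total usage, 300 MiB of it inactive file cache.
sample_working_set = working_set_bytes(800 * MIB, 300 * MIB)
print(sample_working_set // MIB, "MiB working set")
```

In this sample, the 300 MiB of inactive file cache can be reclaimed by the kernel, so only the remaining 500 MiB counts toward the working set.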
Why Working Set Is the Right Metric
Checkmk relies on working set because:
It tracks the memory that matters for stability.
It represents the non-evictable footprint of your app, the part that grows as your app truly needs more RAM.
It aligns with Kubernetes’ own logic.
In the kubelet’s Summary API, availableBytes is defined as limit - workingSetBytes. This is the basis for how Kubernetes determines available memory.
It’s widely used across tools.
For example, OpenShift’s oc adm top reports pod memory using the working set metric.
It avoids noise.
Alerting on container_memory_usage_bytes leads to false alarms because evictable cache can inflate usage. Alerting on working set better reflects real OOM risk and supports right-sizing.
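To make the availableBytes definition and the false-alarm argument concrete, here is a minimal Python sketch. The helper names, the 90% threshold, and all sample values are hypothetical and chosen only for illustration; they are not Checkmk defaults or kubelet code.

```python
from typing import Optional

MIB = 1024 * 1024


def available_bytes(limit_bytes: Optional[int], working_set: int) -> Optional[int]:
    """Memory limit minus working set; None when no limit is set."""
    if limit_bytes is None:
        return None
    return limit_bytes - working_set


def percent_of_limit(value_bytes: int, limit_bytes: int) -> float:
    """Express a memory value as a percentage of the container limit."""
    return 100.0 * value_bytes / limit_bytes


# Hypothetical container: 1 GiB limit, 950 MiB usage, 400 MiB inactive file cache.
limit = 1024 * MIB
usage = 950 * MIB
working_set = usage - 400 * MIB  # 550 MiB

THRESHOLD_PERCENT = 90.0  # hypothetical alert level

usage_alert = percent_of_limit(usage, limit) >= THRESHOLD_PERCENT
working_set_alert = percent_of_limit(working_set, limit) >= THRESHOLD_PERCENT

print("alert on usage:", usage_alert)              # True
print("alert on working set:", working_set_alert)  # False
print("available MiB:", available_bytes(limit, working_set) // MIB)
```

With these numbers, an alert on total usage fires even though 474 MiB remain available once reclaimable cache is discounted, while an alert on the working set stays quiet.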
Why Checkmk and GKE Show Different Numbers
Metric labeling in GKE:
GKE adds a memory_type label (evictable vs. non-evictable). If a GKE chart sums both, it will show higher values than Checkmk (which uses non-evictable only).
Dashboard differences:
Some GKE views define “Used” as non-evictable (close to Checkmk).
Others show totals including evictable cache unless the query is adjusted.
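The effect of the memory_type label can be illustrated with a small Python sketch. It does not call any Google Cloud API: the function name and the sample values are hypothetical, and the data is assumed to be one sample per memory_type value for a single container.

```python
from typing import Mapping

MIB = 1024 * 1024


def charted_used_bytes(by_memory_type: Mapping[str, int], non_evictable_only: bool) -> int:
    """Return the value a chart would show for one container sample."""
    if non_evictable_only:
        # Comparable to the working set that Checkmk reports.
        return by_memory_type.get("non-evictable", 0)
    # Summing all memory_type values also counts reclaimable cache.
    return sum(by_memory_type.values())


# Hypothetical sample, keyed by memory_type label value.
sample = {"evictable": 400 * MIB, "non-evictable": 550 * MIB}

summed = charted_used_bytes(sample, non_evictable_only=False)
comparable = charted_used_bytes(sample, non_evictable_only=True)

print("summed:", summed // MIB, "MiB")
print("non-evictable only:", comparable // MIB, "MiB")
```

Only the filtered value is a fair counterpart to the number Checkmk displays; the summed value is higher by exactly the evictable portion.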
Bottom Line
Checkmk uses working set memory (container_memory_working_set_bytes) because it excludes reclaimable cache and therefore most closely tracks memory pressure and OOM risk.
If a GKE dashboard shows higher values, check that it charts non-evictable memory only before comparing it with Checkmk.