PerDiskStats reported from cgroups were not being surfaced into
Prometheus. In order to properly correlate the metrics, we need to
assign a device label to each metric (the filesystem or device path).
Since blkio cgroup tracks devices, we create a synthetic device
`/dev/NAME` for the metric.
Assign a Device label to each PerDiskStat for the handlers up front, and
then surface the PerDiskStat values into the Prometheus metrics. Report
two new metrics: total bytes read and total bytes written.
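
As an illustration of the labeling step, here is a minimal sketch in Go. The `perDiskStat` type, the `devNum` key, and the `major:minor` fallback for unknown devices are assumptions made for this example and are not taken from the actual change.

```go
package main

import "fmt"

// perDiskStat is a simplified, hypothetical stand-in for one blkio entry.
type perDiskStat struct {
	Device string // label to attach to the exported metric
	Major  uint64
	Minor  uint64
}

// devNum identifies a block device by its major and minor numbers.
type devNum struct {
	major uint64
	minor uint64
}

// assignDeviceLabels sets Device on every entry up front. Known devices get
// the synthetic path "/dev/NAME"; unknown ones fall back to "major:minor"
// (the fallback is an assumption of this sketch).
func assignDeviceLabels(stats []perDiskStat, names map[devNum]string) {
	for i := range stats {
		key := devNum{major: stats[i].Major, minor: stats[i].Minor}
		if name, ok := names[key]; ok {
			stats[i].Device = "/dev/" + name
			continue
		}
		stats[i].Device = fmt.Sprintf("%d:%d", stats[i].Major, stats[i].Minor)
	}
}
```

Labeling once, before any handler reads the stats, keeps every consumer on the same device name.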
Only include CPUs that were actually used in the Prometheus metrics
endpoint. In recent kernels the cpuacct.statcpus behavior has changed to
include all possible CPUs. This can result in a high number of stale
metrics in the Prometheus endpoint.
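
A minimal sketch of the filtering idea in Go. It assumes "used" means a non-zero cumulative usage value; the `cpuSample` type and the `cpu%02d` label format are hypothetical choices for this example.

```go
package main

import "fmt"

// cpuSample is a hypothetical (label, value) pair for one exported series.
type cpuSample struct {
	Label string
	Value uint64
}

// usedCPUSamples returns one sample per CPU with non-zero usage.
// CPUs that the kernel lists but that were never used are skipped,
// so they do not show up as stale series.
func usedCPUSamples(perCPUUsage []uint64) []cpuSample {
	samples := make([]cpuSample, 0, len(perCPUUsage))
	for i, usage := range perCPUUsage {
		if usage == 0 {
			continue // assumption: zero cumulative usage means "not used"
		}
		samples = append(samples, cpuSample{
			Label: fmt.Sprintf("cpu%02d", i),
			Value: usage,
		})
	}
	return samples
}
```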
This change generalizes the existing ContainerNameToLabelsFunc to allow the user to fully control all labels attached to exported Prometheus metrics. The existing behavior is available as DefaultContainerLabelsFunc and is used if no custom function is provided.
This will allow Kubernetes to filter out its internal Docker labels.
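
The shape of such a hook could look like the following Go sketch. All names here (`containerInfo`, `labelsFunc`, `defaultLabels`, `dropLabelPrefix`, `resolveLabels`) and the `container_label_` prefix are hypothetical simplifications, not the actual API of the change.

```go
package main

import "strings"

// containerInfo is a simplified, hypothetical view of a container.
type containerInfo struct {
	Name   string
	Labels map[string]string
}

// labelsFunc decides every label attached to a container's metrics.
type labelsFunc func(c containerInfo) map[string]string

// defaultLabels mimics an "export everything" default.
func defaultLabels(c containerInfo) map[string]string {
	out := map[string]string{"name": c.Name}
	for k, v := range c.Labels {
		out["container_label_"+k] = v
	}
	return out
}

// dropLabelPrefix builds a custom labelsFunc that hides labels whose
// key starts with prefix, e.g. labels internal to an orchestrator.
func dropLabelPrefix(prefix string) labelsFunc {
	return func(c containerInfo) map[string]string {
		out := map[string]string{"name": c.Name}
		for k, v := range c.Labels {
			if strings.HasPrefix(k, prefix) {
				continue
			}
			out["container_label_"+k] = v
		}
		return out
	}
}

// resolveLabels falls back to the default when no custom function is given.
func resolveLabels(f labelsFunc) labelsFunc {
	if f == nil {
		return defaultLabels
	}
	return f
}
```

Passing `nil` keeps the previous behavior, so existing callers do not need to change.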
If CPU quota is configured (cpu.cfs_quota_us != -1), the CFS provides
stats about elapsed periods and throttling in cpu.stat. This change
makes this information available as container_cpu_cfs_* metrics.
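
For reference, a minimal Go sketch of reading the three CFS fields from the contents of a cgroup v1 `cpu.stat` file. The `cfsStats` type and `parseCPUStat` function are hypothetical names for this example; the real change may obtain these values through a different path.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// cfsStats holds the CFS bandwidth fields found in cpu.stat.
type cfsStats struct {
	Periods          uint64 // nr_periods: elapsed enforcement periods
	ThrottledPeriods uint64 // nr_throttled: periods in which the group was throttled
	ThrottledTime    uint64 // throttled_time: total throttled time in nanoseconds
}

// parseCPUStat parses "key value" lines and ignores unknown keys.
func parseCPUStat(content string) (cfsStats, error) {
	var s cfsStats
	targets := map[string]*uint64{
		"nr_periods":     &s.Periods,
		"nr_throttled":   &s.ThrottledPeriods,
		"throttled_time": &s.ThrottledTime,
	}
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		parts := strings.Fields(sc.Text())
		if len(parts) != 2 {
			continue
		}
		dst, ok := targets[parts[0]]
		if !ok {
			continue
		}
		v, err := strconv.ParseUint(parts[1], 10, 64)
		if err != nil {
			return cfsStats{}, fmt.Errorf("parsing %s: %w", parts[0], err)
		}
		*dst = v
	}
	return s, sc.Err()
}
```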
Previously, an error during collection would stay counted in the gauge
"container_scrape_errors", making that particular metric useless.
Instead, it must be reset on every scrape.
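
The reset can be sketched in Go as follows. The `collector` type, its `fetch` field, and `errStatsUnavailable` are hypothetical; the sketch only shows that the gauge value is cleared at the start of each scrape.

```go
package main

import "errors"

// errStatsUnavailable is a hypothetical error used to simulate a failed scrape.
var errStatsUnavailable = errors.New("container stats unavailable")

// collector is a hypothetical, stripped-down metrics collector.
type collector struct {
	scrapeErrors float64      // value exported as the scrape error gauge
	fetch        func() error // hypothetical function that gathers container stats
}

// Collect runs one scrape and returns the gauge value for that scrape.
func (c *collector) Collect() float64 {
	c.scrapeErrors = 0 // reset first, so an old failure does not linger
	if err := c.fetch(); err != nil {
		c.scrapeErrors = 1
	}
	return c.scrapeErrors
}
```

With the reset in place, the gauge describes only the most recent scrape, which is what an alert on it would need.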
This adds Envs to the container spec as a metadata source. When using the
Prometheus exposition format, they are merged into the metrics' labels.
Also changed the CLI flag to docker_env_metadata_whitelist, and added
references to the whitelisted envs to the API.
Signed-off-by: Daniel Dao <dqminh@cloudflare.com>
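
A rough Go sketch of the merge described in the preceding commit message. It assumes exact-name matching against the whitelist and a `container_env_` label prefix; both are assumptions of this example, and `mergeEnvLabels` is a hypothetical helper.

```go
package main

// mergeEnvLabels returns a copy of labels extended with one label per
// whitelisted environment variable. Variables not on the whitelist are
// never exported.
func mergeEnvLabels(labels, envs map[string]string, whitelist []string) map[string]string {
	out := make(map[string]string, len(labels)+len(whitelist))
	for k, v := range labels {
		out[k] = v
	}
	for _, name := range whitelist {
		// Assumption: whitelist entries must match the variable name exactly.
		if v, ok := envs[name]; ok {
			out["container_env_"+name] = v
		}
	}
	return out
}
```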
With this change, all definitions and functionality for a given metric
are in a single place only instead of being distributed all over the
file. This makes it easier to inspect the code for correctness and to
add or change metrics.
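
One way to keep everything about a metric in one place is a table of definitions, sketched below in Go. The `containerStats` and `metricDef` types, the metric names, and `collectAll` are hypothetical and only illustrate the structure.

```go
package main

// containerStats is a hypothetical snapshot of container statistics.
type containerStats struct {
	MemoryUsageBytes uint64
	CPUSeconds       float64
}

// metricDef bundles the name, help text, type, and value extraction
// of a single metric, so nothing about it lives elsewhere in the file.
type metricDef struct {
	name     string
	help     string
	kind     string // "gauge" or "counter"
	getValue func(s containerStats) float64
}

// metricDefs is the single list to edit when adding or changing a metric.
var metricDefs = []metricDef{
	{
		name:     "example_memory_usage_bytes",
		help:     "Current memory usage in bytes.",
		kind:     "gauge",
		getValue: func(s containerStats) float64 { return float64(s.MemoryUsageBytes) },
	},
	{
		name:     "example_cpu_usage_seconds_total",
		help:     "Cumulative CPU time consumed in seconds.",
		kind:     "counter",
		getValue: func(s containerStats) float64 { return s.CPUSeconds },
	},
}

// collectAll evaluates every definition against one stats snapshot.
func collectAll(s containerStats) map[string]float64 {
	out := make(map[string]float64, len(metricDefs))
	for _, m := range metricDefs {
		out[m.name] = m.getValue(s)
	}
	return out
}
```

Adding a metric then means appending one entry to `metricDefs`, with no separate describe and collect code paths to keep in sync.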