The structure is generic to support most hardware accelerators like
GPUs, TPUs etc.
Note that the prometheus label for id is called acc_id, so that it
doesn't conflict with some other label that maybe called id.
PerDiskStats reported from cgroups were not being surfaced into
prometheus. In order to properly correlate the metrics, we need to
assign a device label to each metric (which is the FS or device path).
Since blkio cgroup tracks devices, we create a synthetic device
`/dev/NAME` for the metric.
Assign a Device label to each PerDiskStat for the handlers up front, and
then surface the PerDiskStat values into the prometheus metrics. Report
two new metrics - total bytes read and total bytes written.
If CPU quota is configured (cpu.cfs_quota != -1) the CFS will provide
stats about elapsed periods and throtting in cpu.stats. This change
makes these information available as container_cpu_cfs_* metrics.
This add Envs to container spec as a metadata source. When using prometheus
exposition format, they will be merged into the list of metrics' labels.
Also changed the cli flag to docker_env_metadata_whitelist, and add refenrences
of whitelist envs to API
Signed-off-by: Daniel Dao <dqminh@cloudflare.com>
Remove spec-related fields from stat.
We can simplify the stats a bit further by handling Int and Float better.
But this was big enough change already.
Verified v1 and v2 spec/stats/appmetrics APIs.
The current capacity and usage numbers are insufficient to figure out
actual bytes available for a non-root user for the fs. Available is the
value used by df and the one we need to track to detect low diskspace
condition.