Per-CPU stats are more expensive to transport and store, and that
level of detail is not required in many cases.
We export the overall CPU total in the same metric as the per-CPU values, so
that dashboards which previously summed over cpu will work identically.
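
A minimal sketch of the idea, not the actual exporter code: the sample type,
the function names and the "total" label value are assumptions made for
illustration. It also assumes that only the total is emitted when per-CPU
stats are disabled, so summing over the cpu label yields the same number
either way.

```go
package main

import "fmt"

// sample is a hypothetical (label value, metric value) pair for one series.
type sample struct {
	cpu   string
	value float64
}

// cpuSamples returns one series per CPU when per-CPU stats are enabled,
// and a single series labelled cpu="total" otherwise (assumed label value).
func cpuSamples(perCPU []float64, perCPUEnabled bool) []sample {
	if !perCPUEnabled {
		var total float64
		for _, v := range perCPU {
			total += v
		}
		return []sample{{cpu: "total", value: total}}
	}
	out := make([]sample, 0, len(perCPU))
	for i, v := range perCPU {
		out = append(out, sample{cpu: fmt.Sprintf("cpu%02d", i), value: v})
	}
	return out
}

// sumOverCPU mimics a dashboard query that sums the metric over the cpu label.
func sumOverCPU(samples []sample) float64 {
	var sum float64
	for _, s := range samples {
		sum += s.value
	}
	return sum
}

func main() {
	usage := []float64{1.5, 2.5}
	fmt.Println(sumOverCPU(cpuSamples(usage, true)))
	fmt.Println(sumOverCPU(cpuSamples(usage, false)))
}
```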
The structure is generic enough to support most hardware accelerators, such
as GPUs and TPUs.
Note that the Prometheus label for id is called acc_id, so that it
doesn't conflict with some other label that may be called id.
If a CPU quota is configured (cpu.cfs_quota_us != -1), CFS will provide
stats about elapsed periods and throttling in cpu.stat. This change
makes this information available as container_cpu_cfs_* metrics.
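
As a rough illustration of where those numbers come from, the sketch below
parses the three cgroup v1 fields of the kernel's cpu.stat file
(nr_periods, nr_throttled, throttled_time). The CFSStats type and the
parseCPUStat helper are hypothetical names, not the ones used in this change.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// CFSStats holds the CFS bandwidth counters (hypothetical type name).
type CFSStats struct {
	Periods          uint64 // nr_periods: enforcement periods elapsed
	ThrottledPeriods uint64 // nr_throttled: periods in which the group was throttled
	ThrottledTimeNs  uint64 // throttled_time: total throttled time in nanoseconds
}

// parseCPUStat reads "key value" lines and skips keys it does not know.
func parseCPUStat(content string) (CFSStats, error) {
	var s CFSStats
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		var dst *uint64
		switch fields[0] {
		case "nr_periods":
			dst = &s.Periods
		case "nr_throttled":
			dst = &s.ThrottledPeriods
		case "throttled_time":
			dst = &s.ThrottledTimeNs
		default:
			continue
		}
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return s, fmt.Errorf("parsing %q: %w", sc.Text(), err)
		}
		*dst = v
	}
	return s, sc.Err()
}

func main() {
	s, err := parseCPUStat("nr_periods 120\nnr_throttled 7\nthrottled_time 350000000\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", s)
}
```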
collection would stay counted in the gauge "container_scrape_errors",
making that particular metric useless. Instead, it must be reset on
every scrape.
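
The reset can be sketched as follows. This is a simplified stand-in, not the
real collector: the scraper type, its field and the errCollect value are
hypothetical, and the gauge is modelled as a plain float64.

```go
package main

import (
	"errors"
	"fmt"
)

// errCollect is a hypothetical error used to simulate a failed collection.
var errCollect = errors.New("collection failed")

// scraper stands in for a collector that exposes a scrape-error gauge.
type scraper struct {
	scrapeErrors float64 // value exposed as the gauge
}

// scrape resets the gauge before collecting, so an error from an earlier
// scrape cannot leak into this one, then records a failure if one occurs.
func (s *scraper) scrape(collect func() error) {
	s.scrapeErrors = 0
	if err := collect(); err != nil {
		s.scrapeErrors = 1
	}
}

func main() {
	s := &scraper{}
	s.scrape(func() error { return errCollect })
	fmt.Println("after failing scrape:", s.scrapeErrors)
	s.scrape(func() error { return nil })
	fmt.Println("after clean scrape:", s.scrapeErrors)
}
```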
This adds Envs to the container spec as a metadata source. When using the
Prometheus exposition format, they will be merged into the list of metrics'
labels.
Also changed the CLI flag to docker_env_metadata_whitelist, and added
references to the whitelisted envs to the API.
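
A hedged sketch of the filtering and merging described above. The helper
names, the exact-match whitelist semantics and the "container_env_" label
prefix are assumptions for illustration and may differ from the real code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// filterEnvs keeps only the "KEY=VALUE" entries whose key is whitelisted.
// Exact key matching is an assumption of this sketch.
func filterEnvs(envs, whitelist []string) map[string]string {
	allowed := make(map[string]bool, len(whitelist))
	for _, k := range whitelist {
		allowed[k] = true
	}
	out := make(map[string]string)
	for _, e := range envs {
		k, v, ok := strings.Cut(e, "=")
		if ok && allowed[k] {
			out[k] = v
		}
	}
	return out
}

// mergeLabels copies the base labels and adds one label per env variable.
// The "container_env_" prefix is an assumed naming scheme.
func mergeLabels(base, envs map[string]string) map[string]string {
	merged := make(map[string]string, len(base)+len(envs))
	for k, v := range base {
		merged[k] = v
	}
	for k, v := range envs {
		merged["container_env_"+strings.ToLower(k)] = v
	}
	return merged
}

func main() {
	envs := filterEnvs([]string{"APP=web", "SECRET=hunter2"}, []string{"APP"})
	labels := mergeLabels(map[string]string{"id": "/docker/abc"}, envs)

	// Print in a stable order, since Go map iteration order is random.
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%q\n", k, labels[k])
	}
}
```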
Signed-off-by: Daniel Dao <dqminh@cloudflare.com>