cadvisor

Author	SHA1	Message	Date
WanLinghao	4eab5b671e	Add support to disable diskIO metrics	2019-01-15 09:43:33 +08:00
Davanum Srinivas	4da6d809be	Move from glog to klog Change-Id: Ic92f57c2d7f268d8d985797974883c1a537d6993	2018-11-08 18:06:28 -05:00
Sashank Appireddy	da29418c31	cache process metrics	2018-11-06 13:29:14 -08:00
David Ashpole	2fa6c624a2	Merge pull request #2034 from usabilla/mapped_file Adds mapped_file metric	2018-08-29 10:25:29 -07:00
Gijs Kunze	9e175e9ea9	Adds mapped_file metric	2018-08-09 15:14:46 +02:00
Valentyn Boginskey	b09b68c4a9	Fix cache reporting with cgroup hierarchy	2018-07-28 07:20:42 -04:00
David Ashpole	c225d06adf	don't emit prometheus metrics that are ignored	2018-07-09 13:17:49 -07:00
nielsole	08f0c2397c	Adding /proc/<pid>/schedstat (#1872) Add /proc/<pid>/schedstat metrics for scheduler metrics	2018-03-08 09:27:06 -08:00
David Ashpole	e1d602d7af	create libcontainer handler for common code	2018-02-21 08:53:42 -08:00
Bryan Boreham	ec6da3acae	Prometheus metrics: optionally export total CPU instead of per-CPU Per-CPU stats are more expensive to transport and store, and that level of detail is not required in many cases. We export overall total cpu in the same metric as per-cpu, so that dashboards which previously summed over cpu will work identically.	2018-02-20 13:58:44 +00:00
Euan Kemp	1ecd24ea8d	libcontainer: Use first cgroup subsystem found (#1792) libcontainer: Use first cgroup subsystem found	2017-11-06 15:33:59 -08:00
Rohit Agarwal	4a35130019	Collect container-level GPU metrics using NVML. When cAdvisor starts up, it would read the `vendor` files in `/sys/bus/pci/devices/*` to see if any NVIDIA devices (vendor ID: 0x10de) are attached to the node. If no NVIDIA devices are found, this code path would become dormant for the rest of cAdvisor lifetime. If NVIDIA devices are found, we would start a goroutine that would check for the presence of NVML by trying to dynamically load it at regular intervals. We need to do this regular checking instead of doing it just once because it may happen that cAdvisor is started before the NVIDIA drivers and NVML are installed. Once the NVML dynamic loading succeeds, we would use NVML’s query methods to find out how many devices exist on the node and create a map from their minor numbers to their handles and cache that map. The goroutine would exit at this point. If we detected the presence of NVML in the previous step, whenever a new container is detected by cAdvisor, cAdvisor would read the `devices.list` file from the container's devices cgroup. The `devices.list` file lists the major:minor number of all the devices that the container is allowed to access. If we find any device with major number 195 (which is the major number assigned to NVIDIA devices), we would cache the list of corresponding minor numbers for that container. During every housekeeping operation, in addition to collecting all the existing metrics, we will use the cached NVIDIA device minor numbers and the map from minor numbers to device handles to get metrics for GPU devices attached to the container.	2017-11-06 11:54:59 -08:00
Euan Kemp	587691c7f3	libcontainer: ignore nil cpustats Cadvisor can inotify watch for new cgroups. This leads to it racing fairly tightly with cgroup creation... So tightly, that sometimes cpustats are nil. The runc library code we call (https://github.com/opencontainers/runc/blob/v1.0.0-rc4/libcontainer/cgroups/fs/apply_raw.go#L179-L182) doesn't actually consider this an error, so we have to handle that scenario ourselves. This fixes https://github.com/google/cadvisor/issues/1765	2017-10-20 13:08:23 -07:00
Derek Carr	9ea61176bf	Expose memory.max_usage_in_bytes in container stats	2017-10-10 17:31:31 -04:00
Euan Kemp	d2e11efba2	libcontainer: use real number of CPUs for usage As of the 4.7 kernel, the cpustats field returned from libcontainer contains values for every possible cpu (including nonexistent ones). The extra values are all 0s. If we assume that hotplug events won't happen, we can get a more accurate cpu count by using runtime.NumCPU and then ignoring any values beyond that.	2017-08-30 14:26:26 -07:00
Derek Carr	6fa48d9048	Expose total_rss when hierarchy is enabled	2017-08-23 14:56:59 -04:00
Derek Carr	d493f11f0b	Reduce log spam when unable to get network stats	2017-08-18 16:11:03 -04:00
Tristan Colgate	227bb3a79d	Add udp and udp6 network statistics	2017-04-10 20:41:51 +01:00
derekwaynecarr	b84046f12c	Look at all cgroup mounts	2016-09-22 15:34:59 -04:00
Florian Koch	3ce98a46c4	add cgroup swap usage and export as prometheus metric	2016-08-09 07:33:37 +02:00
Tobias Schmidt	1653733ea7	Expose cpu cgroup CFS prometheus metrics If CPU quota is configured (cpu.cfs_quota != -1) the CFS will provide stats about elapsed periods and throttling in cpu.stats. This change makes this information available as container_cpu_cfs_* metrics.	2016-08-06 18:08:26 -04:00
Michael Taufen	307d1b1cb3	Modify working set memory stats calculation Change working set calculation to usage - total_inactive_file, rather than usage - total_inactive_anon - total_inactive_file. Since writes to tmpfs get tracked as total_inactive_anon when swap is disabled, the old calculation would under-report memory pressure. See this Kubernetes issue for context: https://github.com/kubernetes/kubernetes/issues/28619	2016-07-15 10:58:25 -07:00
Tim St. Clair	4c506006f2	Don't validate docker state file, since it's no longer used	2016-05-06 19:29:24 -07:00
Tim St. Clair	dc6415aef7	Check docker container existence the same way as raw & rkt	2016-04-15 11:35:31 -07:00
Tim St. Clair	4a8f3e4c93	Read docker container spec from cgroupfs, rather than libcontainer spec	2016-04-14 17:10:03 -07:00
Tim St. Clair	7b1820b1d4	Look for container state in containerd path	2016-04-13 15:09:08 -07:00
Vishnu kannan	e2717d8bb7	Avoid collecting network stats for non root cgroups in raw handler. Signed-off-by: Vishnu kannan <vishnuk@google.com>	2016-03-15 12:16:11 -07:00
Vishnu kannan	36415f465a	Support opt out for metrics. Signed-off-by: Vishnu kannan <vishnuk@google.com>	2016-02-24 15:57:31 -08:00
Jimmi Dyson	33386f899b	bump(github.com/opencontainers/runc/libcontainer) Fixes issues with breaking changes to `GetPids` which is affecting downstream consumers of cadvisor (e.g. Kubernetes).	2016-01-26 09:46:59 +00:00
Shimin Guo	a26b58ec8e	expose page cache size	2016-01-15 08:45:51 -08:00
Shimin Guo	1a867bdadd	expose RSS	2016-01-15 08:45:51 -08:00
Lei Xue	15b34b0131	add test case for compatibility.go	2015-12-02 11:01:50 +08:00
Lei Xue	7343ae4583	fix unmarshal container config failure with Docker 1.8.3	2015-12-02 11:01:12 +08:00
Lei Xue	dbbe38dfed	re-order the import package	2015-11-30 16:43:22 +08:00
Jimmi Dyson	82810f13cd	Remove unused code (via deadcode linter)	2015-11-27 21:48:33 +00:00
Jimmi Dyson	360c73c6fd	Improve perf of interface stats parsing	2015-11-27 14:12:41 +00:00
Jimmi Dyson	f9eb56e800	Merge pull request #966 from afein/godep_update_runc [Godeps] changed docker/libcontainer dependency to runc/libcontainer	2015-11-26 15:19:28 +00:00
Jimmi Dyson	d1fce20304	Regexp tidy up	2015-11-26 09:14:26 +00:00
Alex Mavrogiannis	4533dd7d18	changed libcontainer dependency to runc	2015-11-21 14:04:01 -08:00
Jimmi Dyson	561cc1da4f	Use file reader directly for net stats	2015-10-28 12:51:19 +00:00
Jimmi Dyson	c72e0c23a5	Add test for net dev stats	2015-10-28 12:51:13 +00:00
Jimmi Dyson	da771a0977	Drop regexp for net stats parsing Reported in kubernetes/kubernetes#16296	2015-10-27 20:16:49 +00:00
Jimmi Dyson	8b6e002e0a	Disable tcp stats collection Fixes #938	2015-10-22 21:05:46 +01:00
Jimmi Dyson	5a5d0575f5	Docker, libcontainer, docker client bumps	2015-10-20 09:22:12 +01:00
Tomas Kral	bd61caf0c3	add failcnt	2015-10-02 14:24:22 +02:00
Florian Koch	e4262b91b1	move TCP and TCP6 stats to NetworkStats	2015-09-25 09:04:53 +02:00
Florian Koch	dd041457b5	some fixes	2015-09-24 15:44:42 +02:00
Florian Koch	c331982f21	add tcp/tcp6 statistics	2015-09-24 15:44:42 +02:00
Jimmi Dyson	7e10398a50	Use proc fs to get network stats. Reasons discussed in https://github.com/google/cadvisor/issues/822#issuecomment-135811901 & following.	2015-08-29 00:20:07 +01:00
Jimmi Dyson	d5fa97c998	Get network stats by switching network namespace on newer Docker versions. Fixes #822	2015-08-25 23:27:01 +01:00

81 Commits