2334 Commits

Author SHA1 Message Date
David Ashpole
e6b4e4c38a changelog for v0.28.3 2017-12-07 09:46:35 -08:00
David Ashpole
fc6d4b920c
Merge pull request #1830 from jsravn/add-docker-timeouts
Add timeouts for docker queries
2017-12-07 09:37:53 -08:00
James Ravn
57e17d8be2 Add timeouts for docker queries
As these can otherwise block indefinitely due to docker issues.

This is to fix https://github.com/kubernetes/kubernetes/issues/53207,
where kubelet relies on cadvisor for gathering docker information as
part of its periodic node status update.
2017-12-05 13:50:48 +00:00
David Ashpole
0bde1c615c
Merge pull request #1831 from brian-brazil/prometheus-labels
Ensure all Prometheus metrics have the same labelnames.
2017-11-30 09:54:44 -08:00
Brian Brazil
27f103b266 Ensure all Prometheus metrics have the same labelnames.
Fixes #1704
2017-11-30 16:33:37 +00:00
David Ashpole
7d11f4243f
Merge pull request #1827 from tallclair/logging
Clean up cAdvisor logging
2017-11-29 10:16:58 -08:00
David Ashpole
b26bf6ebb2
Merge pull request #1826 from mindprince/gpu-docs
Add docs for using nvidia gpu monitoring.
2017-11-28 17:49:29 -08:00
Tim Allclair
1eb1355ae6
Default logging to V(2) 2017-11-27 19:49:49 -08:00
Tim Allclair
5b435b4b70
Clean up cAdvisor logging 2017-11-27 19:48:05 -08:00
Tim Allclair
3a40bbfc5c
Raise verbosity on runtime registration failure 2017-11-27 19:48:04 -08:00
Rohit Agarwal
6ba3fa4e8c Add docs for using nvidia gpu monitoring. 2017-11-27 17:43:14 -08:00
David Ashpole
49440c7e0a
Merge pull request #1818 from dashpole/changelog
changelog for v0.28.2
2017-11-21 16:32:31 -08:00
David Ashpole
9689d84e7f changelog for v0.28.2 2017-11-21 16:27:22 -08:00
David Ashpole
e420065e7d
Merge pull request #1817 from dashpole/util_clock
Switch from apimachinery clock to k8s.io/utils/clock
2017-11-21 16:24:51 -08:00
David Ashpole
3166cdae87 add utils/clock dependency 2017-11-21 16:19:57 -08:00
David Ashpole
3a347ec3fe Revert "add apimachinery clock dependency"
This reverts commit fd43dc16ba.
2017-11-21 14:21:47 -08:00
David Ashpole
5831d72df8
Merge pull request #1814 from mindprince/accelerator-data-race
Avoid race in accessing nvidiaDevices between Setup() and GetCollector()
2017-11-21 14:03:18 -08:00
Rohit Agarwal
3c3845e92f Avoid race in accessing nvidiaDevices between Setup() and GetCollector() 2017-11-21 13:53:47 -08:00
David Ashpole
7cb3faad02
Merge pull request #1811 from dashpole/changelog_0_28_1
changelog for v0.28.1
2017-11-20 15:13:49 -08:00
David Ashpole
1cd2620be6 changelog for v0.28.1 2017-11-20 15:08:11 -08:00
David Ashpole
17dcf1ca98
Merge pull request #1779 from dashpole/on_demand_metrics
On-Demand container metrics
2017-11-20 15:06:26 -08:00
David Ashpole
3d6ad6dd86 on demand metrics 2017-11-20 14:51:04 -08:00
David Ashpole
fd43dc16ba add apimachinery clock dependency 2017-11-20 13:15:15 -08:00
David Ashpole
ece1334172 update testify dependency 2017-11-17 16:15:28 -08:00
David Ashpole
a27bed7b9d
Merge pull request #1807 from dashpole/revert_1760
Revert "fix #1708; move from inotify to fsnotify"
2017-11-17 15:03:42 -08:00
David Ashpole
577f63f3da
Merge pull request #1808 from dashpole/update_ui
Update jquery and bootstrap dependencies
2017-11-17 14:42:58 -08:00
David Ashpole
ee8cbf1054 update jquery and bootstrap 2017-11-17 13:17:51 -08:00
David Ashpole
6988e70a3d Revert "fix #1708; move from inotify to fsnotify"
This reverts commit e6b6a1ac57.
2017-11-17 10:28:28 -08:00
David Ashpole
5231853e71
Merge pull request #1805 from andyxning/marshal_device_name_to_json_output
marshal device name to json output
2017-11-15 16:36:04 -08:00
Sławek Piotrowski
2648be083a fix #1607; use container creation time provided by Docker handler 2017-11-16 00:41:10 +01:00
David Ashpole
4466b4d9a0
Merge pull request #1795 from abhi/containerd
Containerd cadvisor integration
2017-11-15 10:24:53 -08:00
Andy Xie
9211091bdc marshal device name to json output 2017-11-15 13:45:57 +08:00
abhi
3495851c7a Updating godeps
Signed-off-by: abhi <abhi@docker.com>
2017-11-14 17:37:50 -08:00
abhi
6ad15431f4 Integrating containerd to cadvisor
This commit includes changes to integrate containerd
runtime to cadvisor to collect container stats

Signed-off-by: abhi <abhi@docker.com>

Test cases and minor changes

This commit includes test cases and minor fixes
for the same

Signed-off-by: abhi <abhi@docker.com>
2017-11-14 17:37:36 -08:00
Derek Carr
49e0496c8f
Merge pull request #1773 from runcom/crio-socket-edit
container: crio: change crio socket
2017-11-14 11:18:15 -05:00
David Ashpole
c99f2418fa
Merge pull request #1801 from mindprince/update-gonvml
Update gonvml to workaround bazelbuild/rules_go#1003.
2017-11-10 15:34:33 -08:00
Rohit Agarwal
e1b4d79992 Update gonvml to workaround bazelbuild/rules_go#1003. 2017-11-10 15:06:44 -08:00
David Ashpole
e2c25110e0
Merge pull request #1799 from abhi/grpc
Updating grpc version to v1.3.0
2017-11-10 10:39:38 -08:00
abhi
e5d8730f4a Updating grpc version to v1.3.0
This commit includes godep change to update the grpc version
and also updates rkt version to v1.25.0.
A minor change has been made in the client based on how rkt
client is used in kubernetes/kubernetes.

Signed-off-by: abhi <abhi@docker.com>
2017-11-09 17:32:18 -08:00
David Ashpole
c3090a95c7
Merge pull request #1794 from kant/patch-1
Formatting change for documentation
2017-11-07 09:25:15 -08:00
Darío Hereñú
9d7cd18598
Minor proposal 2017-11-07 00:19:17 -03:00
Euan Kemp
1ecd24ea8d libcontainer: Use first cgroup subsystem found (#1792)
libcontainer: Use first cgroup subsystem found
2017-11-06 15:33:59 -08:00
David Ashpole
3d2e7fcfa3
Merge pull request #1791 from dashpole/changelog_v0.28.0
v0.28.0 changelog
2017-11-06 15:05:29 -08:00
David Ashpole
1f3f49af11 v0.28.0 changelog 2017-11-06 15:00:10 -08:00
David Ashpole
9bc6590461
Merge pull request #1762 from mindprince/gpu-metrics-1436
Add per container GPU metrics
2017-11-06 12:34:13 -08:00
Rohit Agarwal
4a35130019 Collect container-level GPU metrics using NVML.
When cAdvisor starts up, it would read the `vendor` files in
`/sys/bus/pci/devices/*` to see if any NVIDIA devices (vendor ID: 0x10de) are
attached to the node. If no NVIDIA devices are found, this code path would
become dormant for the rest of cAdvisor's lifetime. If NVIDIA devices are found,
we would start a goroutine that would check for the presence of NVML by trying
to dynamically load it at regular intervals. We need to do this regular
checking instead of doing it just once because it may happen that cAdvisor is
started before the NVIDIA drivers and NVML are installed. Once the NVML
dynamic loading succeeds, we would use NVML’s query methods to find out how
many devices exist on the node and create a map from their minor numbers to
their handles and cache that map. The goroutine would exit at this point.

If we detected the presence of NVML in the previous step, whenever a new
container is detected by cAdvisor, cAdvisor would read the `devices.list` file
from the container's devices cgroup. The `devices.list` file lists the
major:minor number of all the devices that the container is allowed to access.
If we find any device with major number 195 (which is the major number assigned
to NVIDIA devices), we would cache the list of corresponding minor numbers for
that container.

During every housekeeping operation, in addition to collecting all the existing
metrics, we will use the cached NVIDIA device minor numbers and the map from
minor numbers to device handles to get metrics for GPU devices attached to the
container.
2017-11-06 11:54:59 -08:00
Rohit Agarwal
318f28bef6 Vendor Go bindings for NVML. Don't build a static binary.
We can't build a static binary because that would require bundling the
closed source NVML library in cAdvisor.

Instead, gonvml uses dlopen to dynamically load NVML.
2017-11-01 14:41:35 -07:00
Rohit Agarwal
126fb2232e Add accelerator metrics to the API.
The structure is generic to support most hardware accelerators like
GPUs, TPUs etc.

Note that the prometheus label for id is called acc_id, so that it
doesn't conflict with some other label that may be called id.
2017-11-01 14:41:35 -07:00
David Ashpole
31694e6e1e
Merge pull request #1786 from tklauser/utsname-x-sys-unix
Simplify Utsname string conversion
2017-10-31 10:46:13 -07:00
Tobias Klauser
24493b8458 Simplify Utsname string conversion
Use Utsname from golang.org/x/sys/unix, whose members are byte arrays
instead of int8/uint8 arrays. This simplifies the string
conversions of these members.
2017-10-31 12:14:10 +01:00
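To illustrate why byte array members make the conversion simpler, here is a small self-contained sketch. It does not import golang.org/x/sys/unix; the `[65]byte` field below is an assumption that mirrors how Utsname members are laid out on Linux, and `byteArrayToString` is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"fmt"
)

// byteArrayToString converts a NUL-padded byte buffer, such as a Utsname
// member, to a Go string by cutting at the first NUL byte.
// With int8 members, each element would first need converting to byte.
func byteArrayToString(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		b = b[:i]
	}
	return string(b)
}

func main() {
	// Stand-in for a Utsname member; the size is assumed.
	var release [65]byte
	copy(release[:], "4.13.0-generic")
	fmt.Println(byteArrayToString(release[:])) // prints "4.13.0-generic"
}
```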