Commit Graph

2249 Commits

Author SHA1 Message Date
abhi
6ad15431f4 Integrating containerd to cadvisor
This commit integrates the containerd runtime into cAdvisor
so that it can collect container stats.

Signed-off-by: abhi <abhi@docker.com>

Test cases and minor changes

This commit includes test cases and minor fixes
for the same change.

Signed-off-by: abhi <abhi@docker.com>
2017-11-14 17:37:36 -08:00
David Ashpole
c99f2418fa
Merge pull request #1801 from mindprince/update-gonvml
Update gonvml to work around bazelbuild/rules_go#1003.
2017-11-10 15:34:33 -08:00
Rohit Agarwal
e1b4d79992 Update gonvml to work around bazelbuild/rules_go#1003. 2017-11-10 15:06:44 -08:00
David Ashpole
e2c25110e0
Merge pull request #1799 from abhi/grpc
Updating grpc version to v1.3.0
2017-11-10 10:39:38 -08:00
abhi
e5d8730f4a Updating grpc version to v1.3.0
This commit includes godep change to update the grpc version
and also updates rkt version to v1.25.0.
A minor change has been made in the client based on how rkt
client is used in kubernetes/kubernetes.

Signed-off-by: abhi <abhi@docker.com>
2017-11-09 17:32:18 -08:00
David Ashpole
c3090a95c7
Merge pull request #1794 from kant/patch-1
Formatting change for documentation
2017-11-07 09:25:15 -08:00
Darío Hereñú
9d7cd18598
Minor proposal 2017-11-07 00:19:17 -03:00
Euan Kemp
1ecd24ea8d libcontainer: Use first cgroup subsystem found (#1792)
2017-11-06 15:33:59 -08:00
David Ashpole
3d2e7fcfa3
Merge pull request #1791 from dashpole/changelog_v0.28.0
v0.28.0 changelog
2017-11-06 15:05:29 -08:00
David Ashpole
1f3f49af11 v0.28.0 changelog 2017-11-06 15:00:10 -08:00
David Ashpole
9bc6590461
Merge pull request #1762 from mindprince/gpu-metrics-1436
Add per container GPU metrics
2017-11-06 12:34:13 -08:00
Rohit Agarwal
4a35130019 Collect container-level GPU metrics using NVML.
When cAdvisor starts up, it reads the `vendor` files in
`/sys/bus/pci/devices/*` to see if any NVIDIA devices (vendor ID: 0x10de) are
attached to the node. If no NVIDIA devices are found, this code path stays
dormant for the rest of cAdvisor's lifetime. If NVIDIA devices are found,
cAdvisor starts a goroutine that checks for the presence of NVML by trying
to dynamically load it at regular intervals. The check has to be repeated
rather than done once because cAdvisor may be started before the NVIDIA
drivers and NVML are installed. Once the NVML dynamic loading succeeds,
cAdvisor uses NVML’s query methods to find out how many devices exist on the
node, creates a map from their minor numbers to their handles, and caches
that map. The goroutine exits at this point.

If NVML was detected in the previous step, then whenever cAdvisor detects a
new container, it reads the `devices.list` file from the container's devices
cgroup. The `devices.list` file lists the major:minor number of all the
devices that the container is allowed to access. If any device has major
number 195 (which is the major number assigned to NVIDIA devices), cAdvisor
caches the list of corresponding minor numbers for that container.

During every housekeeping operation, in addition to collecting all the existing
metrics, cAdvisor uses the cached NVIDIA device minor numbers and the map from
minor numbers to device handles to get metrics for GPU devices attached to the
container.
2017-11-06 11:54:59 -08:00
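The detection steps in the commit message above can be sketched in Go. This is a minimal illustration under stated assumptions, not cAdvisor's actual code: the function names `isNvidiaVendor` and `parseNvidiaMinors` are hypothetical, and the sample `devices.list` content is made up for the example. Only the vendor ID 0x10de and the major number 195 come from the commit message.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// nvidiaMajor is the device major number the commit message assigns to NVIDIA devices.
const nvidiaMajor = 195

// isNvidiaVendor reports whether the content of a PCI `vendor` file
// (for example "0x10de\n") names the NVIDIA vendor ID.
// Hypothetical helper for illustration.
func isNvidiaVendor(vendorFileContent string) bool {
	return strings.ToLower(strings.TrimSpace(vendorFileContent)) == "0x10de"
}

// parseNvidiaMinors scans the text of a devices.list file, whose lines look
// like "c 195:0 rwm", and returns the minor numbers of entries whose major
// number is 195. Wildcard entries such as "a *:* rwm" are skipped because
// they do not name a specific device.
func parseNvidiaMinors(devicesList string) []int {
	var minors []int
	scanner := bufio.NewScanner(strings.NewReader(devicesList))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) != 3 {
			continue // not a "type major:minor access" line
		}
		majorMinor := strings.SplitN(fields[1], ":", 2)
		if len(majorMinor) != 2 {
			continue
		}
		major, err := strconv.Atoi(majorMinor[0])
		if err != nil || major != nvidiaMajor {
			continue
		}
		minor, err := strconv.Atoi(majorMinor[1])
		if err != nil {
			continue // wildcard minor, e.g. "195:*"
		}
		minors = append(minors, minor)
	}
	return minors
}

func main() {
	// Made-up sample content for the example.
	sample := "c 1:3 rwm\nc 195:0 rwm\nc 195:1 rw\n"
	fmt.Println(isNvidiaVendor("0x10de\n")) // true
	fmt.Println(parseNvidiaMinors(sample))  // [0 1]
}
```

The cached minor numbers would then be looked up in the minor-number-to-handle map during housekeeping, as the last paragraph of the commit message describes.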
Rohit Agarwal
318f28bef6 Vendor Go bindings for NVML. Don't build a static binary.
We can't build a static binary because that would require bundling the
closed source NVML library in cAdvisor.

Instead, gonvml uses dlopen to dynamically load NVML.
2017-11-01 14:41:35 -07:00
Rohit Agarwal
126fb2232e Add accelerator metrics to the API.
The structure is generic to support most hardware accelerators like
GPUs, TPUs etc.

Note that the Prometheus label for id is called acc_id, so that it
doesn't conflict with some other label that may be called id.
2017-11-01 14:41:35 -07:00
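To illustrate the label naming choice, here is a small Go sketch. The `acceleratorInfo` struct, its `Make` and `Model` fields, and the `acceleratorLabels` function are hypothetical names chosen for this example, not the API that the commit adds. Only the `acc_id` label name is taken from the commit message.

```go
package main

import "fmt"

// acceleratorInfo is a hypothetical, vendor-neutral description of an
// accelerator (GPU, TPU, and so on). The field set is an assumption.
type acceleratorInfo struct {
	Make  string
	Model string
	ID    string
}

// acceleratorLabels builds a metric label set for one accelerator.
// The ID is exported under "acc_id" rather than "id" so that it cannot
// collide with an existing label named "id".
func acceleratorLabels(a acceleratorInfo) map[string]string {
	return map[string]string{
		"make":   a.Make,
		"model":  a.Model,
		"acc_id": a.ID,
	}
}

func main() {
	labels := acceleratorLabels(acceleratorInfo{Make: "nvidia", Model: "example-model", ID: "GPU-0"})
	fmt.Println(labels["acc_id"]) // GPU-0
}
```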
David Ashpole
31694e6e1e
Merge pull request #1786 from tklauser/utsname-x-sys-unix
Simplify Utsname string conversion
2017-10-31 10:46:13 -07:00
Tobias Klauser
24493b8458 Simplify Utsname string conversion
Use Utsname from golang.org/x/sys/unix, whose members are byte arrays
instead of int8/uint8 arrays. This simplifies the string conversions
of these members.
2017-10-31 12:14:10 +01:00
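The simplification can be illustrated with a self-contained Go sketch that uses only the standard library. The helper names `int8ArrayToString` and `byteArrayToString` are hypothetical, and the 65-byte array merely stands in for a NUL-terminated Utsname member; the sketch does not call golang.org/x/sys/unix itself.

```go
package main

import (
	"bytes"
	"fmt"
)

// int8ArrayToString shows the older style of conversion: each int8 has to
// be converted to a byte one element at a time until the NUL terminator.
func int8ArrayToString(a []int8) string {
	buf := make([]byte, 0, len(a))
	for _, c := range a {
		if c == 0 {
			break
		}
		buf = append(buf, byte(c))
	}
	return string(buf)
}

// byteArrayToString shows the simpler conversion that byte arrays allow:
// cut at the first NUL byte and convert the slice directly.
func byteArrayToString(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		return string(b[:i])
	}
	return string(b)
}

func main() {
	// Stand-in for a NUL-padded field such as a kernel release string.
	var release [65]byte
	copy(release[:], "4.13.0")
	fmt.Println(byteArrayToString(release[:])) // 4.13.0
}
```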
Tobias Klauser
e5135c223d Update golang.org/x/sys dependency 2017-10-31 12:13:43 +01:00
David Ashpole
6d3841c68a
Merge pull request #1784 from majst01/misspelled-errors
Fix wrong error checking in fsHandler.go
2017-10-30 10:02:08 -07:00
David Ashpole
bfb7e13720
Merge pull request #1785 from sjenning/fix-msg
fix long du duration message
2017-10-30 09:57:31 -07:00
Seth Jennings
cc77f13a2b fix long du message 2017-10-30 10:00:47 -05:00
Stefan Majer
d5e2ffbef7 Fix wrong error checking in fsHandler.go 2017-10-30 07:44:57 +01:00
David Ashpole
eaa3532918
Merge pull request #1755 from bakins/update-zfs
Update go-zfs dependency
2017-10-28 12:38:18 -07:00
Brian Akins
752f2cd121 Update go-zfs dependency
2017-10-28 09:34:43 -04:00
David Ashpole
53820123e6 Merge pull request #1336 from ronnielai/test
Don't rely on the returned value when there's an error
2017-10-24 15:56:55 -07:00
David Ashpole
479e4a97ed Merge pull request #1780 from majst01/master
Fix Memory leak in fs.go
2017-10-24 11:20:52 -07:00
Stefan Majer
4a778288ee Stop AfterFunc timer after findCmd.Wait regardless of errors to prevent memory leak 2017-10-24 15:18:07 +02:00
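The pattern named in the commit title above can be sketched as follows. This is an assumed, simplified shape rather than the code in fs.go: `runWithTimeout`, `wait`, and `onTimeout` are hypothetical names, with `wait` standing in for a blocking call such as a command's Wait method.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// runWithTimeout arms a timer that calls onTimeout if wait takes longer
// than timeout. The timer is stopped as soon as wait returns, whether or
// not wait reported an error, so the pending timer and everything its
// callback references can be released instead of lingering until it fires.
func runWithTimeout(wait func() error, timeout time.Duration, onTimeout func()) error {
	timer := time.AfterFunc(timeout, onTimeout)
	err := wait()
	timer.Stop() // runs on the error path too
	return err
}

func main() {
	err := runWithTimeout(
		func() error { return errors.New("command failed") },
		time.Hour,
		func() { fmt.Println("timeout: killing command") },
	)
	fmt.Println("wait returned:", err)
}
```

Stopping the timer only on the success path would leave one live timer behind for every failed call, which is the kind of leak the commit title describes.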
David Ashpole
99716b05db Merge pull request #1771 from runcom/fix-overlay2-crio
CRI-O: fix handling of overlay2 storage
2017-10-23 10:16:02 -07:00
Antonio Murdaca
088aaf1a32
CRI-O: fix handling of overlay2 storage
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-10-21 12:05:40 +02:00
David Ashpole
65f8fdd877 Merge pull request #1769 from euank/ignore-nil-cpustats
libcontainer: ignore nil cpustats
2017-10-20 13:38:22 -07:00
Euan Kemp
587691c7f3 libcontainer: ignore nil cpustats
cAdvisor can use inotify to watch for new cgroups. This leads to it racing
fairly tightly with cgroup creation, so tightly that sometimes
cpustats are nil.

The runc library code we call
(https://github.com/opencontainers/runc/blob/v1.0.0-rc4/libcontainer/cgroups/fs/apply_raw.go#L179-L182)
doesn't actually consider this an error, so we have to handle that
scenario ourselves.

This fixes https://github.com/google/cadvisor/issues/1765
2017-10-20 13:08:23 -07:00
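A nil guard of the kind this commit message describes might look like the sketch below. The types `cpuStats` and `cgroupStats` and the function `cpuTotalUsage` are hypothetical stand-ins invented for the example; they are not the runc or cAdvisor types.

```go
package main

import "fmt"

// cpuStats and cgroupStats are hypothetical stand-ins for the stats
// structures returned by the cgroup library.
type cpuStats struct {
	TotalUsage uint64
}

type cgroupStats struct {
	CPU *cpuStats // may be nil if the cgroup was read while still being created
}

// cpuTotalUsage returns the CPU usage and true when stats are present.
// It returns false instead of dereferencing a nil pointer when the stats
// are missing, so the caller can skip this sample.
func cpuTotalUsage(s *cgroupStats) (uint64, bool) {
	if s == nil || s.CPU == nil {
		return 0, false
	}
	return s.CPU.TotalUsage, true
}

func main() {
	usage, ok := cpuTotalUsage(&cgroupStats{})
	fmt.Println(usage, ok) // 0 false

	usage, ok = cpuTotalUsage(&cgroupStats{CPU: &cpuStats{TotalUsage: 42}})
	fmt.Println(usage, ok) // 42 true
}
```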
Derek Carr
90bb0524fe Merge pull request #1770 from dashpole/fix_overlay2
Monitor diff directory for overlay2
2017-10-18 11:39:09 -04:00
David Ashpole
b7959da460 monitor diff directory for overlay2 2017-10-11 14:13:42 -07:00
David Ashpole
790b787399 Merge pull request #1768 from derekwaynecarr/max_usage_in_bytes
Container stats expose memory max usage
2017-10-10 14:53:11 -07:00
Derek Carr
9ea61176bf Expose memory.max_usage_in_bytes in container stats 2017-10-10 17:31:31 -04:00
Derek Carr
3e659ec0d1 Merge pull request #1766 from sjenning/adapt-long-du
adaptive longOp for du operation
2017-10-05 10:36:37 -04:00
Seth Jennings
fd9c6d2fde adaptive longOp for du operation 2017-10-05 09:22:54 -05:00
David Ashpole
b9ab5d6ba9 Merge pull request #1756 from dashpole/remove_engine_api
Move off of docker/engine-api
2017-09-28 14:37:22 -07:00
David Ashpole
888a529088 fix #1743; move off of docker/engine-api 2017-09-28 11:05:13 -07:00
David Ashpole
f1c4a432ac Merge pull request #1760 from dashpole/off_inotify
Move from inotify to fsnotify
2017-09-28 11:04:52 -07:00
David Ashpole
e6b6a1ac57 fix #1708; move from inotify to fsnotify 2017-09-28 10:57:49 -07:00
David Ashpole
8a59b6d8cf Merge pull request #1759 from dashpole/clarify_memory_usage
Update memory usage helper message
2017-09-28 10:57:26 -07:00
David Ashpole
1dcd0cee2b update description of memory usage 2017-09-28 10:48:07 -07:00
David Ashpole
ba91527651 Merge pull request #1763 from mindprince/cleanup
Minor cleanup.
2017-09-27 11:30:53 -07:00
Rohit Agarwal
919b369923 Minor cleanup.
- Remove unused travis file.
- Fix Makefile license.
- Update README with new link.
- Fix outdated comment.
2017-09-27 00:22:11 -07:00
David Ashpole
76538e77a5 Merge pull request #1754 from bsingr/master
Add memory reservation in prom `/metrics` endpoint.
2017-09-19 15:11:18 -07:00
Jens Bissinger
2599ea6764 Add memory reservation in prom /metrics endpoint. 2017-09-12 19:20:49 +02:00
David Ashpole
a2d3378b8a Merge pull request #1752 from dashpole/fix_builds
apt-get update, not apt-cache update
2017-09-07 11:09:24 -07:00
David Ashpole
d4a4d8c960 fix builds 2017-09-07 10:47:04 -07:00
David Ashpole
2f84d83d9c Merge pull request #1751 from dashpole/fix_builds
clean after apt-get install
2017-09-07 09:20:16 -07:00
David Ashpole
08136d0a3c clean after apt-get install 2017-09-06 16:45:35 -07:00