David Ashpole
c3090a95c7
Merge pull request #1794 from kant/patch-1
...
Formatting change for documentation
2017-11-07 09:25:15 -08:00
Darío Hereñú
9d7cd18598
Minor proposal
2017-11-07 00:19:17 -03:00
Euan Kemp
1ecd24ea8d
libcontainer: Use first cgroup subsystem found (#1792)
2017-11-06 15:33:59 -08:00
David Ashpole
3d2e7fcfa3
Merge pull request #1791 from dashpole/changelog_v0.28.0
...
v0.28.0 changelog
2017-11-06 15:05:29 -08:00
David Ashpole
1f3f49af11
v0.28.0 changelog
2017-11-06 15:00:10 -08:00
David Ashpole
9bc6590461
Merge pull request #1762 from mindprince/gpu-metrics-1436
...
Add per container GPU metrics
2017-11-06 12:34:13 -08:00
Rohit Agarwal
4a35130019
Collect container-level GPU metrics using NVML.
...
When cAdvisor starts up, it reads the `vendor` files in
`/sys/bus/pci/devices/*` to see if any NVIDIA devices (vendor ID: 0x10de) are
attached to the node. If no NVIDIA devices are found, this code path stays
dormant for the rest of cAdvisor's lifetime. If NVIDIA devices are found,
it starts a goroutine that checks for the presence of NVML by trying
to dynamically load it at regular intervals. The check has to be repeated
instead of being done just once because cAdvisor may be
started before the NVIDIA drivers and NVML are installed. Once the NVML
dynamic loading succeeds, it uses NVML's query methods to find out how
many devices exist on the node, creates a map from their minor numbers to
their handles, and caches that map. The goroutine exits at this point.
If NVML was detected in the previous step, then whenever cAdvisor detects a
new container, it reads the `devices.list` file
from the container's devices cgroup. The `devices.list` file lists the
major:minor numbers of all the devices that the container is allowed to access.
If any device has major number 195 (which is the major number assigned
to NVIDIA devices), it caches the list of corresponding minor numbers for
that container.
During every housekeeping operation, in addition to collecting all the existing
metrics, it uses the cached NVIDIA device minor numbers and the map from
minor numbers to device handles to get metrics for GPU devices attached to the
container.
2017-11-06 11:54:59 -08:00
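The `devices.list` step described in the commit above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from the commit: the helper name `parseNvidiaMinors` is hypothetical, the input is passed as a string rather than read from the cgroup filesystem, and the sketch assumes the usual `type major:minor access` entry layout, skipping wildcard entries.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// nvidiaMajor is the device major number the commit message gives for
// NVIDIA devices.
const nvidiaMajor = 195

// parseNvidiaMinors is a hypothetical helper. It scans the text of a
// devices.list file and returns the minor numbers of every entry whose
// major number is 195. Entries with wildcards ("*") are skipped.
func parseNvidiaMinors(devicesList string) []int {
	var minors []int
	scanner := bufio.NewScanner(strings.NewReader(devicesList))
	for scanner.Scan() {
		// Assumed entry layout: "<type> <major>:<minor> <access>",
		// for example "c 195:0 rwm".
		fields := strings.Fields(scanner.Text())
		if len(fields) != 3 {
			continue
		}
		parts := strings.SplitN(fields[1], ":", 2)
		if len(parts) != 2 {
			continue
		}
		major, err := strconv.Atoi(parts[0])
		if err != nil || major != nvidiaMajor {
			continue // not an NVIDIA device, or a wildcard major
		}
		minor, err := strconv.Atoi(parts[1])
		if err != nil {
			continue // wildcard minor; ignored in this sketch
		}
		minors = append(minors, minor)
	}
	return minors
}

func main() {
	sample := "c 1:3 rwm\nc 195:0 rw\nb 8:0 r\nc 195:1 rwm\n"
	fmt.Println(parseNvidiaMinors(sample))
}
```

The returned minor numbers are what would be cached per container and later looked up in the minor-number-to-handle map during housekeeping.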
Rohit Agarwal
318f28bef6
Vendor Go bindings for NVML. Don't build a static binary.
...
We can't build a static binary because that would require bundling the
closed source NVML library in cAdvisor.
Instead, gonvml uses dlopen to dynamically load NVML.
2017-11-01 14:41:35 -07:00
Rohit Agarwal
126fb2232e
Add accelerator metrics to the API.
...
The structure is generic to support most hardware accelerators like
GPUs, TPUs etc.
Note that the prometheus label for id is called acc_id, so that it
doesn't conflict with some other label that may be called id.
2017-11-01 14:41:35 -07:00
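To illustrate the labeling point in the commit above, here is a small Go sketch of a vendor-neutral accelerator record and a label builder that stores the device ID under `acc_id`. The struct fields, the `make` and `model` label names, and the `acceleratorLabels` helper are assumptions made for illustration; only the `acc_id` label name comes from the commit message.

```go
package main

import "fmt"

// AcceleratorStats is a hypothetical, vendor-neutral record. The field
// names are illustrative and are not taken from the commit.
type AcceleratorStats struct {
	Make        string // vendor, e.g. "nvidia"
	Model       string // device model name
	ID          string // device identifier
	MemoryTotal uint64 // bytes
	MemoryUsed  uint64 // bytes
	DutyCycle   uint64 // percent of time the device was busy
}

// acceleratorLabels copies the container's existing labels and adds the
// accelerator's own. The device ID goes under "acc_id" so it cannot
// overwrite a container label that is already called "id".
func acceleratorLabels(base map[string]string, s AcceleratorStats) map[string]string {
	labels := make(map[string]string, len(base)+3)
	for k, v := range base {
		labels[k] = v
	}
	labels["make"] = s.Make
	labels["model"] = s.Model
	labels["acc_id"] = s.ID
	return labels
}

func main() {
	base := map[string]string{"id": "/docker/abc"}
	gpu := AcceleratorStats{Make: "nvidia", Model: "tesla-k80", ID: "GPU-0"}
	labels := acceleratorLabels(base, gpu)
	fmt.Println(labels["id"], labels["acc_id"])
}
```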
David Ashpole
31694e6e1e
Merge pull request #1786 from tklauser/utsname-x-sys-unix
...
Simplify Utsname string conversion
2017-10-31 10:46:13 -07:00
Tobias Klauser
24493b8458
Simplify Utsname string conversion
...
Use Utsname from golang.org/x/sys/unix, whose members are byte arrays
instead of int8/uint8 arrays. This simplifies the string
conversions of these members.
2017-10-31 12:14:10 +01:00
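The simplification described in the commit above can be sketched as follows. With byte-array members, a NUL-padded field converts to a string by slicing at the first zero byte, with no per-element cast from int8. The helper name `utsField` is hypothetical, and a plain `[65]byte` stands in for a `unix.Utsname` field so the sketch has no external dependency.

```go
package main

import (
	"bytes"
	"fmt"
)

// utsField is a hypothetical helper that converts a NUL-padded,
// fixed-size byte field into a Go string. If no NUL byte is present,
// the whole buffer is used.
func utsField(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		return string(b[:i])
	}
	return string(b)
}

func main() {
	// Stand-in for a field such as Utsname.Release on Linux.
	var release [65]byte
	copy(release[:], "4.13.0-16-generic")
	fmt.Println(utsField(release[:]))
}
```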
Tobias Klauser
e5135c223d
Update golang.org/x/sys dependency
2017-10-31 12:13:43 +01:00
David Ashpole
6d3841c68a
Merge pull request #1784 from majst01/misspelled-errors
...
Fix wrong error checking in fsHandler.go
2017-10-30 10:02:08 -07:00
David Ashpole
bfb7e13720
Merge pull request #1785 from sjenning/fix-msg
...
fix long du duration message
2017-10-30 09:57:31 -07:00
Seth Jennings
cc77f13a2b
fix long du message
2017-10-30 10:00:47 -05:00
Stefan Majer
d5e2ffbef7
Fix wrong error checking in fsHandler.go
2017-10-30 07:44:57 +01:00
David Ashpole
eaa3532918
Merge pull request #1755 from bakins/update-zfs
...
Update go-zfs dependency
2017-10-28 12:38:18 -07:00
Brian Akins
752f2cd121
Update go-zfs dependency
2017-10-28 09:34:43 -04:00
David Ashpole
53820123e6
Merge pull request #1336 from ronnielai/test
...
Don't rely on the returned value when there's an error
2017-10-24 15:56:55 -07:00
David Ashpole
479e4a97ed
Merge pull request #1780 from majst01/master
...
Fix Memory leak in fs.go
2017-10-24 11:20:52 -07:00
Stefan Majer
4a778288ee
Stop AfterFunc timer after findCmd.Wait regardless of errors to prevent memory leak
2017-10-24 15:18:07 +02:00
David Ashpole
99716b05db
Merge pull request #1771 from runcom/fix-overlay2-crio
...
CRI-O: fix handling of overlay2 storage
2017-10-23 10:16:02 -07:00
Antonio Murdaca
088aaf1a32
CRI-O: fix handling of overlay2 storage
...
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-10-21 12:05:40 +02:00
David Ashpole
65f8fdd877
Merge pull request #1769 from euank/ignore-nil-cpustats
...
libcontainer: ignore nil cpustats
2017-10-20 13:38:22 -07:00
Euan Kemp
587691c7f3
libcontainer: ignore nil cpustats
...
cAdvisor can watch for new cgroups with inotify. This makes it race
fairly tightly with cgroup creation, so tightly that sometimes
cpustats are nil.
The runc library code we call
(https://github.com/opencontainers/runc/blob/v1.0.0-rc4/libcontainer/cgroups/fs/apply_raw.go#L179-L182)
doesn't actually consider this an error, so we have to handle that
scenario ourselves.
This fixes https://github.com/google/cadvisor/issues/1765
2017-10-20 13:08:23 -07:00
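The kind of guard the commit above describes might look like the following sketch. The types and the `totalCPUUsage` helper are hypothetical stand-ins, not the runc or cAdvisor types; the point is only that a stats value read during cgroup creation may have nil CPU data, and the caller has to check for that itself.

```go
package main

import "fmt"

// cpuStats and cgroupStats are hypothetical stand-ins for the stats
// structures returned by the container library.
type cpuStats struct {
	TotalUsageNanos uint64
}

type cgroupStats struct {
	CPU *cpuStats // may be nil if the cgroup was only just created
}

// totalCPUUsage returns the CPU usage and true when it is available, or
// 0 and false when the stats (or their CPU part) are nil.
func totalCPUUsage(s *cgroupStats) (uint64, bool) {
	if s == nil || s.CPU == nil {
		return 0, false
	}
	return s.CPU.TotalUsageNanos, true
}

func main() {
	fresh := &cgroupStats{} // raced with cgroup creation
	if _, ok := totalCPUUsage(fresh); !ok {
		fmt.Println("cpu stats not available yet, skipping")
	}

	ready := &cgroupStats{CPU: &cpuStats{TotalUsageNanos: 42}}
	usage, _ := totalCPUUsage(ready)
	fmt.Println("usage:", usage)
}
```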
Antonio Murdaca
54b661236d
container: crio: change crio socket
...
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-10-19 00:17:18 +02:00
Derek Carr
90bb0524fe
Merge pull request #1770 from dashpole/fix_overlay2
...
Monitor diff directory for overlay2
2017-10-18 11:39:09 -04:00
David Ashpole
b7959da460
monitor diff directory for overlay2
2017-10-11 14:13:42 -07:00
David Ashpole
790b787399
Merge pull request #1768 from derekwaynecarr/max_usage_in_bytes
...
Container stats expose memory max usage
2017-10-10 14:53:11 -07:00
Derek Carr
9ea61176bf
Expose memory.max_usage_in_bytes in container stats
2017-10-10 17:31:31 -04:00
Derek Carr
3e659ec0d1
Merge pull request #1766 from sjenning/adapt-long-du
...
adaptive longOp for du operation
2017-10-05 10:36:37 -04:00
Seth Jennings
fd9c6d2fde
adaptive longOp for du operation
2017-10-05 09:22:54 -05:00
David Ashpole
b9ab5d6ba9
Merge pull request #1756 from dashpole/remove_engine_api
...
Move off of docker/engine-api
2017-09-28 14:37:22 -07:00
David Ashpole
888a529088
fix #1743; move off of docker/engine-api
2017-09-28 11:05:13 -07:00
David Ashpole
f1c4a432ac
Merge pull request #1760 from dashpole/off_inotify
...
Move from inotify to fsnotify
2017-09-28 11:04:52 -07:00
David Ashpole
e6b6a1ac57
fix #1708; move from inotify to fsnotify
2017-09-28 10:57:49 -07:00
David Ashpole
8a59b6d8cf
Merge pull request #1759 from dashpole/clarify_memory_usage
...
Update memory usage helper message
2017-09-28 10:57:26 -07:00
David Ashpole
1dcd0cee2b
update description of memory usage
2017-09-28 10:48:07 -07:00
David Ashpole
ba91527651
Merge pull request #1763 from mindprince/cleanup
...
Minor cleanup.
2017-09-27 11:30:53 -07:00
Rohit Agarwal
919b369923
Minor cleanup.
...
- Remove unused travis file.
- Fix Makefile license.
- Update README with new link.
- Fix outdated comment.
2017-09-27 00:22:11 -07:00
David Ashpole
76538e77a5
Merge pull request #1754 from bsingr/master
...
Add memory reservation in prom `/metrics` endpoint.
2017-09-19 15:11:18 -07:00
Jens Bissinger
2599ea6764
Add memory reservation in prom /metrics endpoint.
2017-09-12 19:20:49 +02:00
David Ashpole
a2d3378b8a
Merge pull request #1752 from dashpole/fix_builds
...
apt-get update, not apt-cache update
2017-09-07 11:09:24 -07:00
David Ashpole
d4a4d8c960
fix builds
2017-09-07 10:47:04 -07:00
David Ashpole
2f84d83d9c
Merge pull request #1751 from dashpole/fix_builds
...
clean after apt-get install
2017-09-07 09:20:16 -07:00
David Ashpole
08136d0a3c
clean after apt-get install
2017-09-06 16:45:35 -07:00
David Ashpole
bed4ed57dc
Merge pull request #1750 from dashpole/patch_release
...
0.27.1 changelog update
2017-09-06 16:20:25 -07:00
David Ashpole
debd14b35a
0.27.1 changelog
2017-09-06 16:09:30 -07:00
David Ashpole
32f77e9f14
Merge pull request #1749 from dashpole/fix_builds
...
Add apt-get update to dockerfile
2017-09-06 16:07:15 -07:00
David Ashpole
a996ef85a4
fix kubernetes/test-infra #4403
2017-09-06 15:47:58 -07:00