The latest Kubernetes deployment on Arm64 VMs always fails
because k8s always gets num_cores=0 from cAdvisor on Arm64 VMs.
The reason is that there is no cache info on Arm64 VMs.
The good news is that we can get cache info on Arm64 hosts.
Once this patch is merged, I will deliver a patch to update the version
of cAdvisor in Kubernetes as soon as possible.
Signed-off-by: bblu <bin.lu@arm.com>
* Move the sysfs-related functions needed to get node information, and their tests, into utils/sysfs
* Add tests for sysfs-related functions
Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
This commit fixes the bug in #2232 where cadvisor was not able to detect
the cloud provider if it's running on a custom AMI derived from
Amazon Linux 2.
It does so by checking /etc/os-release. However, from what I've read,
/etc/os-release is pretty much a systemd thing. Although Amazon Linux 2
comes with systemd, cadvisor cannot assume the existence of systemd in
other AMIs / OSes, so we only check /etc/os-release if all other
methods fail.
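
For illustration only, the os-release check described above could look
roughly like the sketch below. This is not the code from the patch:
parseOSRelease and isAmazonLinux are hypothetical helper names, and the
sketch assumes that Amazon Linux identifies itself with ID="amzn" in
/etc/os-release. In a real fallback, the file contents would be read from
disk only after the other detection methods have failed.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseOSRelease parses the KEY=value lines of an os-release file into a map.
// Blank lines and comment lines are skipped; surrounding quotes are trimmed.
func parseOSRelease(content string) map[string]string {
	fields := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue
		}
		fields[key] = strings.Trim(value, `"'`)
	}
	return fields
}

// isAmazonLinux reports whether the os-release content identifies Amazon Linux.
// Assumption (not taken from the patch): Amazon Linux sets ID="amzn".
func isAmazonLinux(content string) bool {
	return parseOSRelease(content)["ID"] == "amzn"
}

func main() {
	// Sample content standing in for /etc/os-release on an Amazon Linux 2 AMI.
	sample := "NAME=\"Amazon Linux\"\nVERSION=\"2\"\nID=\"amzn\"\n"
	fmt.Println("Amazon Linux detected:", isAmazonLinux(sample))
}
```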
context: kubernetes/kubernetes#68478
The inotify code was removed from golang.org/x/exp several years ago. Therefore
importing it from that path prevents downstream consumers from using any module
that makes use of more recent features of golang.org/x/exp.
Given that this code is by definition frozen and that the long-term path should
be to migrate to fsnotify, replacing the current code with an identical standalone
copy adds no maintenance cost and unblocks downstream work, for example in
Kubernetes.
The oomparser logic would end up stuck, unable to detect the end of a
given oom trace, for any process with a name that didn't match \w+.
This includes processes like 'python3.4' due to the '.', or
'docker-containerd' due to the '-'.
This fix was included in PR #1544 last year, but since that PR seems
dead, it seems like a good idea to break this more important fix out.
I've updated the tests so that they would have caught this issue.
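
A small sketch of why \w+ misses these names. The two patterns and the
sample log lines below are illustrative assumptions, not the exact
expressions or kernel output used by oomparser: the point is only that \w
matches letters, digits and underscore, so a name containing '.' or '-'
never matches, while a pattern that accepts any character up to the closing
parenthesis does.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns for illustration; not the ones in oomparser.
var (
	// strictRe only accepts process names made of [0-9A-Za-z_].
	strictRe = regexp.MustCompile(`Killed process (\d+) \((\w+)\)`)
	// relaxedRe accepts any process name up to the closing parenthesis.
	relaxedRe = regexp.MustCompile(`Killed process (\d+) \(([^)]+)\)`)
)

// processName returns the captured process name, or "" if the line does not match.
func processName(re *regexp.Regexp, line string) string {
	m := re.FindStringSubmatch(line)
	if m == nil {
		return ""
	}
	return m[2]
}

func main() {
	lines := []string{
		"Killed process 100 (bash) total-vm:1024kB",
		"Killed process 200 (python3.4) total-vm:1024kB",
		"Killed process 300 (docker-containerd) total-vm:1024kB",
	}
	for _, l := range lines {
		fmt.Printf("strict=%q relaxed=%q\n", processName(strictRe, l), processName(relaxedRe, l))
	}
}
```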
IN_ATTRIB inotify events are generated when atime / mtime changes,
which would cause the tail to be reset and the same log to be reread
(generating duplicate events). Instead, watch the directory for
file delete / move.
Also, use an exponential backoff when retrying to open the file.