Commit Graph

2917 Commits

Author SHA1 Message Date
David Ashpole
65fa5b44d3
Merge pull request #2613 from dashpole/changelog-v0.37.0
changelog for v0.37.0
2020-07-07 17:22:28 -07:00
David Ashpole
4163b9759d changelog for v0.37.0 2020-07-07 17:12:38 -07:00
David Ashpole
7fe71443dc
Merge pull request #2607 from katarzyna-z/kk-fix-metrics-list
Improve information about metrics
2020-07-07 09:14:54 -07:00
David Ashpole
0587e3d173
Merge pull request #2603 from dqminh/prometheus-on-demand
Allow on-demand metrics collection for prometheus
2020-07-06 11:03:00 -07:00
David Ashpole
9ec2495af9
Merge pull request #2610 from iwankgb/golangci-lint-upgrade
Upgrading golangci-lint to 1.28.0
2020-07-06 08:30:20 -07:00
Maciej "Iwan" Iwanowski
8ba056c4df
Upgrading golangci-lint to 1.28.0
Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@critical.today>
2020-07-04 23:33:50 +02:00
Daniel Dao
20e306ab03
Allow on-demand metrics collection for prometheus
This allows the user to set v2.RequestOptions for PrometheusCollector to be
used when retrieving container stats. As the prometheus collector API doesn't
allow us to pass customized options per collection, we create a new
collector per metrics request. This approach is also seen in
node_exporter (https://github.com/prometheus/node_exporter#filtering-enabled-collectors)

One can design on-demand metrics collection as follows to minimize CPU
usage in the housekeeping routine:

- assuming the prometheus scrape interval is 60s
- increase the housekeeping interval to a high value such as 300s
- set the max_age parameter in the scrape to a low value (such as 1s), so
that if we scrape just as housekeeping finishes, we don't do any extra
work. When we scrape, if the current metrics are older than 1s, we will
collect the metrics again.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2020-07-01 17:31:26 +01:00
David Ashpole
59f894acbd
Merge pull request #2605 from nightah/update-prom-client-golang
Update prometheus/client_golang to v1.7.1
2020-06-30 10:01:40 -07:00
Katarzyna Kujawa
2b023c6d02 Add information about build flags required for metrics
Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
2020-06-30 10:17:43 +02:00
Katarzyna Kujawa
9b44e0bd71 Corrected information about perf metrics
Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
2020-06-30 09:11:08 +02:00
Amir Zarrinkafsh
f68797e965
Update prometheus/client_golang to v1.7.1 2020-06-30 10:10:06 +10:00
David Ashpole
e26cd1220e
Merge pull request #2601 from katarzyna-z/kk-libpfm4-config
Add information about configuring perf events using libpfm4
2020-06-26 14:08:33 -07:00
David Ashpole
c4f167a6ec
Merge pull request #2598 from Creatone/creatone/disable-metrics-docs
Add "-disable_metrics" column to prometheus metrics table.
2020-06-26 14:07:52 -07:00
Katarzyna Kujawa
0c017c2102 Add information about configuring perf events using libpfm4
Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
2020-06-26 14:46:16 +02:00
Paweł Szulik
dc0aa9b279 Ignore Resctrl metrics by default.
Signed-off-by: Paweł Szulik <pawel.szulik@intel.com>
2020-06-26 11:40:18 +02:00
Paweł Szulik
f1f1bc77d9 Add "-disable_metrics" column to prometheus metrics table.
Signed-off-by: Paweł Szulik <pawel.szulik@intel.com>
2020-06-26 11:40:18 +02:00
David Ashpole
6f30891d89
Merge pull request #2600 from dqminh/raw-container-process-stat
Allow raw container to retrieve process stats
2020-06-25 16:19:30 -07:00
Daniel Dao
8fdcc6d0ee
Allow raw container to retrieve process stats
Raw containers (such as systemd services) don't have a main PID, so
before this change they weren't allowed to retrieve various stats from
the cgroup, such as networking stats.

However, some process stats, such as file descriptor counts or the number of
processes, are still valuable for those containers, and we should be able
to retrieve them without depending on the main PID.

For example, on my laptop:

Before:
```
container_processes{id="/system.slice/containerd.service"} 0 1593123685900
```

After:
```
container_processes{id="/system.slice/containerd.service"} 4 1593123707235
```
2020-06-25 23:26:42 +01:00
David Ashpole
8851fa6b9d
Merge pull request #2599 from dqminh/jquery-fix
Update some jquery references to 3.5.1
2020-06-25 15:21:55 -07:00
Daniel Dao
9223ccddb1
Update some jquery references to 3.5.1
Commit 70bfdcb195 updated the jquery version
to 3.5.1; however, there are still references to 3.0.0 in some places,
so the UI doesn't render properly.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2020-06-25 23:07:16 +01:00
David Ashpole
7e94078940
Merge pull request #2597 from katarzyna-z/kk-fix-dockerfile
Remove redundant installation of libpfm4
2020-06-25 09:49:15 -07:00
David Ashpole
922aa83feb
Merge pull request #2596 from katarzyna-z/kk-logs-resctrl
Verbosity level for resctrl and perf logs
2020-06-25 09:47:21 -07:00
Katarzyna Kujawa
5d5a79bf29 Remove redundant installation of libpfm4
Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
2020-06-25 13:35:50 +02:00
Katarzyna Kujawa
f0721fff43 Add checking if resctrl path exists
Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
2020-06-25 05:01:36 +02:00
Katarzyna Kujawa
a8139dcf2e Add verbosity level to perf Info logs
Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
2020-06-24 11:32:18 +02:00
Katarzyna Kujawa
b74cdd2214 Change warning for resctrl path to Info with verbosity level
Change info log to Info with verbosity level
Change imports grouping

Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
2020-06-24 11:08:56 +02:00
David Ashpole
8450c56c21
Merge pull request #2525 from Creatone/creatone/perf-uncore
Add perf uncore events support.
2020-06-23 10:14:04 -07:00
Katarzyna Kujawa
3fcc88c533
Add stats to InfluxDB storage (#2593)
* Fix unit tests for InfluxDB; Add stats to InfluxDB storage
- memory stats
- hugetlb stats
- perf stats
- resctrl stats
- referenced memory

Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
2020-06-23 09:56:33 -07:00
Paweł Szulik
5641a0feae Add perf uncore events
Signed-off-by: Paweł Szulik <pawel.szulik@intel.com>
2020-06-23 18:56:28 +02:00
Paweł Szulik
0ac6b77bee
Add Resctrl metrics. (#2563)
* Add Resctrl metrics.

Signed-off-by: Paweł Szulik <pawel.szulik@intel.com>
2020-06-19 14:11:24 -07:00
David Ashpole
68ab079485
Merge pull request #2587 from katarzyna-z/kk-fix-memory-numa-stats
Fix #2583, update opencontainers/runc
2020-06-16 12:35:50 -07:00
David Ashpole
024aa97c6e
Merge pull request #2588 from Creatone/creatone/perf-cgroupv2
Add cgroups v2 support for perf events.
2020-06-16 09:43:26 -07:00
Paweł Szulik
dafbfa54dc Add cgroups v2 support for perf events.
Signed-off-by: Paweł Szulik <pawel.szulik@intel.com>
2020-06-16 13:18:04 +02:00
Katarzyna Kujawa
26fba038b3 Fix #2583, update opencontainers/runc
Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
2020-06-16 10:34:19 +02:00
David Ashpole
71fa79d18e
Merge pull request #2586 from hakman/fix-imagef-detection
Fix incorrect detection of images path
2020-06-15 10:00:51 -07:00
Ciprian Hacman
4d9315d31b Fix incorrect detection of images path 2020-06-13 09:29:50 +03:00
David Ashpole
8ddc989fec
Merge pull request #2585 from dashpole/update_jquery
update to jquery 3.5.1
2020-06-12 10:02:47 -07:00
David Ashpole
70bfdcb195 update to jquery 3.5.1 2020-06-12 09:32:53 -07:00
David Ashpole
4059641aa7
Merge pull request #2579 from iwankgb/skip_offline_cpus
Fixed topology with offline CPUs on x86
2020-06-11 08:53:56 -07:00
Maciej "Iwan" Iwanowski
29a53c9373
online file might not exist on x86 for cpu0
Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@critical.today>
2020-06-10 22:48:54 +02:00
Maciej "Iwan" Iwanowski
a948687621
Fixing CPU count on ARM along with TestPhysicalCoresReadingFromCpuBus test
Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@critical.today>
2020-06-10 22:48:54 +02:00
Maciej "Iwan" Iwanowski
fced3c1490
Fixing code style violations
Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@critical.today>
2020-06-10 22:48:54 +02:00
Maciej "Iwan" Iwanowski
bc76b661b9
Fixed topology with offline CPUs on x86
Signed-off-by: Maciej "Iwan" Iwanowski <maciej.iwanowski@critical.today>
2020-06-10 22:48:54 +02:00
David Ashpole
bb5fbc9748
Merge pull request #2581 from ondrejsika/bump-cadvisor-version-in-readme-quickstart
Bump cAdvisor version in README Quickstart
2020-06-10 10:09:31 -07:00
Ondrej Sika
d37c6eeb5b
Bump cAdvisor version in README Quickstart 2020-06-10 07:28:25 +02:00
David Ashpole
1098996665
Merge pull request #2567 from katarzyna-z/kk-fix-core-id
Fix #2566 warning instead of error when core_id or physical_package_id not available
2020-06-09 10:46:19 -07:00
David Ashpole
196b510b52
Merge pull request #2574 from RenaudWasTaken/nvidia
Return a NoopManager if metricset does not contain the accelerator value
2020-06-08 17:34:53 -07:00
David Ashpole
cdaec26f70
Merge pull request #2575 from harche/dm_fix
Fix incorrect diskstats for dm devices
2020-06-08 09:26:49 -07:00
David Ashpole
bd8c4d13f8
Merge pull request #2576 from iwankgb/docker_build_on_arm
Using multistage Dockerfile and upgrading Alpine to 3.12
2020-06-08 08:59:12 -07:00
Harshal Patil
3da5347947 Fix incorrect diskstats for dm devices
Signed-off-by: Harshal Patil <harpatil@redhat.com>
2020-06-08 12:13:47 +05:30