Earlier, if the NVIDIA driver was not installed when cAdvisor was started, we would start a goroutine to try to initialize NVML every minute. This resulted in a race. We can have a situation where:

- the goroutine tries to initialize NVML but fails, so it sleeps for a minute.
- the driver is installed.
- a container that uses NVIDIA devices is started.

This container would not get GPU stats because a minute has not passed since the last failed initialization attempt, and so NVML is not initialized.

| Name |
|---|
| nvidia_test.go |
| nvidia.go |
| types.go |