Version: v1.7.x

NVIDIA Monitoring

Collect and monitor general performance metrics of NVIDIA GPUs. Monitoring relies on the nvidia-smi command, which is installed together with the NVIDIA GPU driver, so the driver must be installed on the monitored host.
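As a quick pre-check, the sketch below tests whether nvidia-smi can be found on the PATH of the machine where it runs. It is only an illustration: the function name `nvidia_smi_available` and its `lookup` parameter are hypothetical and not part of the monitoring tool.

```python
import shutil


def nvidia_smi_available(lookup=shutil.which):
    """Return True if an executable named nvidia-smi is found on PATH.

    `lookup` is a hypothetical injection point so the check can be
    exercised without a GPU driver installed.
    """
    return lookup("nvidia-smi") is not None


if __name__ == "__main__":
    if nvidia_smi_available():
        print("nvidia-smi found")
    else:
        print("nvidia-smi not found; install the NVIDIA GPU driver first")
```

Run it on the monitored host itself, since that is where the command has to exist.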

Configuration Parameters

| Parameter Name | Description |
| --- | --- |
| Monitoring Host | The IP address (IPv4/IPv6) or domain name of the monitored endpoint. ⚠️ Do not include a protocol header (e.g., https://, http://). |
| Task Name | The name identifying this monitoring task. It must be unique. |
| Port | The SSH port of the Linux host. The default is 22. |
| Username | SSH connection username, optional. |
| Password | SSH connection password, optional. |
| Collection Interval | Interval for periodically collecting monitoring data, in seconds. The minimum interval is 30 seconds. |
| Probe Before Monitoring | Whether to probe the monitored endpoint for availability before adding it. The monitoring task is added or modified only if the probe succeeds. |
| Description/Remarks | Additional notes and descriptions for this monitoring task. |
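The constraints in the parameter list (no protocol header in the host, a minimum interval of 30 seconds) can be checked before a task is submitted. The following is a minimal sketch under stated assumptions: `validate_task` and its argument names are hypothetical, and the port range check is a general TCP rule rather than something this page specifies.

```python
from __future__ import annotations

MIN_INTERVAL_SECONDS = 30  # minimum collection interval stated above


def validate_task(host: str, port: int, interval: int) -> list[str]:
    """Return a list of problems found in the given settings (empty if none)."""
    problems = []
    if not host.strip():
        problems.append("host must not be empty")
    if "://" in host:
        problems.append("host must not include a protocol header such as http://")
    # Assumption: any valid TCP port is acceptable; 22 is the documented default.
    if not 1 <= port <= 65535:
        problems.append("port must be between 1 and 65535")
    if interval < MIN_INTERVAL_SECONDS:
        problems.append(f"interval must be at least {MIN_INTERVAL_SECONDS} seconds")
    return problems


print(validate_task("https://gpu-host", 22, 10))
```

In this example both the host and the interval are rejected, so the returned list contains two messages.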

Collected Metrics

Metric Set: basic

| Metric Name | Unit | Description |
| --- | --- | --- |
| index | None | GPU index |
| name | None | GPU name |
| utilization.gpu[%] | % | GPU utilization |
| utilization.memory[%] | % | Memory utilization |
| memory.total[MiB] | MiB | Total memory |
| memory.used[MiB] | MiB | Used memory |
| memory.free[MiB] | MiB | Free memory |
| temperature.gpu | °C | GPU temperature |
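The metric names above match the query fields that nvidia-smi accepts in its `--query-gpu` option with `--format=csv`. The sketch below shows one way such CSV output could be parsed into one dictionary per GPU. It is an assumption that the collector issues a command like `QUERY_COMMAND`; the helper `parse_gpu_csv` is hypothetical, and `SAMPLE` is made-up illustrative text, not real measurement data.

```python
from __future__ import annotations

import csv
import io

FIELDS = [
    "index", "name", "utilization.gpu", "utilization.memory",
    "memory.total", "memory.used", "memory.free", "temperature.gpu",
]

# Assumed command line; shown for reference and not executed here.
QUERY_COMMAND = ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS), "--format=csv"]


def parse_gpu_csv(text: str) -> list[dict[str, str]]:
    """Parse CSV text (header row plus one row per GPU) into dictionaries."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = [row for row in reader if row]
    if not rows:
        return []
    header, *data = rows
    # Remove spaces so that keys read like "memory.used[MiB]", as in the table.
    keys = [h.replace(" ", "") for h in header]
    return [dict(zip(keys, (value.strip() for value in row))) for row in data]


# Illustrative sample only.
SAMPLE = """index, name, utilization.gpu [%], utilization.memory [%], memory.total [MiB], memory.used [MiB], memory.free [MiB], temperature.gpu
0, Example GPU, 12 %, 3 %, 16384 MiB, 1024 MiB, 15360 MiB, 41
"""

gpus = parse_gpu_csv(SAMPLE)
print(gpus[0]["memory.used[MiB]"])  # prints "1024 MiB"
```

Values keep their unit suffix as text. Converting them to numbers, for example by splitting on the space, is left to the caller.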