Net-SNMP agent for NVIDIA GPUs
Peter Colberg ec77d82f9f Write documentation. 8 years ago
src Only #define NV_CTRL_TABLE_CACHE_TIMEOUT if not defined. 8 years ago
COPYING Add GNU General Public License version 2. 9 years ago
README Write documentation. 8 years ago


nvgpu-snmp Net-SNMP agent

nvgpu-snmp is an SNMP agent for `Net-SNMP `_
to gather information about NVIDIA GPUs, for example BIOS and driver versions,
GPU core and ambient temperatures, and GPU and memory clock frequencies.

It supports NVIDIA graphics cards (GeForce/Quadro series) as well as NVIDIA
Tesla computing processors, provided the system is running an X server with the
NVIDIA X display driver. It allows for effortless temperature monitoring in
both small and large GPGPU computing cluster environments.

nvgpu-snmp is licensed under the GNU General Public License version 2 or later.

This documentation is also available at

Getting the source code
nvgpu-snmp is maintained in a git repository. To get the source code, run::

git clone git://

nvgpu-snmp requires a running X server with the NVIDIA X display driver.
It suffices to have the login manager of KDM or GDM running.
To test whether the NV-CONTROL X extension is working, run as root::

export XAUTHORITY=/var/lib/gdm/:0.Xauth
export DISPLAY=:0
nvidia-settings -q gpus


2 GPUs on hal:0

[0] hal:0[gpu:0] (GeForce GTX 280)

[1] hal:0[gpu:1] (GeForce GTX 280)

to list the available GPUs in the system, where the ``XAUTHORITY`` environment
variable contains the path to the authority file of the current X session (in
this case of the GDM login screen).

Note that only one GPU needs to be configured as a screen in the ``xorg.conf``
file, which may also be a headless GPU such as an NVIDIA Tesla device: As long
as you manage to start an X server on any of the available NVIDIA GPUs, you may
query the NV-CONTROL X extension attributes of all of them.

The Net-SNMP, Xext and X11 headers and libraries are required for compilation.

To compile the source code, run::


This will produce the shared object module ````, which may be
installed along with the MIB module ``NV-CTRL-MIB.txt`` using::

make install

Otherwise, if your installation of Net-SNMP does not use the default paths for
shared object modules or MIB modules, copy them manually.

SNMP daemon configuration
Next, configure the SNMP daemon to load the module by including the following
snippet in the snmpd.conf file::

dlmod nvCtrlTable nvgpu-snmp

While you are at it, allow local access to the SNMP daemon for testing purposes::

com2sec server public

Before starting snmpd, the DISPLAY environment variable has to be set::

export DISPLAY=:0

This command may for example be included in the ``/etc/default/snmpd`` or
a similar file.

Granting X server access to the SNMP daemon
The tricky part is to allow the SNMP daemon to access the X server in order to
query the NV-CONTROL X extension.

One possibility is to give the user of the snmpd process read access to the X
authority file of the current X session and manually inform snmpd about its path
by setting the XAUTHORITY environment variable.

A slightly more elegant solution is to use ``xhost +si:localuser:snmp`` in the
current X session to grant the ``snmp`` user access to the X server. For the
example of the GDM login manager, this may be achieved by running as root::

export XAUTHORITY=/var/lib/gdm/:0.Xauth
export DISPLAY=:0
xhost +si:localuser:snmp

If your distribution has a ``/etc/X11/Xsession.d/`` directory to execute arbitrary
scripts upon X session initialization as the session user, placing a script such
as the following into this directory might also work::

# allow snmp daemon to access NV-CONTROL X extension
xhost +si:localuser:snmp


Use ``snmpwalk`` to query the nvgpu-snmp agent module::

snmpwalk localhost nvCtrl


NV-CTRL-MIB::nvCtrlProductName.0 = STRING: GeForce GTX 280
NV-CTRL-MIB::nvCtrlProductName.1 = STRING: GeForce GTX 280
NV-CTRL-MIB::nvCtrlVBiosVersion.0 = STRING: 62.00.0e.00.01
NV-CTRL-MIB::nvCtrlVBiosVersion.1 = STRING: 62.00.0e.00.01
NV-CTRL-MIB::nvCtrlNvidiaDriverVersion.0 = STRING: 185.18.14
NV-CTRL-MIB::nvCtrlNvidiaDriverVersion.1 = STRING: 185.18.14
NV-CTRL-MIB::nvCtrlVersion.0 = STRING: 1.18
NV-CTRL-MIB::nvCtrlVersion.1 = STRING: 1.18
NV-CTRL-MIB::nvCtrlBusType.0 = INTEGER: 2
NV-CTRL-MIB::nvCtrlBusType.1 = INTEGER: 2
NV-CTRL-MIB::nvCtrlBusRate.0 = INTEGER: 16
NV-CTRL-MIB::nvCtrlBusRate.1 = INTEGER: 16
NV-CTRL-MIB::nvCtrlVideoRam.0 = INTEGER: 1048576
NV-CTRL-MIB::nvCtrlVideoRam.1 = INTEGER: 1048576
NV-CTRL-MIB::nvCtrlIrq.0 = INTEGER: 16
NV-CTRL-MIB::nvCtrlIrq.1 = INTEGER: 19
NV-CTRL-MIB::nvCtrlGPUCoreTemp.0 = INTEGER: 49
NV-CTRL-MIB::nvCtrlGPUCoreTemp.1 = INTEGER: 46
NV-CTRL-MIB::nvCtrlGPUCoreThreshold.0 = INTEGER: 190
NV-CTRL-MIB::nvCtrlGPUCoreThreshold.1 = INTEGER: 190
NV-CTRL-MIB::nvCtrlGPUDefaultCoreThreshold.0 = INTEGER: 190
NV-CTRL-MIB::nvCtrlGPUDefaultCoreThreshold.1 = INTEGER: 190
NV-CTRL-MIB::nvCtrlGPUMaxCoreThreshold.0 = INTEGER: 190
NV-CTRL-MIB::nvCtrlGPUMaxCoreThreshold.1 = INTEGER: 190
NV-CTRL-MIB::nvCtrlGPUAmbientTemp.0 = INTEGER: 41
NV-CTRL-MIB::nvCtrlGPUAmbientTemp.1 = INTEGER: 40
NV-CTRL-MIB::nvCtrlGPUOverclockingState.0 = INTEGER: 0
NV-CTRL-MIB::nvCtrlGPUOverclockingState.1 = INTEGER: 0
NV-CTRL-MIB::nvCtrlGPU2DGPUClockFreq.0 = INTEGER: 300
NV-CTRL-MIB::nvCtrlGPU2DGPUClockFreq.1 = INTEGER: 300
NV-CTRL-MIB::nvCtrlGPU2DMemClockFreq.0 = INTEGER: 100
NV-CTRL-MIB::nvCtrlGPU2DMemClockFreq.1 = INTEGER: 100
NV-CTRL-MIB::nvCtrlGPU3DGPUClockFreq.0 = INTEGER: 602
NV-CTRL-MIB::nvCtrlGPU3DGPUClockFreq.1 = INTEGER: 602
NV-CTRL-MIB::nvCtrlGPU3DMemClockFreq.0 = INTEGER: 1107
NV-CTRL-MIB::nvCtrlGPU3DMemClockFreq.1 = INTEGER: 1107
NV-CTRL-MIB::nvCtrlGPUDefault2DGPUClockFreq.0 = INTEGER: 300
NV-CTRL-MIB::nvCtrlGPUDefault2DGPUClockFreq.1 = INTEGER: 300
NV-CTRL-MIB::nvCtrlGPUDefault2DMemClockFreq.0 = INTEGER: 100
NV-CTRL-MIB::nvCtrlGPUDefault2DMemClockFreq.1 = INTEGER: 100
NV-CTRL-MIB::nvCtrlGPUDefault3DGPUClockFreq.0 = INTEGER: 602
NV-CTRL-MIB::nvCtrlGPUDefault3DGPUClockFreq.1 = INTEGER: 602
NV-CTRL-MIB::nvCtrlGPUDefault3DMemClockFreq.0 = INTEGER: 1107
NV-CTRL-MIB::nvCtrlGPUDefault3DMemClockFreq.1 = INTEGER: 1107
NV-CTRL-MIB::nvCtrlGPUCurrentGPUClockFreq.0 = INTEGER: 300
NV-CTRL-MIB::nvCtrlGPUCurrentGPUClockFreq.1 = INTEGER: 300
NV-CTRL-MIB::nvCtrlGPUCurrentMemClockFreq.0 = INTEGER: 100
NV-CTRL-MIB::nvCtrlGPUCurrentMemClockFreq.1 = INTEGER: 100

Note that the module caches queries for 5 seconds by default. If you want to
change the timeout, define ``NV_CTRL_TABLE_CACHE_TIMEOUT`` as the time in
seconds upon compilation.

To debug the nvgpu-snmp agent module, start snmpd with::

DISPLAY=:0 sudo snmpd -u snmp -DnvCtrlTable -f

and query the ``nvCtrl`` table using snmpwalk.

For suggestions or bug reports, mail me: Peter Colberg