Some useful GPU commands

Common commands

  • nvidia-smi

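Prints a formatted summary table with the current state of each GPU (driver and CUDA version, temperature, power draw, memory usage, utilization) and the processes that are using it.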

Mon May 30 11:28:03 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:21:01.0 Off |                    0 |
| N/A   31C    P0    23W / 250W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
WARNING: infoROM is corrupted at gpu 0000:21:01.0
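
The summary table is easy to read but awkward to parse in scripts. nvidia-smi also has a CSV query mode (`--query-gpu` together with `--format=csv,noheader,nounits`) that is better suited to that. The following is only a minimal sketch: the helper name `gpu_mem_percent` is made up for this example, and it assumes the CSV fields arrive in the order index, memory.used, memory.total.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): reads CSV lines of the
# form "index, memory.used, memory.total" on stdin and prints the memory
# usage of each GPU as an integer percentage.
gpu_mem_percent() {
    awk -F', *' '$3 > 0 { printf "GPU %s: %d%% memory used\n", $1, ($2 * 100) / $3 }'
}

# Query the real GPUs only when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,memory.used,memory.total \
               --format=csv,noheader,nounits | gpu_mem_percent
fi
```

Because the helper only reads stdin, it can be tried without a GPU, for example with `printf '0, 0, 32510\n' | gpu_mem_percent`.
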
  • nvidia-smi -L

Lists every GPU in the machine with its model name and UUID.

GPU 0: Tesla V100-PCIE-32GB (UUID: GPU-854583bb-xxxxx-4b07-4571-xxxxxxxxxx)
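
If a script needs the UUIDs, they can be pulled out of the `nvidia-smi -L` output with sed. This is a rough sketch, not an official interface: the helper name `gpu_uuids` is invented here, and it assumes each line has the "GPU N: name (UUID: ...)" shape shown above.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): reads `nvidia-smi -L`
# output on stdin and prints "index uuid" for every GPU line it recognizes.
gpu_uuids() {
    sed -n 's/^GPU \([0-9][0-9]*\): .*(UUID: \(.*\))$/\1 \2/p'
}

# Query the real GPUs only when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L | gpu_uuids
fi
```

A UUID obtained this way can be used to pin work to one specific card, for instance through the `CUDA_VISIBLE_DEVICES` environment variable, which accepts GPU UUIDs as well as indices.
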
  • nvidia-smi -a

Shows detailed information for each GPU card.
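
The detailed report is long, so it is often handy to pick out a single field. The sketch below assumes the report uses indented "Key : Value" lines and that a field called "Product Name" is present, which is what this output typically looks like; the helper name `smi_field` is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): reads "Key : Value"
# lines on stdin and prints the value of every line whose key equals $1.
# Values that themselves contain " : " would be truncated by this approach.
smi_field() {
    awk -v key="$1" -F' +: ' '
        { name = $1; sub(/^ +/, "", name) }   # strip leading indentation
        name == key { print $2 }
    '
}

# Query the real GPUs only when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -a | smi_field "Product Name"
fi
```
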

  • lspci

If the lspci command is not available, install the pciutils package (shown here for yum-based distributions):

yum install pciutils
  • lspci | grep -i nvidia

Lists the NVIDIA GPU devices present on the current machine.

21:01.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 PCIe 32GB] (rev a1)
  • lspci -v -s 21:01.0

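The value after -s is the card's PCI address (the first column of the lspci output, here 21:01.0); with -v, lspci prints verbose details for that device.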
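
The two lspci steps can be chained so that the PCI address does not have to be copied by hand. This is a minimal sketch under the assumption that the address is the first whitespace-separated field of each lspci line; the helper name `nvidia_pci_addrs` is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): reads `lspci` output
# on stdin and prints the PCI address of every line that mentions NVIDIA.
# Non-GPU NVIDIA functions (e.g. an audio controller) would match as well.
nvidia_pci_addrs() {
    awk 'tolower($0) ~ /nvidia/ { print $1 }'
}

# Show verbose details for each NVIDIA device, if lspci is installed.
if command -v lspci >/dev/null 2>&1; then
    for addr in $(lspci | nvidia_pci_addrs); do
        lspci -v -s "$addr"
    done
fi
```
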

  • Other

If you still cannot find the card's details, you can look them up here: click here