= Verification =
$ cat /proc/version
Linux version 6.5.0-27-generic (buildd@lcy02-amd64-059) (x86_64-linux-gnu-gcc-13 (Ubuntu 13.2.0-4ubuntu3) 13.2.0, GNU ld (GNU Binutils for Ubuntu) 2.41) #28-Ubuntu SMP PREEMPT_DYNAMIC Thu Mar 7 18:21:00 UTC 2024
ubuntu@ubuntu:~/autotest-client-tests/ubuntu_performance_gpudirect_rdma/nvidia-peermem-test$ ./nvidia-peermem-test.sh -m peermem
Repository: 'Types: deb
URIs: https://ppa.launchpadcontent.net/canonical-nvidia/perftest+cuda/ubuntu/
Suites: mantic
Components: main
'
Description:
Used internal for kernel regression testing
More info: https://launchpad.net/~canonical-nvidia/+archive/ubuntu/perftest+cuda
Adding repository.
Found existing deb entry in /etc/apt/sources.list.d/canonical-nvidia-ubuntu-perftest_cuda-mantic.sources
Hit:1 http://archive.ubuntu.com/ubuntu mantic InRelease
Hit:2 http://archive.ubuntu.com/ubuntu mantic-updates InRelease
Hit:3 http://archive.ubuntu.com/ubuntu mantic-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu mantic-backports InRelease
Hit:5 http://archive.ubuntu.com/ubuntu mantic-proposed InRelease
Hit:6 https://ppa.launchpadcontent.net/canonical-nvidia/perftest+cuda/ubuntu mantic InRelease
Hit:7 https://ppa.launchpadcontent.net/dannf/dannf/ubuntu mantic InRelease
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
perftest is already the newest version (24.01.0+0.38-1+perftest+cuda.1~ubuntu23.10.1).
0 upgraded, 0 newly installed, 0 to remove and 10 not upgraded.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
opensm is already the newest version (3.3.23-2).
0 upgraded, 0 newly installed, 0 to remove and 10 not upgraded.
--use_cuda=<cuda device id> Use CUDA specific device for GPUDirect RDMA testing
Perftest doesn't supports CUDA tests with inline messages: inline size set to 0
************************************
* Waiting for client to connect... *
************************************
Perftest doesn't supports CUDA tests with inline messages: inline size set to 0
initializing CUDA
initializing CUDA
Listing all CUDA devices in system:
CUDA device 0: PCIe address is 07:00
CUDA device 1: PCIe address is 0F:00
CUDA device 2: PCIe address is 47:00
CUDA device 3: PCIe address is 4E:00
CUDA device 4: PCIe address is 87:00
CUDA device 5: PCIe address is 90:00
CUDA device 6: PCIe address is B7:00
CUDA device 7: PCIe address is BD:00
Picking device No. 1
[pid = 15582, dev = 1] device name = [NVIDIA A100-SXM4-40GB]
creating CUDA Ctx
Listing all CUDA devices in system:
CUDA device 0: PCIe address is 07:00
CUDA device 1: PCIe address is 0F:00
CUDA device 2: PCIe address is 47:00
CUDA device 3: PCIe address is 4E:00
CUDA device 4: PCIe address is 87:00
CUDA device 5: PCIe address is 90:00
CUDA device 6: PCIe address is B7:00
CUDA device 7: PCIe address is BD:00
Picking device No. 0
[pid = 15576, dev = 0] device name = [NVIDIA A100-SXM4-40GB]
creating CUDA Ctx
making it the current CUDA Ctx
cuMemAlloc() of a 16777216 bytes GPU buffer
allocated GPU buffer address at 00007c0146000000 pointer=0x7c0146000000
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
Dual-port : OFF Device : mlx5_6
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
making it the current CUDA Ctx
cuMemAlloc() of a 16777216 bytes GPU buffer
allocated GPU buffer address at 00007a08b4000000 pointer=0x7a08b4000000
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
Dual-port : OFF Device : mlx5_2
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : ON
TX depth : 128
CQ Moderation : 100
Mtu : 4096[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
ibv_wr* API : ON
CQ Moderation : 100
Mtu : 4096[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x01 QPN 0x0029 PSN 0x6763b2 RKey 0x180eef VAddr 0x007a08b4800000
local address: LID 0x02 QPN 0x36ef PSN 0x149b7b RKey 0x180efd VAddr 0x007c0146800000
remote address: LID 0x02 QPN 0x36ef PSN 0x149b7b RKey 0x180efd VAddr 0x007c0146800000
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
remote address: LID 0x01 QPN 0x0029 PSN 0x6763b2 RKey 0x180eef VAddr 0x007a08b4800000
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
Conflicting CPU frequency values detected: 1500.000000 != 1857.916000. CPU Frequency is not max.
2 5000 3.93 3.91 2.049159
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
4 5000 7.90 7.86 2.061297
Conflicting CPU frequency values detected: 1500.000000 != 1751.986000. CPU Frequency is not max.
8 5000 15.78 15.72 2.060780
Conflicting CPU frequency values detected: 1500.000000 != 3393.685000. CPU Frequency is not max.
16 5000 31.55 31.55 2.067723
Conflicting CPU frequency values detected: 1500.000000 != 3393.672000. CPU Frequency is not max.
32 5000 63.17 63.16 2.069580
Conflicting CPU frequency values detected: 1500.000000 != 3393.684000. CPU Frequency is not max.
64 5000 125.18 124.79 2.044571
Conflicting CPU frequency values detected: 1500.000000 != 3393.682000. CPU Frequency is not max.
128 5000 251.97 251.63 2.061392
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
256 5000 503.47 502.38 2.057737
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
512 5000 1007.86 1002.91 2.053960
Conflicting CPU frequency values detected: 1500.000000 != 1464.256000. CPU Frequency is not max.
1024 5000 2008.34 2007.01 2.055178
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
2048 5000 3710.86 3561.43 1.823450
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
4096 5000 4482.59 4311.86 1.103835
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
8192 5000 4579.76 4293.50 0.549568
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
16384 5000 4427.76 4284.51 0.274209
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
32768 5000 4431.66 4325.50 0.138416
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
65536 5000 4456.39 4428.15 0.070850
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
131072 5000 4508.29 4499.07 0.035993
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
262144 5000 4535.99 4529.94 0.018120
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
524288 5000 4561.30 4553.91 0.009108
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
1048576 5000 4556.28 4554.37 0.004554
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
2097152 5000 4552.12 4551.60 0.002276
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
4194304 5000 4553.83 4553.14 0.001138
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
8388608 5000 4553.65 4552.61 0.000569
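---------------------------------------------------------------------------------------
8388608 5000 4553.65 4552.61 0.000569
---------------------------------------------------------------------------------------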
deallocating RX GPU buffer 00007a08b4000000
deallocating RX GPU buffer 00007c0146000000
destroying current CUDA Ctx
destroying current CUDA Ctx
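
The result rows above are interleaved with "Conflicting CPU frequency" warnings and with output from the second process, which makes them awkward to compare across runs. Below is a minimal Python sketch for pulling out only the numeric rows. The names parse_rows and mb_per_sec_to_gbit are hypothetical helpers, not part of perftest or the test script. The conversion assumes 1 MB = 2**20 bytes, which is consistent with the rows in this log (2 bytes x 2.049159 Mpps is about 3.91 MB/sec under that definition).

```python
import re

# One result row: #bytes, #iterations, BW peak, BW average, MsgRate.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$")


def parse_rows(log_text):
    """Return result rows as tuples, skipping warnings and banners."""
    rows = []
    for line in log_text.splitlines():
        m = ROW.match(line)
        if m:
            size, iters, peak, avg, rate = m.groups()
            rows.append((int(size), int(iters), float(peak), float(avg), float(rate)))
    return rows


def mb_per_sec_to_gbit(mb_per_sec):
    """Convert MB/sec to Gb/s, assuming 1 MB = 2**20 bytes (an assumption)."""
    return mb_per_sec * 2**20 * 8 / 1e9


# Two rows copied from the log above, with their warning lines.
sample = """\
Conflicting CPU frequency values detected: 1500.000000 != 1857.916000. CPU Frequency is not max.
 2 5000 3.93 3.91 2.049159
Conflicting CPU frequency values detected: 1500.000000 != 2250.000000. CPU Frequency is not max.
 8388608 5000 4553.65 4552.61 0.000569
"""

rows = parse_rows(sample)
for size, _, _, avg, _ in rows:
    print(size, avg, round(mb_per_sec_to_gbit(avg), 2))
```

Under that assumption, the 8388608-byte average of 4552.61 MB/sec works out to roughly 38.19 Gb/s. The final row is printed twice in the full log (the 8388608 line appears both before and after a separator), so a caller feeding in the whole log may want to de-duplicate the rows.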