
Error on P2P on 2 RTX 5090: "mapping of buffer object failed" or code=205(cudaErrorMapBufferObjectFailed) "cudaDeviceEnablePeerAccess(gpuid[1], 0)" #44

@Panchovix

Description

NVIDIA Open GPU Kernel Modules Version

570.133, 570.144, 570.148, 570.153

Operating System and Version

Fedora 41

Hardware

2x RTX 5090 + 2x RTX 4090 + 2x RTX 3090 + A6000

AMD Ryzen 7 7800X3D

192GB RAM

MSI Carbon X670E

Kernel Release

6.14.9

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Build Command

Terminal output/Build Log

More Info

Hi there. I'm trying to use P2P between two RTX 5090s, but I get the following errors:

pancho@fedora:~/cuda-samples/build/Samples/5_Domain_Specific/p2pBandwidthLatencyTest$ export CUDA_VISIBLE_DEVICES=2,3
pancho@fedora:~/cuda-samples/build/Samples/5_Domain_Specific/p2pBandwidthLatencyTest$ ./p2pBandwidthLatencyTest 
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, NVIDIA GeForce RTX 5090, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
Device: 1, NVIDIA GeForce RTX 5090, pciBusID: 3, pciDeviceID: 0, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=1 CAN Access Peer Device=0

***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.
So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

P2P Connectivity Matrix
     D\D     0     1
     0       1     1
     1       1     1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1 
     0 1728.49  24.63 
     1  24.70 1761.56 
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
Cuda failure /home/pancho/cuda-samples/Samples/5_Domain_Specific/p2pBandwidthLatencyTest/p2pBandwidthLatencyTest.cu:192: 'mapping of buffer object failed'
pancho@fedora:~/cuda-samples/build/Samples/0_Introduction/simpleP2P$ ./simpleP2P 
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2

Checking GPU(s) for support of peer to peer memory access...
> Peer access from NVIDIA GeForce RTX 5090 (GPU0) -> NVIDIA GeForce RTX 5090 (GPU1) : Yes
> Peer access from NVIDIA GeForce RTX 5090 (GPU1) -> NVIDIA GeForce RTX 5090 (GPU0) : Yes
Enabling peer access between GPU0 and GPU1...
CUDA error at /home/pancho/cuda-samples/Samples/0_Introduction/simpleP2P/simpleP2P.cu:130 code=205(cudaErrorMapBufferObjectFailed) "cudaDeviceEnablePeerAccess(gpuid[1], 0)" 

I have other GPUs on my system (A6000, 3090s, 4090s) and P2P works fine on these cards.

I tried the patch mentioned in #29 (comment), but as I noted in #29 (comment), it doesn't seem to work. I edited the files mentioned there, but there was no difference.

I have also tried the https://github.com/tinygrad/open-gpu-kernel-modules/tree/570.148.08-p2p branch, but the issue is still there.

I installed the driver first, then applied the patch with ./install.sh.

IOMMU and PCIe ACS are disabled.
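
For anyone who wants to double-check the ACS part on a similar setup, here is a rough sketch of how the `ACSCtl` lines printed by `lspci -vvv` could be scanned for flags that are still enabled (a trailing `+`). The helper name `enabled_acs_flags` and the `SAMPLE` text are made up for illustration and are not output from this machine. It only inspects text you pass in, so it says nothing about IOMMU state.

```python
import re

# Start of a device block in `lspci -vvv` output, e.g. "00:01.1 PCI bridge: ..."
DEVICE_RE = re.compile(
    r"^((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s"
)
# One ACS control flag such as "SrcValid+" or "ReqRedir-"
FLAG_RE = re.compile(r"(\w+)([+-])")


def enabled_acs_flags(lspci_text):
    """Map device address -> list of ACSCtl flags that are enabled ('+').

    Devices whose ACSCtl flags are all '-' (or that have no ACSCtl line)
    are left out, so an empty dict means no enabled ACS flag was found.
    """
    result = {}
    current = None
    for line in lspci_text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            current = match.group(1)
            continue
        stripped = line.strip()
        if current and stripped.startswith("ACSCtl:"):
            flags = FLAG_RE.findall(stripped[len("ACSCtl:"):])
            enabled = [name for name, sign in flags if sign == "+"]
            if enabled:
                result[current] = enabled
    return result


# Hypothetical sample text, only to show the expected input shape.
SAMPLE = (
    "00:01.1 PCI bridge: Example bridge\n"
    "\t\tACSCap:\tSrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans+\n"
    "\t\tACSCtl:\tSrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-\n"
    "00:01.2 PCI bridge: Example bridge\n"
    "\t\tACSCtl:\tSrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-\n"
)

print(enabled_acs_flags(SAMPLE))
# {'00:01.1': ['SrcValid', 'ReqRedir', 'CmpltRedir', 'UpstreamFwd']}
```

To use it on a real system, the text passed in would be the captured output of `sudo lspci -vvv`. An empty result would be consistent with ACS being disabled on every bridge that reports the capability.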

Any help is appreciated.
