NVML: Unable to retrieve Nvlink information as all links are inActive
CUDA version: 590.48.01
A single SIM card failure occurs randomly on the same node, and a reboot resolves it each time. In most cases, nvidia-bug-report.sh freezes during execution. An Xid 149 error was found in the only log file that was exported successfully. The root cause has not yet been identified.
Originally posted by @jed-hacker in #1149