NVML:Unable to retrieve Nvlink information as all links are inActive #1150

@jed-hacker

Description

NVML: Unable to retrieve NVLink information as all links are inActive
NVIDIA driver version (reported as "cuda version"): 590.48.01
A single card (GPU) fails at random on the same node, and the issue clears after every reboot. In most cases, nvidia-bug-report.sh hangs during execution. The only log file that was exported successfully contains an Xid 149 error. The root cause has not yet been identified.
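
Because nvidia-bug-report.sh usually hangs, a lighter way to capture Xid events may be to scan the kernel log text directly. The following is only a minimal sketch: it assumes the kernel log uses the usual `NVRM: Xid (PCI:<address>): <code>, ...` line format, and the names `XID_RE`, `find_xid_events`, and `read_dmesg` are hypothetical helpers introduced for illustration, not part of any NVIDIA tool.

```python
import re
import subprocess

# Assumed log line shape: "NVRM: Xid (PCI:0000:3b:00): 149, pid=..., ..."
XID_RE = re.compile(r"NVRM: Xid \((?P<dev>PCI:[0-9a-fA-F:.]+)\): (?P<code>\d+)")


def find_xid_events(log_text):
    """Return a list of (pci_device, xid_code) tuples found in kernel log text."""
    events = []
    for line in log_text.splitlines():
        match = XID_RE.search(line)
        if match:
            events.append((match.group("dev"), int(match.group("code"))))
    return events


def read_dmesg():
    """Return the current kernel log as text (hypothetical helper; may need root)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return result.stdout
```

Calling `find_xid_events(read_dmesg())` on the affected node right after a failure would list which PCI device reported which Xid code, which could then be attached to the issue even when the full bug report cannot be generated.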

Originally posted by @jed-hacker in #1149
