Adding ROCm-LLVM #1473

Draft
casparvl wants to merge 2 commits into EESSI:main from casparvl:rocm_llvm
@casparvl
Collaborator

@casparvl casparvl commented Apr 16, 2026

Just going to try something here, see what we are still missing for AMD GPU support...

@casparvl
Collaborator Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws on:arch=zen2 for:arch=x86_64/amd/zen2,accel=amd/gfx90a

@eessi-bot-aws

eessi-bot-aws bot commented Apr 16, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2 and accelerator amd/gfx90a
Job dir: /project/def-users/SHARED/jobs/2026.04/pr_1473/148423

| date | job status | comment |
|---|---|---|
| Apr 16 16:06:43 UTC 2026 | submitted | job id 148423 awaits release by job manager |
| Apr 16 16:07:13 UTC 2026 | released | job awaits launch by Slurm scheduler |
| Apr 16 16:13:16 UTC 2026 | running | job 148423 is running |
| Apr 16 16:15:03 UTC 2026 | finished | job id 148423 was cancelled |
| Apr 16 16:15:19 UTC 2026 | finished | 🤷 UNKNOWN: Job results file _bot_job148423.result does not exist in job directory or reading it failed. No artefacts were found/reported. |
| Apr 16 16:15:19 UTC 2026 | test result | 🤷 UNKNOWN: Job test file _bot_job148423.test does not exist in job directory or reading it failed. |

@casparvl
Collaborator Author

bot:cancel jobid:148423

@casparvl
Collaborator Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws on:arch=zen2 for:arch=x86_64/amd/zen2,accel=amd/gfx90a

@eessi-bot-aws

eessi-bot-aws bot commented Apr 16, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2 and accelerator amd/gfx90a
Job dir: /project/def-users/SHARED/jobs/2026.04/pr_1473/148424

| date | job status | comment |
|---|---|---|
| Apr 16 16:16:02 UTC 2026 | submitted | job id 148424 awaits release by job manager |
| Apr 16 16:16:22 UTC 2026 | released | job awaits launch by Slurm scheduler |
| Apr 16 16:17:24 UTC 2026 | running | job 148424 is running |
| Apr 16 16:20:14 UTC 2026 | finished | job id 148424 was cancelled |
| Apr 16 16:20:28 UTC 2026 | finished | 🤷 UNKNOWN: Job results file _bot_job148424.result does not exist in job directory or reading it failed. No artefacts were found/reported. |
| Apr 16 16:20:28 UTC 2026 | test result | 🤷 UNKNOWN: Job test file _bot_job148424.test does not exist in job directory or reading it failed. |

@casparvl
Collaborator Author

Lmod has detected the following error: Incorrect value for
$EESSI_ACCELERATOR_TARGET: accel/amd/gfx90a
While processing the following module(s):
    Module fullname                 Module Filename
    ---------------                 ---------------
    EESSI-extend/2025.06-easybuild  /cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/modules/all/EESSI-extend/2025.06-easybuild.lua

And then later:

Found EasyBuild version 5.3.0, looking good!
#
# Current EasyBuild configuration
# (C: command line argument, D: default value, E: environment variable, F: configuration file)
#
buildpath      (D) = /eessi_bot_job/.local/easybuild/build
containerpath  (D) = /eessi_bot_job/.local/easybuild/containers
installpath    (E) = /cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2
repositorypath (D) = /eessi_bot_job/.local/easybuild/ebfiles_repo
robot-paths    (D) = /cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/EasyBuild/5.3.0/easybuild/easyconfigs
rpath          (D) = True
sourcepath     (E) = /project/def-users/bot/shared/easybuild/sources:
#

Work to be done I guess...

@casparvl
Collaborator Author

bot:cancel jobid:148424

@casparvl
Collaborator Author

-- Check if we have GPU capabilities and configure CUDA compute capabilities
eessi_accelerator_target = os.getenv("EESSI_ACCELERATOR_TARGET")
if (eessi_accelerator_target ~= nil) then
  cuda_compute_capability = string.match(eessi_accelerator_target, "^accel/nvidia/cc([0-9]+)$")
  if (cuda_compute_capability ~= nil) then
    -- The last digit should be the minor version, insert a dot in the one-but-last position
    major_version = cuda_compute_capability:sub(1, #cuda_compute_capability - 1)
    minor_version = cuda_compute_capability:sub(#cuda_compute_capability)
    easybuild_cuda_compute_capabilities = string.format("%s.%s", major_version, minor_version)
  else
    LmodError("Incorrect value for $EESSI_ACCELERATOR_TARGET: " .. eessi_accelerator_target)
  end

  -- If architectures are 9.0, 10.0 or 12.0, enable architecture or family-specific optimizations
  if easybuild_cuda_compute_capabilities == '9.0' then
    easybuild_cuda_compute_capabilities = '9.0a'
  elseif easybuild_cuda_compute_capabilities == '10.0' then
    easybuild_cuda_compute_capabilities = '10.0f'
  elseif easybuild_cuda_compute_capabilities == '12.0' then
    easybuild_cuda_compute_capabilities = '12.0f'
  end
end
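
The excerpt above only accepts targets of the form `accel/nvidia/cc<digits>`, which would explain why `accel/amd/gfx90a` ends up in the `LmodError` branch. As a rough sketch of one way the check could be widened, and not the actual fix for EESSI-extend, the parsing can be pulled into a small helper that recognises both vendors. The function name `parse_accelerator_target` and the AMD pattern `gfx[0-9a-f]+` are assumptions made for illustration. The helper is written in plain Lua so it can be tried outside Lmod; in a modulefile the `nil` result would be the place to call `LmodError`.

```lua
-- Hypothetical helper (not part of EESSI-extend): classify an accelerator target.
-- Returns the vendor ("nvidia" or "amd") and a normalised value,
-- or nil when the string matches neither pattern.
local function parse_accelerator_target(target)
  -- NVIDIA: same rule as the existing module, e.g. "accel/nvidia/cc90" -> "9.0"
  local cc = string.match(target, "^accel/nvidia/cc([0-9]+)$")
  if cc ~= nil then
    -- The last digit is the minor version; everything before it is the major version
    local major = cc:sub(1, #cc - 1)
    local minor = cc:sub(#cc)
    return "nvidia", string.format("%s.%s", major, minor)
  end

  -- AMD: ASSUMED pattern "accel/amd/gfx<hex digits>", e.g. "accel/amd/gfx90a"
  local gfx = string.match(target, "^accel/amd/(gfx[0-9a-f]+)$")
  if gfx ~= nil then
    return "amd", gfx
  end

  -- Unrecognised target: the caller decides how to report it
  return nil
end

-- Usage with the target from the failing job
local vendor, value = parse_accelerator_target("accel/amd/gfx90a")
print(vendor, value)
```

Keeping the NVIDIA branch unchanged means the later compute-capability handling (`9.0a`, `10.0f`, `12.0f`) could stay as is and would only run when the vendor is `nvidia`.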
