Add Cuda support#108

Open
tychedelia wants to merge 2 commits into processing:main from tychedelia:cuda

@tychedelia
Member

@tychedelia tychedelia commented Apr 7, 2026

Adds support for Cuda.

For the most part, this means doing some copies into external memory that can be used by Cuda and vice-versa.

We expect the user to have installed Cuda (minimum version 11040, i.e. Cuda 11.4), and we link against it dynamically.
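
As a rough illustration of what the version number means, here is a minimal sketch that decodes Cuda's integer version encoding (1000 * major + 10 * minor) and asks a dynamically loaded driver for its version. The helper names and the Linux library name `libcuda.so.1` are assumptions for illustration and are not taken from this PR; `cuDriverGetVersion` is the driver API call.

```python
import ctypes

MIN_CUDA_VERSION = 11040  # minimum version stated in this PR description


def decode_cuda_version(version):
    """Split Cuda's integer encoding (1000 * major + 10 * minor) into (major, minor)."""
    return version // 1000, (version % 1000) // 10


def driver_cuda_version(library_name="libcuda.so.1"):
    """Return the version reported by the driver library, or None if unavailable.

    The default library name is an assumption that only fits Linux installs.
    """
    try:
        libcuda = ctypes.CDLL(library_name)  # dynamic loading at run time
    except OSError:
        return None  # driver library not found on this machine
    version = ctypes.c_int(0)
    # cuDriverGetVersion returns 0 (CUDA_SUCCESS) when the call succeeds.
    if libcuda.cuDriverGetVersion(ctypes.byref(version)) != 0:
        return None
    return version.value


if __name__ == "__main__":
    found = driver_cuda_version()
    if found is None:
        print("No Cuda driver library could be loaded")
    else:
        major, minor = decode_cuda_version(found)
        status = "ok" if found >= MIN_CUDA_VERSION else "too old"
        print(f"Driver supports Cuda {major}.{minor} ({status})")
```

Under this encoding, `decode_cuda_version(11040)` gives `(11, 4)`.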

Our Python interop is via __cuda_array_interface__.
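
For readers unfamiliar with that protocol, this is a minimal sketch of an object exposing the interface dictionary. The class name, the RGBA8 layout, and the pointer value are hypothetical and only illustrate the shape of the protocol; they are not the implementation in this PR.

```python
class DeviceImageView:
    """Hypothetical wrapper describing an RGBA8 image that lives in device memory."""

    def __init__(self, device_ptr, width, height, read_only=False):
        self.device_ptr = device_ptr  # integer device address (made up in the demo below)
        self.width = width
        self.height = height
        self.read_only = read_only

    @property
    def __cuda_array_interface__(self):
        # Version 3 of the protocol; strides=None means C-contiguous memory.
        return {
            "shape": (self.height, self.width, 4),  # rows, columns, RGBA channels
            "typestr": "|u1",  # one unsigned byte per channel
            "data": (self.device_ptr, self.read_only),
            "version": 3,
            "strides": None,
        }


if __name__ == "__main__":
    view = DeviceImageView(device_ptr=0x7F0000000000, width=640, height=480)
    print(view.__cuda_array_interface__["shape"])
```

Libraries that understand the protocol (CuPy and Numba, for example) read this attribute to wrap the memory without copying it, so the pointer must stay valid for as long as the consumer uses it.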

The substantive changes to the rest of the code base are the elimination of ImageTextures and GraphicsTargets. These were attempts to pipe GPU resources from the render world to the main world to make accessing GPU resources in the main world easier. We take a new approach here: we eagerly look up those resources in the render world and pass them as arguments wherever they are needed.

To test

You can run just py-run cuda.py --features="cuda" in order to test the cuda example, or open the repl with mewnala installed and call .cuda() on an image created with create_image.

@hx2A

hx2A commented Apr 8, 2026

I am able to test this! I can run examples/cuda.py and see a white circle with feathered edges moving in a circle on a black background. Here is the output:

$ python ./examples/cuda.py
2026-04-08T04:17:38.103700Z  WARN bevy_asset::io::source: Skip creating file watcher because path "/home/jim/INSTALL/anaconda3/bin/assets" does not exist.
2026-04-08T04:17:38.103715Z  WARN bevy_asset::io::source: AssetSourceId::Default does not have an AssetWatcher configured. Consider adding an "assets" directory.
2026-04-08T04:17:38.205259Z  INFO bevy_render::renderer: AdapterInfo { name: "NVIDIA GeForce RTX 4090", vendor: 4318, device: 9860, device_type: DiscreteGpu, device_pci_bus_id: "0000:01:00.0", driver: "NVIDIA", driver_info: "590.48.01", backend: Vulkan, subgroup_min_size: 32, subgroup_max_size: 32, transient_saves_memory: false }
2026-04-08T04:17:38.367188Z ERROR bevy_asset: AssetSourceId::Name(assets_directory) must be registered before `AssetPlugin` (typically added as part of `DefaultPlugins`)
2026-04-08T04:17:38.367206Z ERROR bevy_asset: AssetSourceId::Name(sketch_directory) must be registered before `AssetPlugin` (typically added as part of `DefaultPlugins`)
2026-04-08T04:17:38.375032Z  INFO bevy_pbr::cluster: GPU clustering is supported on this device.
2026-04-08T04:17:38.375067Z  INFO bevy_render::batching::gpu_preprocessing: GPU preprocessing is fully supported on this device.
2026-04-08T04:17:38.375867Z ERROR bevy_asset::server: Asset Source 'AssetSourceId::Name(sketch_directory)' does not exist
Segmentation fault (core dumped)

The Segmentation fault appears a few seconds after I exited the Sketch by closing the window.

Getting the build to run took some time. This computer is a relatively fresh install of ElementaryOS and was missing a lot of headers needed for the compilation. ElementaryOS doesn't seem to have current versions of some necessary libraries. The package manager gives me version 1.21 of just, which is apparently super old, and I had to manually compile glfw3 to get 3.4 instead of 3.3.10. There was also an additional mess I had to clean up because of Anaconda, which is why I ran the python ./examples/cuda.py command directly after maturin develop --release --features=cuda completed the build. But I got it to work.

@catilac
Contributor

catilac commented Apr 8, 2026

tagging @hx2A here. I can do code review, but cannot run on my system

@hx2A

hx2A commented Apr 9, 2026

> tagging @hx2A here. I can do code review, but cannot run on my system

@tychedelia I was able to run the build for this PR and execute the test. Except for the seg fault when I exited the Sketch, everything looked OK to me. The output is in my previous comment. Would you like me to run the test again?

@tychedelia
Member Author

@hx2A If you can, would you mind sharing the core dump or just the stack trace? I bet it will be obvious what you hit.
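
If a native stack trace is awkward to capture, a Python-level traceback may still narrow things down. This is a minimal sketch using the standard library `faulthandler` module; the helper name and the log path are hypothetical, and it only shows which Python frames were active, not the native frames inside the extension.

```python
import faulthandler


def enable_crash_traces(log_path):
    """Dump Python tracebacks for all threads to log_path on SIGSEGV and similar signals.

    The returned file object must stay open for as long as the handler is enabled.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Hypothetical usage at the very top of examples/cuda.py:
#   crash_log = enable_crash_traces("cuda_crash.log")
```

Running the script with `python -X faulthandler` has a similar effect and writes the traceback to stderr instead of a file.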

@hx2A

hx2A commented Apr 9, 2026

> @hx2A If you can, would you mind sharing the core dump or just the stack trace? I bet it will be obvious what you hit.

Sure thing, I'll do that tomorrow morning

@hx2A

hx2A commented Apr 9, 2026

@tychedelia Now I can't reproduce the core dump. I guess that's a good thing? Last night I was pretty tired and I probably made a setup mistake somewhere. Today I straightened out some environment stuff and now it can run more cleanly.

Here's the output when I run it with just:

$ just py-run cuda.py --features="cuda"
cd crates/processing_pyo3; uv run maturin develop --release --features=cuda
🍹 Building a mixed python/rust project
🔗 Found pyo3 bindings
🐍 Found CPython 3.13 at /home/jim/Projects/learning/rust/libprocessing/crates/processing_pyo3/.venv/bin/python
Audited 2 packages in 7ms
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `wavelet-vorticity` "../../lygia/test/wesl/shaders/wavelet-vorticity"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `draw-stroke` "../../lygia/test/wesl/shaders/draw-stroke"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `snoise-fbm` "../../lygia/test/wesl/shaders/snoise-fbm"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `windmill-tile` "../../lygia/test/wesl/shaders/windmill-tile"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `mirror-tile` "../../lygia/test/wesl/shaders/mirror-tile"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `brick-tile` "../../lygia/test/wesl/shaders/brick-tile"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `tri-tile` "../../lygia/test/wesl/shaders/tri-tile"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `pnoise-tiling` "../../lygia/test/wesl/shaders/pnoise-tiling"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `worley-cellular` "../../lygia/test/wesl/shaders/worley-cellular"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `perlin-noise-fbm` "../../lygia/test/wesl/shaders/perlin-noise-fbm"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `sprite-megaman` "../../lygia/test/wesl/shaders/sprite-megaman"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `draw-shapes` "../../lygia/test/wesl/shaders/draw-shapes"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `draw-aa` "../../lygia/test/wesl/shaders/draw-aa"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `checker-tile` "../../lygia/test/wesl/shaders/checker-tile"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `hex-tile` "../../lygia/test/wesl/shaders/hex-tile"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `2mat3` "../../lygia/math/quat/2mat3"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `2mat4` "../../lygia/math/quat/2mat4"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `const` "../../lygia/math/const"
warning: processing_render@0.0.1: skipped file/dir: not a WGSL ident `.github` "../../lygia/.github"
    Finished `release` profile [optimized] target(s) in 0.19s
⚠️  Warning: Failed to set rpath for /home/jim/Projects/learning/rust/libprocessing/target/release/libmewnala.so: Failed to execute 'patchelf', did you install it? Hint: Try `pip install maturin[patchelf]` (or just `pip install patchelf`)
📦 Built wheel for CPython 3.13 to /tmp/.tmp6cU1s0/mewnala-0.0.1-cp313-cp313-linux_x86_64.whl
✏️  Setting installed package as editable
🛠 Installed mewnala-0.0.1
cd crates/processing_pyo3; uv run python ./examples/cuda.py
Uninstalled 1 package in 0.35ms
Installed 1 package in 2ms
2026-04-09T23:53:35.200975Z  WARN bevy_asset::io::source: Skip creating file watcher because path "/home/jim/INSTALL/anaconda3/bin/assets" does not exist.
2026-04-09T23:53:35.200991Z  WARN bevy_asset::io::source: AssetSourceId::Default does not have an AssetWatcher configured. Consider adding an "assets" directory.
2026-04-09T23:53:35.289506Z  INFO bevy_render::renderer: AdapterInfo { name: "NVIDIA GeForce RTX 4090", vendor: 4318, device: 9860, device_type: DiscreteGpu, device_pci_bus_id: "0000:01:00.0", driver: "NVIDIA", driver_info: "590.48.01", backend: Vulkan, subgroup_min_size: 32, subgroup_max_size: 32, transient_saves_memory: false }
ALSA lib dlmisc.c:339:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pipewire.so (/home/jim/INSTALL/anaconda3/lib/alsa-lib/libasound_module_pcm_pipewire.so: cannot open shared object file: No such file or directory)
ALSA lib dlmisc.c:339:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pipewire.so (/home/jim/INSTALL/anaconda3/lib/alsa-lib/libasound_module_pcm_pipewire.so: cannot open shared object file: No such file or directory)
ALSA lib dlmisc.c:339:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pipewire.so (/home/jim/INSTALL/anaconda3/lib/alsa-lib/libasound_module_pcm_pipewire.so: cannot open shared object file: No such file or directory)
2026-04-09T23:53:35.488676Z ERROR bevy_asset: AssetSourceId::Name(assets_directory) must be registered before `AssetPlugin` (typically added as part of `DefaultPlugins`)
2026-04-09T23:53:35.488707Z ERROR bevy_asset: AssetSourceId::Name(sketch_directory) must be registered before `AssetPlugin` (typically added as part of `DefaultPlugins`)
2026-04-09T23:53:35.500024Z  INFO bevy_pbr::cluster: GPU clustering is supported on this device.
2026-04-09T23:53:35.500058Z  INFO bevy_render::batching::gpu_preprocessing: GPU preprocessing is fully supported on this device.
2026-04-09T23:53:35.500801Z ERROR bevy_asset::server: Asset Source 'AssetSourceId::Name(sketch_directory)' does not exist
error: Recipe `py-run` failed on line 15 with exit code 139
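
One note on that last line: by shell convention an exit code above 128 means the process was terminated by signal (code - 128), so 139 corresponds to signal 11. The sketch below decodes it; the helper name is hypothetical and the signal numbering is assumed to be that of the local platform.

```python
import signal


def describe_exit_code(code):
    """Translate a shell-style exit code into a human-readable description."""
    if code > 128:
        signum = code - 128  # shells report death-by-signal as 128 + signal number
        try:
            return f"terminated by {signal.Signals(signum).name}"
        except ValueError:
            return f"terminated by unknown signal {signum}"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(139))  # on Linux: terminated by SIGSEGV
```

So the recipe failure above still looks like a segmentation fault at exit, even though no "core dumped" message was printed this time.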

