fix: prevent HDF5 dataset collision when data_path is reused (#744)#746
Merged
Merged
Conversation
Demonstrates the two bugs before the fix: - options_to_kwargs() always forwarded data_path='/tmp', causing every run to target /tmp/kwave_input.h5 and crash on the second call - prepare() gave a cryptic h5py ValueError instead of a clear FileExistsError when the input file already existed
options_to_kwargs() was always forwarding data_path='/tmp' (the SimulationOptions default), bypassing the safe mkdtemp path in prepare() and causing h5py to crash with 'name already exists' on any second run. - compat.py: skip forwarding data_path when it equals gettempdir() - cpp_simulation.py: raise FileExistsError with a clear message when an explicit data_path already contains kwave_input.h5 Fixes waltsims#744
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes repeated-run failures in the C++ backend caused by reusing a non-unique default data_path, and improves the resulting error when an explicit path already contains a previous input file.
Changes:
- Update
options_to_kwargs()to avoid forwarding theSimulationOptionsdefault temp directory, allowingCppSimulation.prepare()to create a unique per-run temp directory. - Add an explicit
FileExistsErrorinCppSimulation.prepare()whenkwave_input.h5already exists in a user-specifieddata_path. - Add unit tests covering default-path forwarding behavior and the new collision error.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
kwave/compat.py |
Stops forwarding data_path when it equals the OS temp dir default to avoid HDF5 input collisions. |
kwave/solvers/cpp_simulation.py |
Raises a clear FileExistsError if kwave_input.h5 already exists in the target directory; minor formatting cleanup. |
tests/test_compat.py |
Adds tests ensuring default data_path is not forwarded and custom paths still are. |
tests/test_cpp_simulation.py |
Adds a test asserting prepare() raises when kwave_input.h5 already exists. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
normalize both paths before comparing Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #746 +/- ##
==========================================
+ Coverage 75.54% 75.57% +0.02%
==========================================
Files 57 57
Lines 8188 8195 +7
Branches 1598 1600 +2
==========================================
+ Hits 6186 6193 +7
Misses 1381 1381
Partials 621 621
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Owner
|
Thanks for the fix! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
options_to_kwargs()was always forwardingdata_path='/tmp'(theSimulationOptionsdefault), bypassing the safemkdtemppath inprepare()and causing h5py to crash withValueError: name already existson any second runcompat.py: skip forwardingdata_pathwhen it equalsgettempdir(), soprepare()falls back to a fresh unique temp dir per runcpp_simulation.py: raiseFileExistsErrorwith a clear message when an explicitdata_pathalready containskwave_input.h5Test plan
test_default_data_path_not_forwarded— assertsdata_pathis not forwarded when using defaultSimulationOptions()(would have failed before fix)test_custom_data_path_is_forwarded— asserts an explicitly-setdata_pathis still forwarded correctlytest_raises_when_input_already_exists— assertsprepare()raisesFileExistsErrorwith a clear message when the input file already exists (would have failed before fix)Fixes #744
Greptile Summary
This PR fixes a
ValueError: name already existscrash from h5py that occurred on any second simulation run, caused byoptions_to_kwargs()always forwarding the defaultdata_path=gettempdir()and bypassing themkdtemp-based unique-directory logic inprepare().kwave/compat.py: Uses normalizedos.path.realpathcomparison to skip forwardingdata_pathwhen it holds the system temp dir default, allowingprepare()to fall back to a freshmkdtempdirectory per run.kwave/solvers/cpp_simulation.py: Adds an earlyFileExistsErrorguard inprepare()so that an explicitdata_pathreuse produces a clear, actionable error instead of a cryptic h5py exception.FileExistsErroron collision.Confidence Score: 5/5
Safe to merge — the fix correctly targets the default-path collision scenario and does not regress explicit data_path forwarding.
Both changed code paths are straightforward: the compat guard uses realpath+normpath normalization which handles platform symlink differences (e.g. macOS /tmp → /private/tmp), and the FileExistsError guard in prepare() only fires for explicit data_path values (mkdtemp always produces a new empty directory). All three new test cases directly cover the fixed scenarios. No new defects were found beyond style observations.
kwave/compat.py — the import placement is unusual but purely cosmetic. No files require attention for correctness.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A["options_to_kwargs(simulation_options)"] --> B{opts.data_path is not None?} B -- No --> E[skip] B -- Yes --> C{normalized data_path == normalized gettempdir?} C -- Yes default temp dir --> D[skip forwarding data_path] C -- No user-set custom path --> F["kwargs['data_path'] = opts.data_path"] G["CppSimulation.run(data_path=...)"] --> H["prepare(data_path)"] H --> I{data_path is None?} I -- Yes --> J["mkdtemp() fresh unique dir"] I -- No --> K["makedirs(data_path, exist_ok=True)"] J --> L{kwave_input.h5 exists?} K --> L L -- Yes --> M["raise FileExistsError"] L -- No --> N["_write_hdf5(input_file)"] N --> O["return input_file, output_file"] O --> P["_execute(...)"] P --> Q{cleanup?} Q -- True data_path was None --> R["shutil.rmtree(data_dir)"] Q -- False explicit data_path --> S[files remain on disk]Reviews (2): Last reviewed commit: "Potential fix for pull request finding" | Re-trigger Greptile