Fix init_prism centroid; make test-prism's parallel loops thread-safe #81

Merged
stevengj merged 2 commits into NanoComp:master from Luochenghuang:fix-prism-thread-safety
May 23, 2026

Contributor

@Luochenghuang Luochenghuang commented May 22, 2026

After #75 and #79, test-prism built with --enable-openmp was still failing at OMP_NUM_THREADS >= 2 with heap corruption and intermittent point-inclusion mismatches. Two issues were responsible.

1. init_prism writes prsm->centroid before the optional shift

The assignment prsm->centroid = centroid = ... happens before the else block that shifts both the vertices and the local centroid to land on o->center. The shift only updates the local variable, so prsm->centroid stays at the pre-shift value while the vertex arrays reflect the shifted geometry. Block-vs-prism point-inclusion results then disagree by the offset between the old and new placement.

The pre-PR auto-fix in point_in_objectp masked this in serial code (the second call recomputed centroid from the now-shifted vertices), but #79's parallel loop exposed it.

Fix: move prsm->centroid = centroid; to after the if/else. The change is equally valid in serial code; the bug is not OpenMP-specific.
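
To make the ordering problem concrete, here is a minimal sketch in plain C. It is not libctl code: `vec2`, `toy_prism`, `toy_init`, and the `store_early` flag are hypothetical stand-ins, and the shift is applied unconditionally rather than inside an if/else. With `store_early` set, the struct field is written before the shift and ends up stale; with it cleared, the field is written afterwards and matches the shifted vertices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2D stand-ins for illustration; not libctl's actual types. */
typedef struct { double x, y; } vec2;

typedef struct {
  vec2 vertices[4];
  size_t num_vertices;
  vec2 centroid; /* stored copy that later queries would rely on */
} toy_prism;

/* Arithmetic mean of n vertices. */
static vec2 mean_of(const vec2 *v, size_t n) {
  vec2 c = {0.0, 0.0};
  for (size_t i = 0; i < n; ++i) {
    c.x += v[i].x;
    c.y += v[i].y;
  }
  c.x /= (double)n;
  c.y /= (double)n;
  return c;
}

/* Shift the vertices so that their centroid lands on `center`.
   store_early != 0 reproduces the buggy ordering (field written before
   the shift); store_early == 0 is the corrected ordering. */
static void toy_init(toy_prism *p, vec2 center, int store_early) {
  assert(p->num_vertices > 0 && p->num_vertices <= 4);
  vec2 centroid = mean_of(p->vertices, p->num_vertices);
  if (store_early) p->centroid = centroid; /* stale after the shift */

  vec2 shift = {center.x - centroid.x, center.y - centroid.y};
  for (size_t i = 0; i < p->num_vertices; ++i) {
    p->vertices[i].x += shift.x;
    p->vertices[i].y += shift.y;
  }
  centroid = center; /* only the local variable is updated here */

  if (!store_early) p->centroid = centroid; /* written after the shift */
}

/* Squared distance between the stored centroid and the centroid
   recomputed from the current vertices; zero when they agree. */
static double centroid_error_sq(const toy_prism *p) {
  vec2 c = mean_of(p->vertices, p->num_vertices);
  double dx = c.x - p->centroid.x, dy = c.y - p->centroid.y;
  return dx * dx + dy * dy;
}
```

For a unit square moved to center (5, 5), `centroid_error_sq` is nonzero with the early store and zero with the late store.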

2. test-prism's parallel loops were calling the thread-unsafe wrappers

point_in_objectp / normal_to_object / point_in_periodic_objectp call geom_fix_object_ptr on every invocation, which for prisms does free/malloc on shared per-prism arrays. Two threads racing on the same prism corrupt the allocator — the original crash from #79.

The wrappers are kept as-is (preserving API contract for callers that mutate object parameters between queries) but their doc comments now mark them as not thread-safe and direct concurrent callers at the *_fixed_* variants.
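
The split between the auto-fixing wrappers and the *_fixed_* variants can be sketched with a toy 1D object. Everything named `toy_*` below is hypothetical and only mirrors the shape of the real functions: the wrapper rebuilds heap-allocated derived data on every call (which is what makes it unsafe to share across threads), while the fixed variant only reads.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical 1D "object": an interval with heap-allocated derived data. */
typedef struct {
  double lo, hi;  /* user-facing parameters, may be mutated by callers */
  double *cache;  /* derived data: cache[0] = lo, cache[1] = hi */
} toy_object;

/* Rebuild derived data from the parameters. This frees and reallocates
   shared storage, so two threads calling it on the same object race. */
static void toy_fix_object(toy_object *o) {
  free(o->cache);
  o->cache = malloc(2 * sizeof *o->cache);
  assert(o->cache != NULL);
  o->cache[0] = o->lo;
  o->cache[1] = o->hi;
}

/* Pure read. Assumes toy_fix_object has been called since the last
   parameter change; safe to call from several threads at once. */
static int toy_point_in_fixed_object(double x, const toy_object *o) {
  return x >= o->cache[0] && x <= o->cache[1];
}

/* Convenience wrapper: always sees current parameters, but NOT
   thread-safe because it mutates the object on every query. */
static int toy_point_in_object(double x, toy_object *o) {
  toy_fix_object(o);
  return toy_point_in_fixed_object(x, o);
}
```

The wrapper keeps working for a caller that changes `lo` or `hi` between queries, whereas the fixed variant would answer from stale derived data until the object is fixed up again.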

In test-prism:

  • test_point_inclusion: switch to point_in_fixed_objectp.
  • test_normal_to_object (disabled but still parallel): switch to normal_to_fixed_object.
  • run_unit_tests: call geom_fix_object_ptr on the_block and the_prism inside the P_SHIFT block, before the parallel loops run.
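
The resulting test structure (fix up once while single-threaded, then only read inside the parallel loop) might look roughly like the sketch below. The `toy_*` names and `count_inside_parallel` are hypothetical, not test-prism's actual code; the OpenMP pragma is ignored when compiling without OpenMP support, in which case the loop simply runs serially.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical 1D object with heap-allocated derived data. */
typedef struct {
  double lo, hi;
  double *cache; /* cache[0] = lo, cache[1] = hi after fix-up */
} toy_object;

/* Mutating fix-up: must run while no other thread touches the object. */
static void toy_fix_object(toy_object *o) {
  free(o->cache);
  o->cache = malloc(2 * sizeof *o->cache);
  assert(o->cache != NULL);
  o->cache[0] = o->lo;
  o->cache[1] = o->hi;
}

/* Read-only query against already-fixed derived data. */
static int toy_point_in_fixed_object(double x, const toy_object *o) {
  return x >= o->cache[0] && x <= o->cache[1];
}

/* Count how many of the n sample points i/n (i = 0..n-1) lie inside o. */
static int count_inside_parallel(toy_object *o, int n) {
  int count = 0;

  /* Fix up once, before the parallel region begins. */
  toy_fix_object(o);

  /* Each iteration only reads from o; the reduction combines the
     per-thread partial counts. */
#pragma omp parallel for reduction(+ : count)
  for (int i = 0; i < n; ++i) {
    double x = (double)i / (double)n;
    count += toy_point_in_fixed_object(x, o);
  }
  return count;
}
```

Because the loop body never frees or allocates, the result should not depend on the thread count.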

Verification

make check passes. Three trials each at OMP_NUM_THREADS = 1, 2, 4, 8, 16, 64, 166: 21/21 clean (0/10000 points failed, 0/1000 segments failed, no heap-corruption signatures).

…aths

Two related bugs surfaced by running the --enable-openmp test suite
against test-prism, both pre-existing:

1) init_prism assigned prsm->centroid BEFORE the shift-to-o->center
   logic, then the shift only updated a local `centroid` variable.
   After construction, any prism whose center was modified ended up
   with prsm->centroid stale relative to the (shifted) vertices.

2) point_in_objectp, normal_to_object, and point_in_periodic_objectp
   each called geom_fix_object_ptr(&o) on every query. For prisms
   that path calls reinit_prism, which free()s + malloc()s shared
   per-prism arrays (vertices_p, vertices_top_p, ...). Under OpenMP
   the test was crashing with "malloc(): unaligned tcache chunk" /
   "free(): double free". This matches what PR NanoComp#75 already did for
   geom_get_bounding_box -- callers are now responsible for calling
   geom_fix_object_ptr after mutating o.center.

test-prism: after the random-shift branch, call geom_fix_object_ptr
on the block and prism once, single-threaded, before the parallel
test loops run.

Verified at OMP_NUM_THREADS = 1, 2, 4, 8, 16, 32, 64, 128, 166: all
clean (0 point failures, 0 segment failures, no heap corruption).
make check passes.
Comment thread utils/geom.c
Per review feedback, point_in_objectp / normal_to_object /
point_in_periodic_objectp are restored to their original behavior
(call geom_fix_object_ptr internally) rather than being silently
turned into thin aliases for the *_fixed_* variants. Their doc
comments now state they are NOT thread-safe and direct concurrent
callers at the *_fixed_* entry points.

test-prism's parallel loops switched from point_in_objectp /
normal_to_object to point_in_fixed_objectp / normal_to_fixed_object,
which are pure reads and safe to call from multiple threads against
an object that was fixed up once beforehand.

Combined with the init_prism centroid fix in the previous commit,
make check passes cleanly across OMP_NUM_THREADS = 1, 2, 4, 8, 16,
64, 166 (21 trials each: 0 point-inclusion failures, 0 segment
failures, no heap-corruption signatures).
@Luochenghuang Luochenghuang changed the title from "Fix prism centroid + remove racy per-query auto-fix from query paths" to "Fix init_prism centroid; make test-prism's parallel loops thread-safe" May 22, 2026
@Luochenghuang Luochenghuang requested a review from stevengj May 22, 2026 23:03
@stevengj
Collaborator

The revised PR looks good to me.

@stevengj stevengj merged commit cd191af into NanoComp:master May 23, 2026