Cross-platform stability fixes and FP-contract standardization #283
Madreag wants to merge 19 commits into
Swap the per-tick hitMOAtoms thread-locals from unordered_map to std::map so impulse accumulation iterates MOID-ascending, removing a source of cross-run bit-noise in the physics step.
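The effect of the container swap can be sketched roughly as follows. This is a minimal illustration, not the engine's code: `MOID` is reduced to a plain `int`, the per-MO payload to a single `float`, and `accumulateImpulses` and `iterationOrder` are hypothetical helper names. The point it shows is that `std::map` visits keys in ascending order no matter how the entries were inserted, so a floating-point accumulation over it always adds in the same sequence.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in: the real engine's MOID and atom types differ.
using MOID = int;

// Sums per-MO impulses in key order. std::map iterates in ascending key
// order regardless of insertion order, so the float additions always
// happen in the same sequence.
inline float accumulateImpulses(const std::map<MOID, float>& hits) {
    float total = 0.0f;
    for (const auto& entry : hits) {
        total += entry.second;
    }
    return total;
}

// Returns the keys in iteration order, to make the ordering visible.
inline std::vector<MOID> iterationOrder(const std::map<MOID, float>& hits) {
    std::vector<MOID> order;
    for (const auto& entry : hits) {
        order.push_back(entry.first);
    }
    return order;
}
```

With `std::unordered_map` the visiting order depends on the hash table's bucket layout, which can differ between standard libraries and insertion histories. Since float addition is not associative, that is enough to change the low bits of the sum.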
Add deviceId as a secondary sort key in WeaponSearch and ToolSearch so the pickup ordering is stable when scores collide; scheduler-dependent async callback order otherwise leaves the table order non-deterministic.
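A tie-break of this kind might look like the sketch below. The real search lives in the AI scripts; this C++ version only illustrates the comparator shape. `PickupCandidate`, its fields, and `sortCandidates` are hypothetical names, and "highest score first" and "unique device ids" are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate record; field names are illustrative only.
struct PickupCandidate {
    float score;
    int deviceId;  // assumed unique per device
};

// Orders by score (highest first, an assumption), then by deviceId
// ascending. With unique ids the comparator defines a total order, so
// the result does not depend on the order the candidates arrived in.
inline void sortCandidates(std::vector<PickupCandidate>& candidates) {
    std::sort(candidates.begin(), candidates.end(),
              [](const PickupCandidate& a, const PickupCandidate& b) {
                  if (a.score != b.score) {
                      return a.score > b.score;
                  }
                  return a.deviceId < b.deviceId;
              });
}
```

Without the second key, `std::sort` is free to leave equal-score entries in either order, so the arrival order of the async callbacks leaks into the result. The sketch assumes scores are never NaN; a NaN score would break the strict weak ordering that `std::sort` requires.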
On macOS, route RTEError message/abort/assert boxes off worker threads to stderr: Cocoa dispatches the dialog onto the main thread, which deadlocks when that thread is blocked waiting on the worker. Windows/Linux tolerate cross-thread boxes and keep their behaviour. Guard the save-list file_clock -> system_clock conversion on macOS libc++, whose __int128 file_clock rep is rejected when constructing the time_point directly; an explicit duration_cast lands it in range.
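For reference, one portable way to get from a `file_time_type` to a `system_clock` time point with an explicit `duration_cast` is sketched below. It is not necessarily the conversion this PR uses: `toSystemClock` is a hypothetical helper, and the approach (sampling both clocks and carrying the offset across) is the common C++17 idiom, which is only accurate to within the time between the two `now()` calls.

```cpp
#include <cassert>
#include <chrono>
#include <filesystem>

// Hypothetical helper: converts a file time to a system_clock time point
// by measuring the file time's offset from "now" on the file clock and
// applying that offset to "now" on the system clock. The explicit
// duration_cast narrows the file clock's (possibly wider) representation
// to system_clock::duration instead of converting the time_point directly.
inline std::chrono::system_clock::time_point toSystemClock(
    std::filesystem::file_time_type fileTime) {
    using FileClock = std::filesystem::file_time_type::clock;
    using SysClock = std::chrono::system_clock;
    const auto offset = fileTime - FileClock::now();
    return SysClock::now() + std::chrono::duration_cast<SysClock::duration>(offset);
}
```

Because the cast is applied to a duration relative to the present rather than to a raw epoch count, the value being narrowed stays small for any realistic file timestamp.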
CalculatePathAsync runs on the parallel ThreadedUpdateAI workers (AI scripts call Scene:CalculatePathAsync from HumanBehaviors); the static-int increment could hand two concurrent requests the same id and silently drop one Lua callback slot. Make the counter std::atomic with relaxed fetch_add.
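The counter change can be illustrated with a small sketch. `nextCallbackId` is a hypothetical name, and whether the real code hands out the pre-increment or post-increment value is an assumption here; the sketch returns the previous value, which is what `fetch_add` yields.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical id generator. fetch_add is a single atomic
// read-modify-write, so two concurrent callers can never observe the
// same previous value. Relaxed ordering is enough because only the
// uniqueness of the id matters; no other memory is published through
// this counter.
inline int nextCallbackId() {
    static std::atomic<int> counter{0};
    return counter.fetch_add(1, std::memory_order_relaxed);
}
```

A plain `static int` incremented with `++` compiles to a separate load, add, and store, so two workers can both load the same value before either stores, which is the duplicate-id case the commit describes.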
Pin floating-point so same-arch Win/Linux/macOS builds produce bit-identical results. meson: -ffp-contract=off plus the no-fast-math family for GCC/Clang, /fp:precise + /arch:SSE2 for MSVC. RTEA.vcxproj: FloatingPointModel=Precise on every config (was Fast). Document the FPU + libm path in Source/CI.
The release branch reassigned extra_args to ['-w'], discarding -ffp-contract=off and the rest of the cross-platform FP block. Append instead of overwrite.
Atom::Clear left m_IntPos and the Bresenham step fields uninitialized. SetupPos branches on m_IntPos, and StepForward steps on the Bresenham state, before SetupSeg runs for a freshly-pooled Atom, so the stale pool value (which varies with allocation order) made collision stepping nondeterministic run to run. This was the dominant source of the cross-platform determinism divergence on the combat scenarios.
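The shape of the fix can be sketched with a heavily reduced stand-in. `PooledAtom` and all of its field names are hypothetical; the real `Atom` has many more members. The sketch only shows the principle that a pooled object's `Clear` must reset every field that later code reads before it is next written.

```cpp
#include <cassert>

// Hypothetical, heavily reduced stand-in for a pooled atom.
struct PooledAtom {
    int intPos[2];      // integer position read by the setup step
    int delta[2];       // Bresenham-style deltas
    int error;          // Bresenham-style error accumulator
    bool stepWasTaken;

    // Resets every field, including the integer step state, so an object
    // handed back out by the pool never exposes its previous occupant's
    // values.
    void Clear() {
        intPos[0] = intPos[1] = 0;
        delta[0] = delta[1] = 0;
        error = 0;
        stepWasTaken = false;
    }

    // True when all step state is in its reset condition.
    bool isCleared() const {
        return intPos[0] == 0 && intPos[1] == 0 && delta[0] == 0 &&
               delta[1] == 0 && error == 0 && !stepWasTaken;
    }
};
```

If `Clear` skips a field, the value seen after reuse is whatever the previous occupant of that pool slot left behind, which is why the result tracked allocation order.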
The sim-frame counter was incremented each tick but never initialized, so HitWhatMOID and HitWhatTerrMaterial read an uninitialized value when comparing a per-MO collision frame against it.
A SoundContainer's FMOD channels keep its address in userData, so the positional and channel-ended passes read freed memory once the container is destroyed. Null the back-reference on destroy and skip channels whose container is gone.
When every atom rasterizes to zero steps, stepsOnSeg is 0 and the step ratio divided 0/0. Use 0 in that case.
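A guard of this kind might look like the following sketch. `stepRatio` and `stepsTaken` are hypothetical names; only `stepsOnSeg` comes from the commit message.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: ratio of steps taken to steps on the segment.
// When the segment has zero steps the division would be 0/0, which
// yields NaN and raises the FE_INVALID floating-point exception, so
// the ratio is defined as 0 in that case.
inline float stepRatio(int stepsTaken, int stepsOnSeg) {
    if (stepsOnSeg == 0) {
        return 0.0f;
    }
    return static_cast<float>(stepsTaken) / static_cast<float>(stepsOnSeg);
}
```

Checking the denominator before dividing, rather than testing the result for NaN afterwards, avoids raising the exception flag in the first place.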
broadcastMsg.activeTime goes negative when the broadcast timestamp predates m_epoch, tripping Tracy's own assert(activeTime >= 0) on startup. Clamp to 0.
Vendored luabind 0.7.1 and boost 1.75 still use the C++17-removed APIs (auto_ptr, unary/binary_function, binders) that libc++ 19+ hides behind these macros, so the macOS build fails to compile without them.
Drop the global removed-API block (luabind covers those) and keep only the two that expose <execution>, behind a clang gate so they reach Apple's clang/libc++ but stay off the GCC command line.
The wider file_clock rep is a libc++ trait, not Apple-specific, so a libc++ build on any OS takes the duration_cast path.
Vendored-profiler change, unrelated to the cross-platform and FP work in this PR.
The disown does not close the async-GC teardown race, and it is off-topic for the cross-platform and FP work here.
Drop two stale LuaJIT lines that rode in from a rebase, drop the SoundContainer entry, and narrow the message-box note to macOS (the worker-thread guard is macOS-only).
The CXX17-removed umbrella is a no-op on libc++ 19+, so set the two individual macros luabind 0.7.1 and boost 1.75 use: std::auto_ptr and std::unary_function.
Pushed the changes. The 2 […] One correction to what I said, though: dropping the root macros broke luabind on Mac; Xcode is already on libc++ 20, where the […] Also removed two stale LuaJIT changelog bullets that snuck in from a rebase (already merged in #280). Rebuilt and booted on Windows and the arm64 Mac (Apple clang 17 / libc++ 20).
These are cross-platform stability fixes that came up while bringing the determinism work over to Mac and Linux. Most are small correctness issues that Windows happened to hide (its heap fill pattern and thread scheduling masked a few), so they only showed up once the same code ran on the other two platforms. See the commit messages for details.
- `AtomGroup` collision iteration now runs in MOID order (the two thread-local `hitMOAtoms` maps go from `unordered_map` to `map`; `MOIgnoreMap` is left alone since it's lookup-only and never iterated). Plus a guard against a 0/0 step ratio when every atom in a segment rasterizes to zero steps, which left a dead NaN that still tripped `FE_INVALID`.
- `HumanBehaviors` weapon and tool pickup sort gets a `deviceId` tie-break, since the async pickup callbacks fire in scheduler-dependent order and equal scores need a stable total order.
- FP contract pinned across toolchains: `-ffp-contract=off` on GCC/Clang and `FloatingPointModel=Precise` on every MSVC config, so the compiler can't fold FMA or reassociate floats differently per platform. A follow-on fixes a Meson bug where the release branch reassigned `extra_args` and silently dropped those flags on every release TU (checked in `compile_commands.json`: 203 TUs now, was 0).
- `LuaAdapters` async callback id counter is now a `std::atomic<int>`, closing a race under parallel AI.
- Two uninitialized reads off the pooled allocator, both returning prior-occupant bytes so the value tracked allocation order: `Atom`'s integer step state, and `MovableMan::m_SimUpdateFrameNumber`. Caught with Valgrind/ASan.
- An `AudioMan` use-after-free: an MO gibbed mid-tick frees its `SoundContainer`, but its still-playing FMOD channels keep the freed address in `userData`. The channels now get disowned on destroy (they keep playing) and the readers skip a container that's gone.
- A few Mac-only fixes: message boxes raised from a worker thread now log to stderr instead of popping a dialog (a cross-thread Cocoa alert deadlocks), an explicit `file_time_type` to `system_clock` conversion in the save menu, and the libc++ removed-API macros that luabind 0.7.1 and boost 1.75 still need to build on libc++ 19+.
- A one-line guard in Tracy (`TracyProfiler.cpp`) that clamps `broadcastMsg.activeTime` to 0 when the timestamp predates the profiler epoch; otherwise it trips Tracy's own assert on startup.

Built and ran clean on Windows, Linux, and macOS arm64. A few of these (the iteration order, FP contract, and uninitialized reads) are prerequisites for the determinism series this is part of, but they all stand on their own as fixes.