Fix flaky optical flow test (compare numbers, not rendered image) by AmitMY · Pull Request #227 · sign-language-processing/pose

AmitMY · 2026-06-20T08:10:15Z

Problem

After merging #226, master CI failed on tests/optical_flow_test.py::test_optical_flow (Python 3.11/3.12) — unrelated to #226's changes. It also flaked on PRs.

The test rendered the optical flow as a matplotlib heatmap, saved it to PNG, and compared against a checked-in reference image with a 0.001 RMS tolerance. That comparison depends on matplotlib/freetype rendering, which drifts across CI runner images and matplotlib versions. With unpinned deps, 3.11/3.12 pull the newest matplotlib and render a hair differently → flake. The underlying computation never changed (byte-identical output locally, RMS match at tol=0).

Fix

OpticalFlowCalculator is pure, deterministic numpy. Compare its numeric output against a .npy reference via np.allclose, dropping matplotlib from the test entirely. Same thing under test, no rendering dependency.

Replaced tests/data/optical_flow.png with tests/data/optical_flow.npy
Test runs ~10x faster and is environment-independent
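
For illustration, the numeric comparison could look roughly like the sketch below. `compute_flow` is a hypothetical stand-in, not the real OpticalFlowCalculator API, and the reference file is written to a temporary directory rather than tests/data/optical_flow.npy so the snippet runs on its own. The tolerances are NumPy's defaults, not necessarily the ones the test uses.

```python
import os
import tempfile

import numpy as np


def compute_flow(points: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the calculator under test.

    points has shape (frames, keypoints, dims); the result is the
    per-keypoint displacement magnitude between consecutive frames.
    """
    return np.linalg.norm(np.diff(points, axis=0), axis=-1)


def matches_reference(actual: np.ndarray, reference_path: str,
                      rtol: float = 1e-5, atol: float = 1e-8) -> bool:
    """Compare a computed array against a stored .npy reference."""
    expected = np.load(reference_path)
    # Check the shape first so broadcasting cannot hide a mismatch.
    return actual.shape == expected.shape and np.allclose(
        actual, expected, rtol=rtol, atol=atol)


# Deterministic input, so the reference is reproducible.
rng = np.random.default_rng(0)
points = rng.normal(size=(10, 5, 2))
flow = compute_flow(points)

with tempfile.TemporaryDirectory() as tmp:
    reference_path = os.path.join(tmp, "optical_flow.npy")
    np.save(reference_path, flow)  # done once, when the reference is created

    same = matches_reference(compute_flow(points), reference_path)
    perturbed = matches_reference(flow + 1e-3, reference_path)
```

Here `same` is expected to be true and `perturbed` false, since a 1e-3 offset is well outside the default tolerances. No plotting backend is involved at any point.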

Note

There's a separate genuinely seed-flaky test, pose_tensorflow_test.py::test_pose_tf_posebody_normalize_distribution_eager_mode_correct_result (the one that flaked on #226's PR run): when a masked column has ≤1 unmasked value, std=0 makes the NumPy reference produce an unfilled nan while the pose path runs fix_nan→0, so allclose fails. Not addressed here to keep this PR focused — happy to fix in a follow-up.
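
A minimal sketch of that failure mode, using plain NumPy rather than the actual test code: with a single unmasked value the standard deviation is 0, the normalization becomes 0/0, and a nan on the reference side never compares equal to the 0 produced by a nan fix. `np.nan_to_num` stands in here for the library's fix_nan step, which is an assumption about its effect and not its implementation.

```python
import numpy as np

# The only unmasked value left in the column.
column = np.array([3.0])

# Reference path: normalize without filling; (3 - 3) / 0 evaluates to nan.
with np.errstate(invalid="ignore", divide="ignore"):
    reference = (column - column.mean()) / column.std()

# Pose path (assumed behaviour): nan values are replaced by 0.
fixed = np.nan_to_num(reference, nan=0.0)

# nan never compares close to 0, so the assertion in the test fails.
close = np.allclose(reference, fixed)
```

With these inputs `reference` holds a nan, `fixed` holds 0.0, and `close` is false, which is the mismatch described above.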

🤖 Generated with Claude Code

The test rendered the optical flow as a matplotlib heatmap and compared the PNG against a checked-in reference with a 0.001 RMS tolerance. That comparison depends on matplotlib/freetype rendering, which drifts across CI runners and matplotlib versions (unpinned deps pull the newest on 3.11/3.12), so the test flaked on master while the underlying computation never changed.

OpticalFlowCalculator is pure, deterministic numpy. Compare its output to a numeric .npy reference via np.allclose instead, removing matplotlib from the test entirely.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

AmitMY merged commit dfdc0bc into master Jun 20, 2026
8 checks passed

AmitMY merged 1 commit into master from fix-flaky-optical-flow-test
