fix(docker): replace PyPI opencv wheel with ffmpeg-free build [security] #569
Merged
lawrence-u10d merged 1 commit into main on Apr 22, 2026
Mirrors Unstructured-IO/unstructured#4336. After `uv sync`, the Dockerfile now downloads a source-built `opencv-contrib-python-headless` wheel (`WITH_FFMPEG=OFF`) from the upstream release, hash-verifies it, and substitutes it for the PyPI opencv variant installed from `uv.lock`. This eliminates the 14 bundled ffmpeg 5.1.x CVEs shipped in PyPI opencv wheels. Bumps service version 0.1.3 -> 0.1.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
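For readers unfamiliar with the hash-verification step, here is a minimal Python sketch of the idea: compute the SHA-256 of the downloaded file and refuse to proceed unless it matches a pinned digest. The function names `sha256_of` and `verify_wheel` are hypothetical and do not come from the Dockerfile, which may well do this with a shell tool instead.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path: Path, expected_sha256: str) -> None:
    """Raise ValueError unless the file's digest equals the pinned value.

    Hypothetical helper: the pinned digest would be stored next to the
    download URL, one per architecture.
    """
    actual = sha256_of(path)
    expected = expected_sha256.strip().lower()
    if not hmac.compare_digest(actual, expected):
        raise ValueError(f"SHA-256 mismatch for {path.name}: got {actual}")
```

Failing loudly on a mismatch matters here because the wheel comes from a release download rather than from the locked index, so the pinned digest is the only integrity check in this sketch.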
qued approved these changes on Apr 22, 2026
lawrence-u10d added a commit that referenced this pull request on Apr 22, 2026
## Summary

Follow-up to #569 (v0.1.4). That PR replaced the PyPI `opencv-python` wheel with an ffmpeg-free build, but image scanners were still flagging the 14 ffmpeg CVEs against v0.1.4. Root cause is scanner scope, not a broken replacement.

## Root cause

`uv pip uninstall` only drops a package from `site-packages`. The extracted wheel archive stays in the uv cache. Inspecting the pushed v0.1.4 image:

- ✅ `cv2.__version__` reports `4.12.0` (our replacement wheel)
- ✅ `site-packages/cv2/` has no `.libs/` directory
- ❌ `/home/notebook-user/.cache/uv/archive-v0/<hash>/opencv_python.libs/` still contains the full extracted old wheel:
  - `libavcodec-*.so.59.37.100`
  - `libavformat-*.so.59.27.100`
  - `libavutil-*.so.57.28.100`
  - plus `libavfilter`, `libavdevice`, `libswscale`, `libswresample`

SO-version suffixes (avcodec 59.37 / avformat 59.27 / avutil 57.28) are ffmpeg 5.1.x, matching the CVE set the upstream PR called out. Scanners walk the whole filesystem and flag these even though nothing links against them at runtime. `UV_LINK_MODE=copy` (set globally in this Dockerfile) compounds it: the cache keeps its own copy independent of `site-packages`.

## Fix

Add `uv cache clean` to the end of the opencv replacement `RUN` to wipe the cache (including the old opencv wheel archive) from the final image layer. Single minimal change, scoped to the opencv-fix `RUN`, not a broader image-slimming pass.

Safe because `UV_LINK_MODE=copy` means the live venv copies files out of cache, so wiping the cache doesn't affect the installed packages.
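To illustrate why wiping the cache is safe under copy semantics, the following Python sketch models the situation with plain file copies. It is only an analogy: the directory layout and the helper names `install_by_copy` and `demo` are hypothetical, and it does not invoke uv or reproduce uv's internals.

```python
import shutil
import tempfile
from pathlib import Path


def install_by_copy(cache_file: Path, site_packages: Path) -> Path:
    """Copy a cached file into site-packages, as a copy-style link mode would."""
    site_packages.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(cache_file, site_packages / cache_file.name))


def demo() -> bool:
    """Return True if the installed copy survives deleting the cache."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout loosely modeled on the paths quoted above.
        cache_root = Path(root) / "cache"
        archive = cache_root / "archive-v0" / "abc123"
        archive.mkdir(parents=True)
        cached = archive / "module.py"
        cached.write_text("VALUE = 1\n")

        installed = install_by_copy(cached, Path(root) / "venv" / "site-packages")

        # Stand-in for `uv cache clean`: remove the whole cache directory.
        shutil.rmtree(cache_root)

        return installed.exists() and installed.read_text() == "VALUE = 1\n"
```

Because the installed file is an independent copy, removing the cache directory leaves it intact. A symlink-based install would not have this property, which is why the link mode is relevant to the argument.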
## False positives ignored (not fixed here)

Two other `libav*` filenames in the image that are **not** ffmpeg and don't trigger these CVEs:

- `/usr/lib/libreoffice/program/libavmedia{gst,lo}.so`: LibreOffice's "avmedia" framework shim
- `pillow.libs/libavif-*.so.16`: AV1 image codec

## Version / Changelog

- Bumps service version `0.1.4` → `0.1.5`
- `CHANGELOG.md` entry under `0.1.5` → Security
- No `uv lock` changes

## Test plan

- [ ] `make docker-build` succeeds on `amd64` and `arm64`
- [ ] In the rebuilt image, `find / -name "libavcodec*" -o -name "libavformat*" -o -name "libswscale*"` returns nothing under `/home/notebook-user/.cache/uv/` and nothing under `site-packages/cv2/.libs/`
- [ ] `cv2.__version__` still reports `4.12.0.88` and `import cv2; cv2.imdecode(...)` smoke check works
- [ ] Container scan of the rebuilt image no longer flags the 14 ffmpeg CVEs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Low risk: a single Docker build-step cleanup (`uv cache clean`) plus version/changelog bumps; main risk is unintended impact on Docker layer caching or build time, not runtime behavior.
>
> **Overview**
> Removes leftover ffmpeg `.so` files from the built image by adding `uv cache clean` after uninstalling/reinstalling OpenCV wheels in the Dockerfile, preventing scanners from flagging CVEs from cached wheel contents.
>
> Bumps the service version to `0.1.5` and adds a matching `CHANGELOG.md` security entry describing the cache purge.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit f73143d. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
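The filesystem check in the test plan can also be sketched in Python, which makes the false-positive distinction explicit. This is a hypothetical helper, not part of the change: `find_ffmpeg_libs` and the regular expression are assumptions, built only from the library names listed in this description, and matching is done on file names alone.

```python
import os
import re
from pathlib import Path

# ffmpeg component libraries named in this description. The trailing [-.]
# keeps look-alikes such as libavif-*.so and libavmedia*.so from matching.
FFMPEG_LIB = re.compile(
    r"^lib(avcodec|avformat|avutil|avfilter|avdevice|swscale|swresample)[-.]"
)


def find_ffmpeg_libs(root: Path) -> list[Path]:
    """Walk `root` and return files whose names look like ffmpeg libraries."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if FFMPEG_LIB.match(name):
                hits.append(Path(dirpath) / name)
    return sorted(hits)
```

Run against the image's root filesystem, an empty result would correspond to the `find` check passing, while the LibreOffice and Pillow files above would be skipped by construction. A real scanner inspects file contents and package metadata as well, so this is only a quick pre-check.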
## Summary

Mirrors Unstructured-IO/unstructured#4336 in this repo so the `quay.io/unstructured-io/unstructured-api` image no longer ships the 14 ffmpeg 5.1.x CVEs bundled in PyPI `opencv-python` wheels.

After `uv sync`, the Dockerfile now:

- Downloads a source-built `opencv-contrib-python-headless` wheel (built with `WITH_FFMPEG=OFF` + `ENABLE_CONTRIB=1` + `ENABLE_HEADLESS=1`) from the upstream `Unstructured-IO/unstructured` GitHub release (`opencv-4.12.0.88`)
- Verifies the wheel's SHA-256 hash; the wheel is built by the upstream `build-opencv-wheels.yml` workflow
- Uninstalls the PyPI opencv variant and installs the verified wheel with `--no-deps`

The contrib-headless variant is a strict superset of the `cv2` API exposed by `opencv-python`, `opencv-python-headless`, and `opencv-contrib-python`, so a single wheel transparently replaces whichever variant is present.

## One deviation from upstream

Upstream uninstalls all four opencv variants in a single `uv pip uninstall …` call because their image pulls all four transitively (via `unstructured-paddleocr`). Our `uv.lock` currently only resolves `opencv-python`, so a single combined uninstall would fail on the three that aren't installed. Replaced with a per-package loop using `|| true`: same end state, robust if transitive deps change.

## Version / Changelog

- Bumps service version `0.1.3` → `0.1.4`
- `CHANGELOG.md` entry under `0.1.4` → Security
- No `uv lock` changes needed; the lockfile still resolves `opencv-python` 4.13.0.92, and we overlay the 4.12.0.88 contrib-headless wheel only at image build time (upstream 4.13.0.92 has no sdist on PyPI, which is why the build-from-source workflow is pinned to 4.12.0.88).

## Test plan

- `make docker-build` succeeds on `amd64` and `arm64`; the opencv replacement step resolves the architecture-specific wheel and the SHA-256 check passes
- `docker run … python -c "import cv2; print(cv2.__version__)"` prints `4.12.0.88` inside the built image
- `make docker-test` passes against the rebuilt image

🤖 Generated with Claude Code
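The per-package uninstall described under "One deviation from upstream" can be sketched as follows. This is a Python rendering of the shell pattern (one uninstall per package, failures ignored), not the Dockerfile's actual code. The names `OPENCV_VARIANTS`, `uninstall_each`, and the injectable `run` parameter are hypothetical; only the `uv pip uninstall` command itself is taken from the description.

```python
import subprocess
from typing import Callable, Iterable, Sequence

# The four opencv distributions discussed above.
OPENCV_VARIANTS = (
    "opencv-python",
    "opencv-python-headless",
    "opencv-contrib-python",
    "opencv-contrib-python-headless",
)


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def uninstall_each(
    packages: Iterable[str],
    run: Callable[[Sequence[str]], int] = _run,
) -> dict[str, bool]:
    """Attempt one uninstall per package, tolerating individual failures.

    Mirrors the `cmd || true` idiom: a non-zero exit for a package that is
    not installed is recorded but never aborts the loop.
    """
    results = {}
    for pkg in packages:
        results[pkg] = run(["uv", "pip", "uninstall", pkg]) == 0
    return results
```

Splitting the call per package means a missing variant cannot mask the removal of one that is present, and the returned mapping makes it easy to log which variants were actually removed.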
> [!NOTE]
> **Medium Risk**
> Medium risk because it changes a core binary dependency (`opencv`) at image build time via an external wheel download and forced uninstall/reinstall, which could impact image build reliability or runtime CV2 behavior across architectures.
>
> **Overview**
> Updates the Docker build to remove vulnerable ffmpeg-bundled PyPI OpenCV wheels by downloading an arch-specific, SHA-256-verified `opencv-contrib-python-headless` wheel built with `WITH_FFMPEG=OFF`, uninstalling any installed OpenCV variants, and reinstalling the verified wheel.
>
> Bumps the service version to `0.1.4` and adds a `CHANGELOG.md` security entry documenting the OpenCV/ffmpeg CVE mitigation.
>
> Reviewed by Cursor Bugbot for commit 7e23afc. Bugbot is set up for automated code reviews on this repo. Configure here.