Skip to content

Add processed/unprocessed filter to captures list view#1326

Open
mihow wants to merge 14 commits into
mainfrom
worktree-capture-filters
Open

Add processed/unprocessed filter to captures list view#1326
mihow wants to merge 14 commits into
mainfrom
worktree-capture-filters

Conversation

@mihow
Copy link
Copy Markdown
Collaborator

@mihow mihow commented May 29, 2026

Adds a Processing status filter and a sortable Last processed column to the Captures list. Anna is reviewing.

Processing status filter

I wanted a quick way to see which captures are still waiting on a detection run. While building it we landed on a distinction that matters, so the backend now exposes two separate things:

  • processed=true|false — was the capture run through a detection pipeline? This counts any Detection row, including the null markers we write for the "processed, but found nothing" case. The UI "Processing status" filter uses this param.
  • has_detections=true|false — does the capture have real detections (a bounding box)? Null markers are excluded.

The split is what you'd expect from the data: on the Vermont project, processed=true returns 10,924 captures but has_detections=true returns 9,790 — the 1,134-capture gap is exactly the captures that were processed and found nothing. This mirrors how the capture set list already separates its processed count from its detections count (NULL_DETECTIONS_FILTER in models.py).

The UI control is a small three-state dropdown (Processed / Not processed / cleared = all), modeled on the verification-status filter. I didn't reuse the generic boolean filter because its "No" option clears the filter instead of filtering to false, which would make "not processed" impossible to pick.

Processing status filter applied on the Captures list

Last processed column

A sortable column on the Captures list showing the most recent detection created_at for each capture — i.e. when that capture was last run through a detection pipeline. Captures that were never processed show blank and sort last. It's exposed as a last_processed annotation and orderable via ?ordering=last_processed.

It's a correlated subquery (LIMIT 1) rather than a join-plus-Max, which keeps the row count stable for pagination. To keep it cheap without denormalizing a timestamp onto the capture, I added a composite index on Detection(source_image, -created_at) so the per-capture lookup is an index-only scan. The annotation only runs for the rows actually returned, so a normal page load is fast — an unsorted page over the full 929k-capture Vermont project is ~15ms, and a single session of ~4,000 captures sorts by last_processed in ~30ms.

Performance caveats (worth a look before merge)

The Captures list defaults to project scope, so both of these are reachable on a very large project (Vermont, ~929k captures). Both are about the pagination COUNT / sort — not the page of rows. The actual 25-row page comes back in ~3.5ms in every case (index walk on (project, -timestamp) + early LIMIT); all the cost below is the extra work LimitOffsetPagination does alongside it.

  1. processed=false / has_detections=false COUNT. The slow part is the SELECT COUNT(*) the paginator runs, which has no LIMIT to prune. The cost grows with how much of the project the filter returns: project-only COUNT ~1.25s, =true ~4.8s (the EXISTS semi-join short-circuits on the small matching set of 10,924), =false ~12.8s. The negative case is worst because a NOT EXISTS anti-join can't short-circuit — to prove a capture has no detection you must check all of them — so the planner full-seq-scans the whole project's captures. This is pre-existing behaviour: the filter logic already existed as has_detections; this PR mostly surfaces it in the UI and splits the semantics.

  2. Sorting by last_processed on a full project. Ordering all 929k captures by the annotation is slow (~14s warm, and can hit the statement timeout cold), because Postgres has to evaluate the subquery for every row before the top-N sort — the LIMIT 25 can't prune a computed sort key. Worth knowing this is the same behaviour the existing annotation-sort columns already have: sorting the same project by occurrences_count (~16s) or detections_count (~15s) costs about the same, so last_processed isn't a new cliff — and the new index keeps its per-row detection lookup an index-only scan (~1µs each), slightly cheaper than those count aggregations. A default page load (timestamp order) is unaffected; only an explicit sort-by-this-column on a huge project is slow.

Planned general fix: #1328. Caching or denormalizing a column per filter doesn't scale (many more filters are coming), so #1328 proposes an estimated-count paginator that reads Postgres's planner row estimate (measured ~3% accurate on the large sets that actually hurt, <15ms, no execution) with an exact-count fallback below a threshold. That makes caveat #1 cheap for every current and future filter. For the negative processed case specifically there's also a targeted subtraction trick (total − processed-true via grouping detections by source image, ~3s vs 12.8s), noted in #1328. Caveat #2 (the sort) needs an indexable precomputed column to be fast and is left as-is here, consistent with the existing count-column sorts. The ProjectPagination.get_count annotation-strip trick won't help caveat #1 (the cost is the anti-join scan, not annotations), and this endpoint's COUNT already drops the unused subquery annotation, so the column adds no COUNT cost.

Note on scope

This PR carries two related-but-separable pieces: the captures filter and the captures column. They're linked (both fell out of the same "what counts as processed" question), but if you'd rather review them apart I can split the column into its own PR.

Changes

  • ami/main/api/views.py — new processed= filter (reuses the existing with_was_processed() annotation); has_detections= scoped to real detections; last_processed annotation + ordering on the captures list (and detail).
  • ami/main/api/serializers.pylast_processed on the source-image list serializer.
  • ami/main/models.py + migration 0088 — composite index on Detection(source_image, -created_at) for the subquery.
  • ami/main/tests.py — the processed-vs-has_detections split (including a null-marker-only capture) and the captures last_processed field/ordering (processed capture has a timestamp, untouched one is null).
  • ui/... — the ProcessingStatusFilter dropdown wired to processed, and the Last processed column on the Captures list.

Testing

Backend tests pass (3 focused cases, plus the source-image query-count regression test stays green — the always-on annotation adds no per-row queries). tsc --noEmit and eslint are clean. I validated against the running stack on the Vermont project: processed 10,924 / not-processed 918,010; has_detections 9,790 / 919,144; the last_processed subquery uses the new index (index-only scan) and scoped sorts return real timestamps.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Processing-status filter for captures (Processed / Not processed / All) and a sortable "Last processed" column on captures and deployments.
  • Tests

    • API tests for captures filtering (processed vs has_detections) and ordering/annotation of last_processed.
    • Tests validating deployment last-processed annotation behavior.
  • Documentation

    • Planning and reference docs describing filter design, UI wiring, and count/ pagination strategies.
  • Chores

    • Database index added to support last-processed lookups.

Review Change Stack

mihow and others added 7 commits May 28, 2026 21:00
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 29, 2026 04:17
@netlify
Copy link
Copy Markdown

netlify Bot commented May 29, 2026

Deploy Preview for antenna-ssec canceled.

Name Link
🔨 Latest commit 7c4d1da
🔍 Latest deploy log https://app.netlify.com/projects/antenna-ssec/deploys/6a19fc50bb43d70007bfc463

@netlify
Copy link
Copy Markdown

netlify Bot commented May 29, 2026

Deploy Preview for antenna-preview canceled.

Name Link
🔨 Latest commit 7c4d1da
🔍 Latest deploy log https://app.netlify.com/projects/antenna-preview/deploys/6a19fc5051b7430007a03ab6

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 29, 2026

📝 Walkthrough

Walkthrough

This PR implements a processing-status filter for the captures list and a read-only last_processed timestamp for sorting and display, refines has_detections semantics to exclude null-marker detections, annotates captures/deployments with last_processed via a correlated subquery, adds a DB index and migration, wires a new frontend filter component and column, and adds backend regression tests and docs.

Changes

Captures Processing Status Filter and Last-Processed Annotation

Layer / File(s) Summary
Planning and design documents
docs/claude/planning/2026-05-28-captures-processed-filter-design.md, docs/claude/planning/2026-05-28-captures-processed-filter-plan.md
Design specification and implementation plan for the processing-status filter and frontend/backend tasks, including testing and verification steps.
Backend: capture filtering and last_processed annotation
ami/main/api/views.py
Adds filter_by_processed (uses with_was_processed() and processed param), tightens filter_by_has_detections to exclude null-marker detections via an Exists subquery, introduces annotate_last_processed (correlated Subquery over Detection.created_at), and exposes last_processed in ordering fields.
Backend: serializer field and database migration
ami/main/api/serializers.py, ami/main/migrations/0088_detection_det_srcimg_created_idx.py, ami/main/models.py
Adds read-only last_processed DateTimeField to SourceImageListSerializer and a model/index migration adding det_srcimg_created_idx on (source_image, -created_at).
Backend regression tests: processed filter and last_processed ordering
ami/main/tests.py
Adds TestCapturesProcessedFilter (processed vs has_detections behavior including null-marker detections) and TestCapturesLastProcessed (annotation and ordering). Also a small import reformat.
Frontend: language localization and Capture model
ui/src/utils/language.ts, ui/src/data-services/models/capture.ts
Adds STRING.FIELD_LABEL_LAST_PROCESSED, STRING.PROCESSED, STRING.NOT_PROCESSED with English strings; adds Capture.lastProcessed getter converting backend last_processed to `Date
Frontend: ProcessingStatusFilter component
ui/src/components/filtering/filters/processing-status-filter.tsx
New React Select component mapping string↔boolean values and emitting onAdd for processed/not-processed selection.
Frontend: filter registration and AVAILABLE_FILTERS
ui/src/components/filtering/filter-control.tsx, ui/src/utils/useFilters.ts
Registers ProcessingStatusFilter in the filter ComponentMap and inserts the "Processing status" entry into AVAILABLE_FILTERS with tooltip.
Frontend: captures page filter and last-processed column
ui/src/pages/captures/captures.tsx, ui/src/pages/captures/capture-columns.tsx
Enables last-processed column in default columns, renders FilterControl for processed on the captures page, and adds a last-processed table column showing item.lastProcessed via DateTableCell (sortable by last_processed).

Sequence Diagram

sequenceDiagram
  participant User
  participant UI as ProcessingStatusFilter
  participant Captures Page
  participant API as /api/v2/captures/
  participant ViewSet as SourceImageViewSet
  participant DB as Detection rows

  User->>UI: select "Processed" or "Not processed"
  UI->>Captures Page: onAdd(value) → URL search param processed=true/false
  Captures Page->>API: GET /api/v2/captures/?processed=true/false
  API->>ViewSet: list(queryset)
  ViewSet->>ViewSet: annotate_last_processed(queryset)
  ViewSet->>DB: Subquery: MAX(Detection.created_at) per capture
  ViewSet->>ViewSet: filter_by_processed(queryset)
  ViewSet->>DB: with_was_processed() annotation + filter on was_processed
  DB-->>ViewSet: filtered captures with last_processed timestamps
  ViewSet-->>API: JSON: {id, name, last_processed, ...}
  API-->>Captures Page: render rows with timestamps
  Captures Page->>UI: display in table, sorted by last_processed
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • RolnickLab/antenna#1093: Introduces NULL_DETECTIONS_FILTER/with_was_processed()/was_processed semantics that this PR builds upon.
  • RolnickLab/antenna#1300: Modifies SourceImageViewSet.get_queryset and relates to capture list queryset shaping and ordering.

Suggested labels

data-processing

Suggested reviewers

  • annavik

Poem

🐰 A filter springs forth, not quite gray,
Processed or pending, you choose the way,
Last-touched timestamps in rows align,
Order by when the detections defined,
Captures now sorted in SQL's design!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding a processed/unprocessed filter to the captures list view.
Description check ✅ Passed The PR description is comprehensive and well-structured, covering summary, list of changes, testing, and deployment notes, though it lacks a formal checklist section and could improve the related issues reference format.
Linked Issues check ✅ Passed The PR fully implements the captured processing-status filter and last-processed column with backend and frontend changes, supporting #1328's planned estimated-count paginator by documenting performance caveats and deferring the general solution.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the processing-status filter and last-processed column features. Documentation files (planning, reference, design) and the composite database index are all justified by the core functionality.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch worktree-capture-filters
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch worktree-capture-filters

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a Captures list “Processing status” filter that maps UI selections to the existing has_detections=true|false backend query parameter, plus backend regression coverage and planning docs.

Changes:

  • Adds ProcessingStatusFilter and registers/renders it in the Captures filter panel.
  • Adds translated “Processed” / “Not processed” labels and filter registry metadata.
  • Adds backend tests for the captures has_detections list filter.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
ami/main/tests.py Adds regression tests for captures filtered by detection presence.
ui/src/components/filtering/filters/processing-status-filter.tsx Adds the select component for processed/not processed filtering.
ui/src/components/filtering/filter-control.tsx Registers the new filter component for has_detections.
ui/src/pages/captures/captures.tsx Renders the processing-status filter on the Captures page.
ui/src/utils/useFilters.ts Adds the filter metadata to the available filter registry.
ui/src/utils/language.ts Adds user-facing processing-status labels.
docs/claude/planning/2026-05-28-captures-processed-filter-plan.md Adds implementation plan documentation.
docs/claude/planning/2026-05-28-captures-processed-filter-design.md Adds design notes and scope for the filter.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ami/main/tests.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
ami/main/tests.py (1)

1406-1430: ⚡ Quick win

Strengthen filter assertions to verify IDs and null-marker behavior.

At Line 1425 and Line 1430, asserting only count can miss wrong memberships. Also, the suite currently doesn’t explicitly prove the “null detection marker counts as processed” contract from the class docstring (Line 1402-1404).

Suggested test hardening
 class TestCapturesProcessedFilter(APITestCase):
@@
     def setUp(self) -> None:
         self.project, self.deployment = setup_test_project(reuse=False)
         self.captures = create_captures(self.deployment, num_nights=1, images_per_night=4)
-        # Mark the first two captures as processed by giving them a detection.
-        for capture in self.captures[:2]:
-            create_detections(capture, bboxes=[(0.1, 0.1, 0.2, 0.2)])
+        # Mark first capture as processed with a bbox detection.
+        create_detections(self.captures[0], bboxes=[(10, 10, 20, 20)])
+        # Mark second capture as processed with a null detection marker.
+        Detection.objects.create(
+            source_image=self.captures[1],
+            timestamp=self.captures[1].timestamp,
+            bbox=None,
+        )
+        self.processed_ids = {self.captures[0].pk, self.captures[1].pk}
+        self.unprocessed_ids = {self.captures[2].pk, self.captures[3].pk}
@@
     def test_has_detections_true_returns_only_processed(self):
         response = self.client.get(f"{self.list_url}&has_detections=true")
         self.assertEqual(response.status_code, status.HTTP_200_OK)
-        self.assertEqual(response.json()["count"], 2)
+        ids = {row["id"] for row in response.json()["results"]}
+        self.assertEqual(ids, self.processed_ids)
@@
     def test_has_detections_false_returns_only_unprocessed(self):
         response = self.client.get(f"{self.list_url}&has_detections=false")
         self.assertEqual(response.status_code, status.HTTP_200_OK)
-        self.assertEqual(response.json()["count"], 2)
+        ids = {row["id"] for row in response.json()["results"]}
+        self.assertEqual(ids, self.unprocessed_ids)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/tests.py` around lines 1406 - 1430, Update the two tests
(test_has_detections_true_returns_only_processed and
test_has_detections_false_returns_only_unprocessed) to assert membership by IDs,
not just count: after setUp, capture the expected processed_ids (the two
captures you called create_detections on) and unprocessed_ids (the remaining
captures from create_captures), then assert the response JSON results contain
exactly those IDs. Also add a case in setUp (or a new small helper) that creates
a "null marker" detection via create_detections on one of the captures (e.g.,
with a special flag or null metadata used by your app) and confirm that the
capture with the null marker is included in processed_ids so the tests validate
the "null detection marker counts as processed" contract.
docs/claude/planning/2026-05-28-captures-processed-filter-plan.md (1)

278-278: 💤 Low value

Fix markdown list continuation indentation.

The continuation paragraph under Step 3 should be indented to align with the list item content for proper markdown formatting.

📝 Suggested fix
 - [ ] **Step 3: Manual verification against the running stack**
 
-Start the stack (`docker compose up -d` from repo root; for worktree code-only changes use the bind-mount Option A in CLAUDE.md if testing against the main stack). Then in the UI at `http://localhost:4000`, open a project's Captures list and:
+  Start the stack (`docker compose up -d` from repo root; for worktree code-only changes use the bind-mount Option A in CLAUDE.md if testing against the main stack). Then in the UI at `http://localhost:4000`, open a project's Captures list and:
   - Select **Processed** → URL gains `?has_detections=true`, result count drops to processed captures only.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/claude/planning/2026-05-28-captures-processed-filter-plan.md` at line
278, The continuation paragraph after Step 3 ("Start the stack (`docker compose
up -d` from repo root; for worktree code-only changes use the bind-mount Option
A in CLAUDE.md if testing against the main stack). Then in the UI at
`http://localhost:4000`, open a project's Captures list and:") must be indented
to match the list item content so it renders as part of the same bullet; update
the markdown by adding the same indentation (spaces) before the continuation
paragraph so it aligns under the Step 3 list item text.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ui/src/components/filtering/filters/processing-status-filter.tsx`:
- Around line 6-7: The component ProcessingStatusFilter currently destructures
props as `{ value: string, onAdd }` which renames the prop to `string`
(shadowing the global type) and then creates another `value` variable; change
the prop destructuring to `{ value, onAdd }: FilterProps` and update the
subsequent conversion to use that prop (e.g., `const isProcessing =
stringToBoolean(value)` or similar) so you no longer rename the prop or shadow
built-in identifiers; update references inside ProcessingStatusFilter
accordingly.

---

Nitpick comments:
In `@ami/main/tests.py`:
- Around line 1406-1430: Update the two tests
(test_has_detections_true_returns_only_processed and
test_has_detections_false_returns_only_unprocessed) to assert membership by IDs,
not just count: after setUp, capture the expected processed_ids (the two
captures you called create_detections on) and unprocessed_ids (the remaining
captures from create_captures), then assert the response JSON results contain
exactly those IDs. Also add a case in setUp (or a new small helper) that creates
a "null marker" detection via create_detections on one of the captures (e.g.,
with a special flag or null metadata used by your app) and confirm that the
capture with the null marker is included in processed_ids so the tests validate
the "null detection marker counts as processed" contract.

In `@docs/claude/planning/2026-05-28-captures-processed-filter-plan.md`:
- Line 278: The continuation paragraph after Step 3 ("Start the stack (`docker
compose up -d` from repo root; for worktree code-only changes use the bind-mount
Option A in CLAUDE.md if testing against the main stack). Then in the UI at
`http://localhost:4000`, open a project's Captures list and:") must be indented
to match the list item content so it renders as part of the same bullet; update
the markdown by adding the same indentation (spaces) before the continuation
paragraph so it aligns under the Step 3 list item text.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8437f2c0-9c2c-4861-953a-7d20ec8f857d

📥 Commits

Reviewing files that changed from the base of the PR and between c9b417d and 8e71fd5.

📒 Files selected for processing (8)
  • ami/main/tests.py
  • docs/claude/planning/2026-05-28-captures-processed-filter-design.md
  • docs/claude/planning/2026-05-28-captures-processed-filter-plan.md
  • ui/src/components/filtering/filter-control.tsx
  • ui/src/components/filtering/filters/processing-status-filter.tsx
  • ui/src/pages/captures/captures.tsx
  • ui/src/utils/language.ts
  • ui/src/utils/useFilters.ts

Comment thread ui/src/components/filtering/filters/processing-status-filter.tsx Outdated
@mihow mihow requested a review from annavik May 29, 2026 04:43
… column

Captures filter now uses a new ?processed= param (any Detection row, including
the null markers that mean 'processed, found nothing'), and the Processing status
UI filter points at it. ?has_detections= is scoped to real detections only
(NULL_DETECTIONS_FILTER excluded), matching how the capture set list separates
its processed count from its detections count.

Deployments list gains a sortable 'Last processed' column: the latest detection
created_at across the deployment's captures, via a correlated subquery annotation
(stable row count for pagination), exposed as last_processed and orderable.

Tests cover the processed vs has_detections split including a null-marker-only
capture, plus the deployment last_processed field and ordering.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow mihow force-pushed the worktree-capture-filters branch from 3cc989c to 658d58c Compare May 29, 2026 04:58
mihow and others added 2 commits May 28, 2026 22:00
Avoid renaming the value prop to `string` (shadows the global type); use a
`booleanValue` local instead. Addresses CodeRabbit review comment.

Co-Authored-By: Claude <noreply@anthropic.com>
Keeps the 'processed' definition in one place (the existing SourceImageQuerySet
annotation) instead of duplicating the Exists(Detection) subquery in the viewset.

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
ami/main/tests.py (1)

1469-1475: ⚡ Quick win

This test doesn’t actually verify ordering behavior yet.

With only one deployment in the fixture, ordering=-last_processed can’t fail meaningfully. Add a second deployment with a different detection timestamp and assert returned order to guard the ordering contract.

💡 Suggested assertion pattern
     def test_last_processed_annotated_and_orderable(self):
-        # One request exercises the annotation, the serializer field, and the
-        # ordering registration together.
+        # Create another deployment with an older processed timestamp to verify ordering.
+        other_deployment = Deployment.objects.create(name="older", project=self.project)
+        older_capture = create_captures(other_deployment, num_nights=1, images_per_night=1)[0]
+        old_detection = create_detections(older_capture, bboxes=[(10, 10, 20, 20)])[0]
+        Detection.objects.filter(pk=old_detection.pk).update(created_at=datetime.datetime(2020, 1, 1))
+
         response = self.client.get(f"{self.url}&ordering=-last_processed")
         self.assertEqual(response.status_code, status.HTTP_200_OK)
-        self.assertIsNotNone(self._deployment_row(response.json())["last_processed"])
+        data = response.json()
+        self.assertIsNotNone(self._deployment_row(data)["last_processed"])
+        self.assertEqual(data["results"][0]["id"], self.deployment.pk)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/tests.py` around lines 1469 - 1475, The test method
test_last_processed_annotated_and_orderable currently uses a single deployment
so ordering cannot be validated; create a second deployment record with a
different detection timestamp (earlier or later than the existing one) before
calling self.client.get(f"{self.url}&ordering=-last_processed") and then assert
the returned list ordering via _deployment_row(response.json()) such that the
first returned deployment has the expected last_processed value (i.e., the later
timestamp when using "-last_processed") to verify ordering; ensure the two
timestamps are distinct and cleanup uses existing fixtures/factories used
elsewhere in the test class.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ami/main/tests.py`:
- Line 1420: The test is passing normalized [0–1] bbox values to
create_detections but this codebase expects absolute pixel coordinates
(Detection.bbox/BoundingBox in pixel space); update the calls to
create_detections (e.g., the invocation with bboxes=[(0.1, 0.1, 0.2, 0.2)] and
the similar one at the other location) to use integer pixel boxes (x1,y1,x2,y2)
that match the capture fixture dimensions (use actual pixel coordinates or
compute pixels from the fixture size) so downstream validation uses canonical
pixel-space boxes.

---

Nitpick comments:
In `@ami/main/tests.py`:
- Around line 1469-1475: The test method
test_last_processed_annotated_and_orderable currently uses a single deployment
so ordering cannot be validated; create a second deployment record with a
different detection timestamp (earlier or later than the existing one) before
calling self.client.get(f"{self.url}&ordering=-last_processed") and then assert
the returned list ordering via _deployment_row(response.json()) such that the
first returned deployment has the expected last_processed value (i.e., the later
timestamp when using "-last_processed") to verify ordering; ensure the two
timestamps are distinct and cleanup uses existing fixtures/factories used
elsewhere in the test class.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 93499240-3544-4236-9877-3dc9154105e0

📥 Commits

Reviewing files that changed from the base of the PR and between 8e71fd5 and 097e5d2.

📒 Files selected for processing (11)
  • ami/main/api/serializers.py
  • ami/main/api/views.py
  • ami/main/tests.py
  • ui/src/components/filtering/filter-control.tsx
  • ui/src/components/filtering/filters/processing-status-filter.tsx
  • ui/src/data-services/models/deployment.ts
  • ui/src/pages/captures/captures.tsx
  • ui/src/pages/deployments/deployment-columns.tsx
  • ui/src/pages/deployments/deployments.tsx
  • ui/src/utils/language.ts
  • ui/src/utils/useFilters.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • ui/src/utils/useFilters.ts
  • ui/src/components/filtering/filters/processing-status-filter.tsx

Comment thread ami/main/tests.py
self.captures = create_captures(self.deployment, num_nights=1, images_per_night=4)
# Two captures get a real detection (bounding box present).
for capture in self.captures[:2]:
create_detections(capture, bboxes=[(0.1, 0.1, 0.2, 0.2)])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use pixel-space bbox fixtures in these new detection rows.

These new tests use normalized-looking bbox values, but this repo’s detection bbox convention is pixel coordinates. Please switch to integer pixel boxes to match canonical behavior and avoid brittle assumptions in future validation paths.

💡 Proposed test fixture update
-            create_detections(capture, bboxes=[(0.1, 0.1, 0.2, 0.2)])
+            create_detections(capture, bboxes=[(10, 10, 20, 20)])
...
-        create_detections(self.captures[0], bboxes=[(0.1, 0.1, 0.2, 0.2)])
+        create_detections(self.captures[0], bboxes=[(10, 10, 20, 20)])

Based on learnings: Detection.bbox/BoundingBox values in this repo use absolute pixel coordinate space (not normalized [0–1] floats).

Also applies to: 1460-1460

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/tests.py` at line 1420, The test is passing normalized [0–1] bbox
values to create_detections but this codebase expects absolute pixel coordinates
(Detection.bbox/BoundingBox in pixel space); update the calls to
create_detections (e.g., the invocation with bboxes=[(0.1, 0.1, 0.2, 0.2)] and
the similar one at the other location) to use integer pixel boxes (x1,y1,x2,y2)
that match the capture fixture dimensions (use actual pixel coordinates or
compute pixels from the fixture size) so downstream validation uses canonical
pixel-space boxes.

The previous commit added a sortable "Last processed" column to the
deployments (Stations) list. Move it to the captures list instead, where
it shows the most recent detection created_at for each capture — i.e.
when that capture was last run through a detection pipeline. Exposed as
the `last_processed` annotation on the captures endpoint and orderable
via ?ordering=last_processed. Reverts the deployments-list column.

A correlated subquery (LIMIT 1) keeps the pagination row count stable.
A new composite index on Detection(source_image, -created_at) makes the
per-capture lookup an index-only scan, so the column stays fast for the
scoped event/deployment/collection views that the UI actually uses,
without denormalizing a timestamp field onto SourceImage.

Co-Authored-By: Claude <noreply@anthropic.com>
mihow and others added 2 commits May 29, 2026 10:15
The captures list COUNT for ?processed= and ?has_detections= ran an
EXISTS / NOT EXISTS subquery with no LIMIT to prune, so on a large
project (~900k captures) NOT EXISTS became a full anti-join over the
wide source-image table (~12s for processed=false, ~11s for
has_detections=false).

SourceImagePagination now counts the captures *with* detections off the
Detection table (COUNT(DISTINCT source_image_id), a small index scan)
and subtracts from the project total:

    processed=true   -> processed_count
    processed=false  -> total - processed_count

Both counts are exact and cost scales with the number of detection rows
rather than the processed/unprocessed ratio, so it is fast in both
directions (measured: false ~1.7s vs 12.8s, true ~1.5s vs 4.8s on a
929k-capture project). Any other filter combination falls back to the
default count.

Co-Authored-By: Claude <noreply@anthropic.com>
Deploy-time benchmarking on the Serbia dev box (hardware comparable to
production) showed the slow COUNT that motivated the subtraction paginator
does not reproduce there. The 12.8s NOT EXISTS anti-join for processed=false
on the 929k-capture project was a local-environment artifact (cold cache,
small RAM); Serbia runs the same anti-join in ~1.4s. The index added for the
last_processed sort is not used by the count plan either way, and the
subtraction's detection-side IN-subquery had its own cold-plan spike (7.71s
on a smaller project's first disk touch).

Revert to the default DRF count for these filters. The general fix for slow
counts on large filtered lists is the estimated-count paginator (#1328), not
a per-filter bespoke count.

The subtraction strategy, implementation notes, and full benchmarks are kept
for reference in docs/claude/reference/captures-processed-count-strategies.md.

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/claude/reference/captures-processed-count-strategies.md`:
- Around line 37-44: The two fenced code blocks in the capture/processed_count
docs are missing language identifiers (triggering MD040); update both fences
shown (the block starting with "total = COUNT(*)..." and the block starting with
"Finalize Aggregate (actual time=...)" ) to include a language tag such as
```text (or ```sql for the SQL-looking block) so the linter stops flagging
MD040; ensure you change both occurrences (the first block around the
processed_count explanation and the second block showing the execution plan).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 10bbf985-608d-41b9-b509-d4c22076782d

📥 Commits

Reviewing files that changed from the base of the PR and between dae73c5 and 7c4d1da.

📒 Files selected for processing (1)
  • docs/claude/reference/captures-processed-count-strategies.md

Comment on lines +37 to +44
```
total = COUNT(*) over the project/event/deployment/collection-scoped captures
processed_count = COUNT(DISTINCT source_image_id) off main_detection, scoped to the same captures
(has_detections: also .exclude(NULL_DETECTIONS_FILTER))

processed=true -> processed_count
processed=false -> total - processed_count
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language identifiers to fenced code blocks.

Both fences are missing language tags, which triggers MD040 and keeps docs lint noisy.

Suggested fix
-```
+```text
 total           = COUNT(*) over the project/event/deployment/collection-scoped captures
 processed_count  = COUNT(DISTINCT source_image_id) off main_detection, scoped to the same captures
                    (has_detections: also .exclude(NULL_DETECTIONS_FILTER))
 ...
 processed=true   -> processed_count
 processed=false  -> total - processed_count
-   ```
+   ```text
    Finalize Aggregate (actual time=1541..1567)
      -> Parallel Hash Right Anti Join (rows=303665)
           -> Parallel Seq Scan on main_detection (rows=455239)
           -> Parallel Hash
                -> Parallel Seq Scan on main_sourceimage (Filter: project_id = 18)
    Execution Time: 1609 ms
    ```

Also applies to: 95-102

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 37-37: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/claude/reference/captures-processed-count-strategies.md` around lines 37
- 44, The two fenced code blocks in the capture/processed_count docs are missing
language identifiers (triggering MD040); update both fences shown (the block
starting with "total = COUNT(*)..." and the block starting with "Finalize
Aggregate (actual time=...)" ) to include a language tag such as ```text (or
```sql for the SQL-looking block) so the linter stops flagging MD040; ensure you
change both occurrences (the first block around the processed_count explanation
and the second block showing the execution plan).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Estimated-count paginator for fast pagination on large filtered lists

2 participants