Add processed/unprocessed filter to captures list view#1326
Conversation
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
✅ Deploy Preview for antenna-ssec canceled.
|
✅ Deploy Preview for antenna-preview canceled.
|
📝 WalkthroughWalkthroughThis PR implements a processing-status filter for the captures list and a read-only last_processed timestamp for sorting and display, refines has_detections semantics to exclude null-marker detections, annotates captures/deployments with last_processed via a correlated subquery, adds a DB index and migration, wires a new frontend filter component and column, and adds backend regression tests and docs. ChangesCaptures Processing Status Filter and Last-Processed Annotation
Sequence DiagramsequenceDiagram
participant User
participant UI as ProcessingStatusFilter
participant Captures Page
participant API as /api/v2/captures/
participant ViewSet as SourceImageViewSet
participant DB as Detection rows
User->>UI: select "Processed" or "Not processed"
UI->>Captures Page: onAdd(value) → URL search param processed=true/false
Captures Page->>API: GET /api/v2/captures/?processed=true/false
API->>ViewSet: list(queryset)
ViewSet->>ViewSet: annotate_last_processed(queryset)
ViewSet->>DB: Subquery: MAX(Detection.created_at) per capture
ViewSet->>ViewSet: filter_by_processed(queryset)
ViewSet->>DB: with_was_processed() annotation + filter on was_processed
DB-->>ViewSet: filtered captures with last_processed timestamps
ViewSet-->>API: JSON: {id, name, last_processed, ...}
API-->>Captures Page: render rows with timestamps
Captures Page->>UI: display in table, sorted by last_processed
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Adds a Captures list “Processing status” filter that maps UI selections to the existing has_detections=true|false backend query parameter, plus backend regression coverage and planning docs.
Changes:
- Adds
ProcessingStatusFilterand registers/renders it in the Captures filter panel. - Adds translated “Processed” / “Not processed” labels and filter registry metadata.
- Adds backend tests for the captures
has_detectionslist filter.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
ami/main/tests.py |
Adds regression tests for captures filtered by detection presence. |
ui/src/components/filtering/filters/processing-status-filter.tsx |
Adds the select component for processed/not processed filtering. |
ui/src/components/filtering/filter-control.tsx |
Registers the new filter component for has_detections. |
ui/src/pages/captures/captures.tsx |
Renders the processing-status filter on the Captures page. |
ui/src/utils/useFilters.ts |
Adds the filter metadata to the available filter registry. |
ui/src/utils/language.ts |
Adds user-facing processing-status labels. |
docs/claude/planning/2026-05-28-captures-processed-filter-plan.md |
Adds implementation plan documentation. |
docs/claude/planning/2026-05-28-captures-processed-filter-design.md |
Adds design notes and scope for the filter. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
ami/main/tests.py (1)
1406-1430: ⚡ Quick winStrengthen filter assertions to verify IDs and null-marker behavior.
At Line 1425 and Line 1430, asserting only
countcan miss wrong memberships. Also, the suite currently doesn’t explicitly prove the “null detection marker counts as processed” contract from the class docstring (Line 1402-1404).Suggested test hardening
class TestCapturesProcessedFilter(APITestCase): @@ def setUp(self) -> None: self.project, self.deployment = setup_test_project(reuse=False) self.captures = create_captures(self.deployment, num_nights=1, images_per_night=4) - # Mark the first two captures as processed by giving them a detection. - for capture in self.captures[:2]: - create_detections(capture, bboxes=[(0.1, 0.1, 0.2, 0.2)]) + # Mark first capture as processed with a bbox detection. + create_detections(self.captures[0], bboxes=[(10, 10, 20, 20)]) + # Mark second capture as processed with a null detection marker. + Detection.objects.create( + source_image=self.captures[1], + timestamp=self.captures[1].timestamp, + bbox=None, + ) + self.processed_ids = {self.captures[0].pk, self.captures[1].pk} + self.unprocessed_ids = {self.captures[2].pk, self.captures[3].pk} @@ def test_has_detections_true_returns_only_processed(self): response = self.client.get(f"{self.list_url}&has_detections=true") self.assertEqual(response.status_code, status.HTTP_200_OK) - self.assertEqual(response.json()["count"], 2) + ids = {row["id"] for row in response.json()["results"]} + self.assertEqual(ids, self.processed_ids) @@ def test_has_detections_false_returns_only_unprocessed(self): response = self.client.get(f"{self.list_url}&has_detections=false") self.assertEqual(response.status_code, status.HTTP_200_OK) - self.assertEqual(response.json()["count"], 2) + ids = {row["id"] for row in response.json()["results"]} + self.assertEqual(ids, self.unprocessed_ids)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ami/main/tests.py` around lines 1406 - 1430, Update the two tests (test_has_detections_true_returns_only_processed and test_has_detections_false_returns_only_unprocessed) to assert membership by IDs, not just count: after setUp, capture the expected processed_ids (the two captures you called create_detections on) and unprocessed_ids (the remaining captures from create_captures), then assert the response JSON results contain exactly those IDs. Also add a case in setUp (or a new small helper) that creates a "null marker" detection via create_detections on one of the captures (e.g., with a special flag or null metadata used by your app) and confirm that the capture with the null marker is included in processed_ids so the tests validate the "null detection marker counts as processed" contract.docs/claude/planning/2026-05-28-captures-processed-filter-plan.md (1)
278-278: 💤 Low valueFix markdown list continuation indentation.
The continuation paragraph under Step 3 should be indented to align with the list item content for proper markdown formatting.
📝 Suggested fix
- [ ] **Step 3: Manual verification against the running stack** -Start the stack (`docker compose up -d` from repo root; for worktree code-only changes use the bind-mount Option A in CLAUDE.md if testing against the main stack). Then in the UI at `http://localhost:4000`, open a project's Captures list and: + Start the stack (`docker compose up -d` from repo root; for worktree code-only changes use the bind-mount Option A in CLAUDE.md if testing against the main stack). Then in the UI at `http://localhost:4000`, open a project's Captures list and: - Select **Processed** → URL gains `?has_detections=true`, result count drops to processed captures only.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/claude/planning/2026-05-28-captures-processed-filter-plan.md` at line 278, The continuation paragraph after Step 3 ("Start the stack (`docker compose up -d` from repo root; for worktree code-only changes use the bind-mount Option A in CLAUDE.md if testing against the main stack). Then in the UI at `http://localhost:4000`, open a project's Captures list and:") must be indented to match the list item content so it renders as part of the same bullet; update the markdown by adding the same indentation (spaces) before the continuation paragraph so it aligns under the Step 3 list item text.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ui/src/components/filtering/filters/processing-status-filter.tsx`:
- Around line 6-7: The component ProcessingStatusFilter currently destructures
props as `{ value: string, onAdd }` which renames the prop to `string`
(shadowing the global type) and then creates another `value` variable; change
the prop destructuring to `{ value, onAdd }: FilterProps` and update the
subsequent conversion to use that prop (e.g., `const isProcessing =
stringToBoolean(value)` or similar) so you no longer rename the prop or shadow
built-in identifiers; update references inside ProcessingStatusFilter
accordingly.
---
Nitpick comments:
In `@ami/main/tests.py`:
- Around line 1406-1430: Update the two tests
(test_has_detections_true_returns_only_processed and
test_has_detections_false_returns_only_unprocessed) to assert membership by IDs,
not just count: after setUp, capture the expected processed_ids (the two
captures you called create_detections on) and unprocessed_ids (the remaining
captures from create_captures), then assert the response JSON results contain
exactly those IDs. Also add a case in setUp (or a new small helper) that creates
a "null marker" detection via create_detections on one of the captures (e.g.,
with a special flag or null metadata used by your app) and confirm that the
capture with the null marker is included in processed_ids so the tests validate
the "null detection marker counts as processed" contract.
In `@docs/claude/planning/2026-05-28-captures-processed-filter-plan.md`:
- Line 278: The continuation paragraph after Step 3 ("Start the stack (`docker
compose up -d` from repo root; for worktree code-only changes use the bind-mount
Option A in CLAUDE.md if testing against the main stack). Then in the UI at
`http://localhost:4000`, open a project's Captures list and:") must be indented
to match the list item content so it renders as part of the same bullet; update
the markdown by adding the same indentation (spaces) before the continuation
paragraph so it aligns under the Step 3 list item text.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8437f2c0-9c2c-4861-953a-7d20ec8f857d
📒 Files selected for processing (8)
ami/main/tests.pydocs/claude/planning/2026-05-28-captures-processed-filter-design.mddocs/claude/planning/2026-05-28-captures-processed-filter-plan.mdui/src/components/filtering/filter-control.tsxui/src/components/filtering/filters/processing-status-filter.tsxui/src/pages/captures/captures.tsxui/src/utils/language.tsui/src/utils/useFilters.ts
… column Captures filter now uses a new ?processed= param (any Detection row, including the null markers that mean 'processed, found nothing'), and the Processing status UI filter points at it. ?has_detections= is scoped to real detections only (NULL_DETECTIONS_FILTER excluded), matching how the capture set list separates its processed count from its detections count. Deployments list gains a sortable 'Last processed' column: the latest detection created_at across the deployment's captures, via a correlated subquery annotation (stable row count for pagination), exposed as last_processed and orderable. Tests cover the processed vs has_detections split including a null-marker-only capture, plus the deployment last_processed field and ordering. Co-Authored-By: Claude <noreply@anthropic.com>
3cc989c to
658d58c
Compare
Avoid renaming the value prop to `string` (shadows the global type); use a `booleanValue` local instead. Addresses CodeRabbit review comment. Co-Authored-By: Claude <noreply@anthropic.com>
Keeps the 'processed' definition in one place (the existing SourceImageQuerySet annotation) instead of duplicating the Exists(Detection) subquery in the viewset. Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ami/main/tests.py (1)
1469-1475: ⚡ Quick winThis test doesn’t actually verify ordering behavior yet.
With only one deployment in the fixture,
ordering=-last_processedcan’t fail meaningfully. Add a second deployment with a different detection timestamp and assert returned order to guard the ordering contract.💡 Suggested assertion pattern
def test_last_processed_annotated_and_orderable(self): - # One request exercises the annotation, the serializer field, and the - # ordering registration together. + # Create another deployment with an older processed timestamp to verify ordering. + other_deployment = Deployment.objects.create(name="older", project=self.project) + older_capture = create_captures(other_deployment, num_nights=1, images_per_night=1)[0] + old_detection = create_detections(older_capture, bboxes=[(10, 10, 20, 20)])[0] + Detection.objects.filter(pk=old_detection.pk).update(created_at=datetime.datetime(2020, 1, 1)) + response = self.client.get(f"{self.url}&ordering=-last_processed") self.assertEqual(response.status_code, status.HTTP_200_OK) - self.assertIsNotNone(self._deployment_row(response.json())["last_processed"]) + data = response.json() + self.assertIsNotNone(self._deployment_row(data)["last_processed"]) + self.assertEqual(data["results"][0]["id"], self.deployment.pk)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ami/main/tests.py` around lines 1469 - 1475, The test method test_last_processed_annotated_and_orderable currently uses a single deployment so ordering cannot be validated; create a second deployment record with a different detection timestamp (earlier or later than the existing one) before calling self.client.get(f"{self.url}&ordering=-last_processed") and then assert the returned list ordering via _deployment_row(response.json()) such that the first returned deployment has the expected last_processed value (i.e., the later timestamp when using "-last_processed") to verify ordering; ensure the two timestamps are distinct and cleanup uses existing fixtures/factories used elsewhere in the test class.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ami/main/tests.py`:
- Line 1420: The test is passing normalized [0–1] bbox values to
create_detections but this codebase expects absolute pixel coordinates
(Detection.bbox/BoundingBox in pixel space); update the calls to
create_detections (e.g., the invocation with bboxes=[(0.1, 0.1, 0.2, 0.2)] and
the similar one at the other location) to use integer pixel boxes (x1,y1,x2,y2)
that match the capture fixture dimensions (use actual pixel coordinates or
compute pixels from the fixture size) so downstream validation uses canonical
pixel-space boxes.
---
Nitpick comments:
In `@ami/main/tests.py`:
- Around line 1469-1475: The test method
test_last_processed_annotated_and_orderable currently uses a single deployment
so ordering cannot be validated; create a second deployment record with a
different detection timestamp (earlier or later than the existing one) before
calling self.client.get(f"{self.url}&ordering=-last_processed") and then assert
the returned list ordering via _deployment_row(response.json()) such that the
first returned deployment has the expected last_processed value (i.e., the later
timestamp when using "-last_processed") to verify ordering; ensure the two
timestamps are distinct and cleanup uses existing fixtures/factories used
elsewhere in the test class.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 93499240-3544-4236-9877-3dc9154105e0
📒 Files selected for processing (11)
ami/main/api/serializers.pyami/main/api/views.pyami/main/tests.pyui/src/components/filtering/filter-control.tsxui/src/components/filtering/filters/processing-status-filter.tsxui/src/data-services/models/deployment.tsui/src/pages/captures/captures.tsxui/src/pages/deployments/deployment-columns.tsxui/src/pages/deployments/deployments.tsxui/src/utils/language.tsui/src/utils/useFilters.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- ui/src/utils/useFilters.ts
- ui/src/components/filtering/filters/processing-status-filter.tsx
| self.captures = create_captures(self.deployment, num_nights=1, images_per_night=4) | ||
| # Two captures get a real detection (bounding box present). | ||
| for capture in self.captures[:2]: | ||
| create_detections(capture, bboxes=[(0.1, 0.1, 0.2, 0.2)]) |
There was a problem hiding this comment.
Use pixel-space bbox fixtures in these new detection rows.
These new tests use normalized-looking bbox values, but this repo’s detection bbox convention is pixel coordinates. Please switch to integer pixel boxes to match canonical behavior and avoid brittle assumptions in future validation paths.
💡 Proposed test fixture update
- create_detections(capture, bboxes=[(0.1, 0.1, 0.2, 0.2)])
+ create_detections(capture, bboxes=[(10, 10, 20, 20)])
...
- create_detections(self.captures[0], bboxes=[(0.1, 0.1, 0.2, 0.2)])
+ create_detections(self.captures[0], bboxes=[(10, 10, 20, 20)])Based on learnings: Detection.bbox/BoundingBox values in this repo use absolute pixel coordinate space (not normalized [0–1] floats).
Also applies to: 1460-1460
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@ami/main/tests.py` at line 1420, The test is passing normalized [0–1] bbox
values to create_detections but this codebase expects absolute pixel coordinates
(Detection.bbox/BoundingBox in pixel space); update the calls to
create_detections (e.g., the invocation with bboxes=[(0.1, 0.1, 0.2, 0.2)] and
the similar one at the other location) to use integer pixel boxes (x1,y1,x2,y2)
that match the capture fixture dimensions (use actual pixel coordinates or
compute pixels from the fixture size) so downstream validation uses canonical
pixel-space boxes.
The previous commit added a sortable "Last processed" column to the deployments (Stations) list. Move it to the captures list instead, where it shows the most recent detection created_at for each capture — i.e. when that capture was last run through a detection pipeline. Exposed as the `last_processed` annotation on the captures endpoint and orderable via ?ordering=last_processed. Reverts the deployments-list column. A correlated subquery (LIMIT 1) keeps the pagination row count stable. A new composite index on Detection(source_image, -created_at) makes the per-capture lookup an index-only scan, so the column stays fast for the scoped event/deployment/collection views that the UI actually uses, without denormalizing a timestamp field onto SourceImage. Co-Authored-By: Claude <noreply@anthropic.com>
The captures list COUNT for ?processed= and ?has_detections= ran an
EXISTS / NOT EXISTS subquery with no LIMIT to prune, so on a large
project (~900k captures) NOT EXISTS became a full anti-join over the
wide source-image table (~12s for processed=false, ~11s for
has_detections=false).
SourceImagePagination now counts the captures *with* detections off the
Detection table (COUNT(DISTINCT source_image_id), a small index scan)
and subtracts from the project total:
processed=true -> processed_count
processed=false -> total - processed_count
Both counts are exact and cost scales with the number of detection rows
rather than the processed/unprocessed ratio, so it is fast in both
directions (measured: false ~1.7s vs 12.8s, true ~1.5s vs 4.8s on a
929k-capture project). Any other filter combination falls back to the
default count.
Co-Authored-By: Claude <noreply@anthropic.com>
Deploy-time benchmarking on the Serbia dev box (hardware comparable to production) showed the slow COUNT that motivated the subtraction paginator does not reproduce there. The 12.8s NOT EXISTS anti-join for processed=false on the 929k-capture project was a local-environment artifact (cold cache, small RAM); Serbia runs the same anti-join in ~1.4s. The index added for the last_processed sort is not used by the count plan either way, and the subtraction's detection-side IN-subquery had its own cold-plan spike (7.71s on a smaller project's first disk touch). Revert to the default DRF count for these filters. The general fix for slow counts on large filtered lists is the estimated-count paginator (#1328), not a per-filter bespoke count. The subtraction strategy, implementation notes, and full benchmarks are kept for reference in docs/claude/reference/captures-processed-count-strategies.md. Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/claude/reference/captures-processed-count-strategies.md`:
- Around line 37-44: The two fenced code blocks in the capture/processed_count
docs are missing language identifiers (triggering MD040); update both fences
shown (the block starting with "total = COUNT(*)..." and the block starting with
"Finalize Aggregate (actual time=...)" ) to include a language tag such as
```text (or ```sql for the SQL-looking block) so the linter stops flagging
MD040; ensure you change both occurrences (the first block around the
processed_count explanation and the second block showing the execution plan).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 10bbf985-608d-41b9-b509-d4c22076782d
📒 Files selected for processing (1)
docs/claude/reference/captures-processed-count-strategies.md
| ``` | ||
| total = COUNT(*) over the project/event/deployment/collection-scoped captures | ||
| processed_count = COUNT(DISTINCT source_image_id) off main_detection, scoped to the same captures | ||
| (has_detections: also .exclude(NULL_DETECTIONS_FILTER)) | ||
|
|
||
| processed=true -> processed_count | ||
| processed=false -> total - processed_count | ||
| ``` |
There was a problem hiding this comment.
Add language identifiers to fenced code blocks.
Both fences are missing language tags, which triggers MD040 and keeps docs lint noisy.
Suggested fix
-```
+```text
total = COUNT(*) over the project/event/deployment/collection-scoped captures
processed_count = COUNT(DISTINCT source_image_id) off main_detection, scoped to the same captures
(has_detections: also .exclude(NULL_DETECTIONS_FILTER))
...
processed=true -> processed_count
processed=false -> total - processed_count- ```
+ ```text
Finalize Aggregate (actual time=1541..1567)
-> Parallel Hash Right Anti Join (rows=303665)
-> Parallel Seq Scan on main_detection (rows=455239)
-> Parallel Hash
-> Parallel Seq Scan on main_sourceimage (Filter: project_id = 18)
Execution Time: 1609 ms
```Also applies to: 95-102
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 37-37: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/claude/reference/captures-processed-count-strategies.md` around lines 37
- 44, The two fenced code blocks in the capture/processed_count docs are missing
language identifiers (triggering MD040); update both fences shown (the block
starting with "total = COUNT(*)..." and the block starting with "Finalize
Aggregate (actual time=...)" ) to include a language tag such as ```text (or
```sql for the SQL-looking block) so the linter stops flagging MD040; ensure you
change both occurrences (the first block around the processed_count explanation
and the second block showing the execution plan).
Adds a Processing status filter and a sortable Last processed column to the Captures list. Anna is reviewing.
Processing status filter
I wanted a quick way to see which captures are still waiting on a detection run. While building it we landed on a distinction that matters, so the backend now exposes two separate things:
processed=true|false— was the capture run through a detection pipeline? This counts any Detection row, including the null markers we write for the "processed, but found nothing" case. The UI "Processing status" filter uses this param.has_detections=true|false— does the capture have real detections (a bounding box)? Null markers are excluded.The split is what you'd expect from the data: on the Vermont project,
processed=truereturns 10,924 captures buthas_detections=truereturns 9,790 — the 1,134-capture gap is exactly the captures that were processed and found nothing. This mirrors how the capture set list already separates its processed count from its detections count (NULL_DETECTIONS_FILTERinmodels.py).The UI control is a small three-state dropdown (Processed / Not processed / cleared = all), modeled on the verification-status filter. I didn't reuse the generic boolean filter because its "No" option clears the filter instead of filtering to
false, which would make "not processed" impossible to pick.Last processed column
A sortable column on the Captures list showing the most recent detection
created_atfor each capture — i.e. when that capture was last run through a detection pipeline. Captures that were never processed show blank and sort last. It's exposed as alast_processedannotation and orderable via?ordering=last_processed.It's a correlated subquery (
LIMIT 1) rather than a join-plus-Max, which keeps the row count stable for pagination. To keep it cheap without denormalizing a timestamp onto the capture, I added a composite index onDetection(source_image, -created_at)so the per-capture lookup is an index-only scan. The annotation only runs for the rows actually returned, so a normal page load is fast — an unsorted page over the full 929k-capture Vermont project is ~15ms, and a single session of ~4,000 captures sorts bylast_processedin ~30ms.Performance caveats (worth a look before merge)
The Captures list defaults to project scope, so both of these are reachable on a very large project (Vermont, ~929k captures). Both are about the pagination COUNT / sort — not the page of rows. The actual 25-row page comes back in ~3.5ms in every case (index walk on
(project, -timestamp)+ early LIMIT); all the cost below is the extra workLimitOffsetPaginationdoes alongside it.processed=false/has_detections=falseCOUNT. The slow part is theSELECT COUNT(*)the paginator runs, which has no LIMIT to prune. The cost grows with how much of the project the filter returns: project-only COUNT ~1.25s,=true~4.8s (theEXISTSsemi-join short-circuits on the small matching set of 10,924),=false~12.8s. The negative case is worst because aNOT EXISTSanti-join can't short-circuit — to prove a capture has no detection you must check all of them — so the planner full-seq-scans the whole project's captures. This is pre-existing behaviour: the filter logic already existed ashas_detections; this PR mostly surfaces it in the UI and splits the semantics.Sorting by
last_processedon a full project. Ordering all 929k captures by the annotation is slow (~14s warm, and can hit the statement timeout cold), because Postgres has to evaluate the subquery for every row before the top-N sort — theLIMIT 25can't prune a computed sort key. Worth knowing this is the same behaviour the existing annotation-sort columns already have: sorting the same project byoccurrences_count(~16s) ordetections_count(~15s) costs about the same, solast_processedisn't a new cliff — and the new index keeps its per-row detection lookup an index-only scan (~1µs each), slightly cheaper than those count aggregations. A default page load (timestamp order) is unaffected; only an explicit sort-by-this-column on a huge project is slow.Planned general fix: #1328. Caching or denormalizing a column per filter doesn't scale (many more filters are coming), so #1328 proposes an estimated-count paginator that reads Postgres's planner row estimate (measured ~3% accurate on the large sets that actually hurt, <15ms, no execution) with an exact-count fallback below a threshold. That makes caveat #1 cheap for every current and future filter. For the negative
processedcase specifically there's also a targeted subtraction trick (total − processed-true via grouping detections by source image, ~3s vs 12.8s), noted in #1328. Caveat #2 (the sort) needs an indexable precomputed column to be fast and is left as-is here, consistent with the existing count-column sorts. TheProjectPagination.get_countannotation-strip trick won't help caveat #1 (the cost is the anti-join scan, not annotations), and this endpoint's COUNT already drops the unused subquery annotation, so the column adds no COUNT cost.Note on scope
This PR carries two related-but-separable pieces: the captures filter and the captures column. They're linked (both fell out of the same "what counts as processed" question), but if you'd rather review them apart I can split the column into its own PR.
Changes
ami/main/api/views.py— newprocessed=filter (reuses the existingwith_was_processed()annotation);has_detections=scoped to real detections;last_processedannotation + ordering on the captures list (and detail).ami/main/api/serializers.py—last_processedon the source-image list serializer.ami/main/models.py+ migration0088— composite index onDetection(source_image, -created_at)for the subquery.ami/main/tests.py— the processed-vs-has_detections split (including a null-marker-only capture) and the captureslast_processedfield/ordering (processed capture has a timestamp, untouched one is null).ui/...— theProcessingStatusFilterdropdown wired toprocessed, and the Last processed column on the Captures list.Testing
Backend tests pass (3 focused cases, plus the source-image query-count regression test stays green — the always-on annotation adds no per-row queries).
tsc --noEmitandeslintare clean. I validated against the running stack on the Vermont project: processed 10,924 / not-processed 918,010; has_detections 9,790 / 919,144; thelast_processedsubquery uses the new index (index-only scan) and scoped sorts return real timestamps.🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests
Documentation
Chores