feat(capture): route analytics through v1 submitter (capture v1, 4/6)#704

Draft
eli-r-ph wants to merge 2 commits into capture-v1/03-transport from capture-v1/04-wire

Conversation

@eli-r-ph

💡 Motivation and Context

Fourth PR in the stacked Capture V1 series (stacked on #703). This is the one that makes capture_mode actually do something: it wires send_v1_batch (from #703) into both send paths, so the previously-inert v1 transport is now selectable end to end. Default is still v0, so existing callers are unaffected.

  • Async path (Consumer.request) — refactored to route via _send_analytics / _send_ai. _send_analytics picks the submitter by capture_mode: v1 -> the partial-retry send_v1_batch, v0 -> the legacy batch_post loop. _send_ai always uses the legacy submitter.
  • Sync path (Client._enqueue, sync_mode=True) — same branch: an analytics event in v1 goes through send_v1_batch([msg]); everything else stays on batch_post.
  • The dedicated AI endpoint has no v1 form, so $ai_* events routed to it always use the legacy submitter regardless of capture_mode. When the dedicated AI endpoint is not enabled, all events (including $ai_*) go to the single analytics destination and therefore follow capture_mode — exactly mirroring the v0 single-endpoint behavior.
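
The routing rules above can be sketched as follows. This is a minimal illustration, not the PR's code: `split_batch`, `pick_analytics_submitter`, and `route` are hypothetical names, the submitters are passed in as plain callables, and the `"v0"`/`"v1"` string values for `capture_mode` are an assumption.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]
Submitter = Callable[[List[Event]], None]


def is_ai_event(event: Event) -> bool:
    """Treat any event whose name starts with "$ai_" as an AI event."""
    return str(event.get("event", "")).startswith("$ai_")


def split_batch(batch: List[Event], dedicated_ai: bool) -> Tuple[List[Event], List[Event]]:
    """Split a batch into (analytics, ai) events.

    Without a dedicated AI endpoint, everything stays on the analytics side.
    """
    if not dedicated_ai:
        return list(batch), []
    analytics = [e for e in batch if not is_ai_event(e)]
    ai = [e for e in batch if is_ai_event(e)]
    return analytics, ai


def pick_analytics_submitter(capture_mode: str, v0: Submitter, v1: Submitter) -> Submitter:
    """Select the v1 submitter only when capture_mode is "v1"; otherwise use v0."""
    return v1 if capture_mode == "v1" else v0


def route(
    batch: List[Event],
    capture_mode: str,
    dedicated_ai: bool,
    v0: Submitter,
    v1: Submitter,
    legacy_ai: Submitter,
) -> None:
    """Send analytics events via the mode-selected submitter and AI events via legacy."""
    analytics, ai = split_batch(batch, dedicated_ai)
    if analytics:
        pick_analytics_submitter(capture_mode, v0, v1)(analytics)
    if ai:
        # The dedicated AI endpoint has no v1 form, so capture_mode is ignored here.
        legacy_ai(ai)
```

With `dedicated_ai=False`, an `$ai_*` event lands in the analytics list and so follows `capture_mode`, matching the single-endpoint behavior described in the last bullet.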

Supporting changes: Client.max_retries is now stored so the sync path can forward it, and gzip/timeout/retries/historical_migration are forwarded to the v1 submitter so it honors the same config as v0.
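
As a rough sketch of the forwarding described above, the helper below collects the four settings into keyword arguments for a v1 submitter. `SubmitterConfig`, `v1_submitter_kwargs`, the keyword names, and the default values are all assumptions made for illustration; the actual `send_v1_batch` signature is not shown in this PR description.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class SubmitterConfig:
    """Hypothetical container for the settings the client already holds."""

    gzip: bool = False
    timeout: float = 15.0  # assumed default, for illustration only
    max_retries: int = 3  # assumed default, for illustration only
    historical_migration: bool = False


def v1_submitter_kwargs(cfg: SubmitterConfig) -> Dict[str, Any]:
    """Build the keyword arguments forwarded to the v1 submitter.

    The stored max_retries is forwarded as "retries" so the sync path and
    the async path can hand the submitter the same value.
    """
    return {
        "gzip": cfg.gzip,
        "timeout": cfg.timeout,
        "retries": cfg.max_retries,
        "historical_migration": cfg.historical_migration,
    }
```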

💚 How did you test it?

  • test_consumer.py — new TestConsumerCaptureModeRouting: v0 -> batch_post, v1 -> send_v1_batch, config-forwarding, dedicated-AI split (analytics -> v1 submitter / $ai_* -> legacy AI endpoint), and AI-only batch skips the v1 submitter.
  • test_client.py — new TestClientSyncCaptureMode: v0 vs v1 sync submitter selection, config forwarding, dedicated-AI $ai_* event stays legacy, dedicated-AI analytics event uses v1.
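
A submitter-selection test of this kind might look like the sketch below. It does not reproduce the PR's tests: `Router` is a hypothetical stand-in for the consumer's analytics routing, and the submitters are injected as `unittest.mock.Mock` objects rather than patched on the real module.

```python
import unittest
from unittest import mock


class Router:
    """Hypothetical stand-in for the consumer's analytics routing."""

    def __init__(self, capture_mode, batch_post, send_v1_batch):
        self.capture_mode = capture_mode
        self._batch_post = batch_post
        self._send_v1_batch = send_v1_batch

    def send_analytics(self, events):
        # v1 selects the partial-retry submitter; anything else stays legacy.
        if self.capture_mode == "v1":
            self._send_v1_batch(events)
        else:
            self._batch_post(events)


class TestCaptureModeRouting(unittest.TestCase):
    def test_submitter_selection(self):
        cases = (("v0", "batch_post"), ("v1", "send_v1_batch"))
        for mode, expected in cases:
            with self.subTest(capture_mode=mode):
                submitters = {
                    "batch_post": mock.Mock(),
                    "send_v1_batch": mock.Mock(),
                }
                router = Router(
                    mode, submitters["batch_post"], submitters["send_v1_batch"]
                )
                router.send_analytics([{"event": "pageview"}])
                # Exactly one submitter should have been called, once.
                for name, submitter in submitters.items():
                    if name == expected:
                        submitter.assert_called_once()
                    else:
                        submitter.assert_not_called()
```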

The full test_consumer.py suite (25 tests) and the new client test class are green; ruff format/check is clean; mypy is clean on client.py + consumer.py; references/public_api_snapshot.txt has been regenerated.

📝 Checklist

  • I reviewed the submitted code.
  • I added tests to verify the changes.
  • I updated the docs if needed.
  • No breaking change (v0 remains the default wire protocol).

🤖 Agent context

Autonomy: Human-driven (agent-assisted)

Authored with Cursor (Claude Opus 4.8) per the agreed plan. The routing split intentionally keeps the dedicated-AI destination on the legacy submitter — there is no v1 AI ingestion endpoint, so capture_mode only swaps the analytics submitter. This mirrors posthog-go, where CaptureMode likewise governs only the analytics path.

@greptile-apps

greptile-apps Bot commented Jun 27, 2026

Contributor

Reviews (1): Last reviewed commit: "feat(capture): route analytics through v..."

Comment thread posthog/consumer.py
Comment on lines +174 to +186
  for events, label, sender in (
      (analytics_events, "analytics", self._send_analytics),
      (ai_events, AI_EVENTS_ENDPOINT, self._send_ai),
  ):
      if not events:
          continue
      try:
-         self._send(events, path)
+         sender(events)
      except Exception as e:
          if first_exc is None:
              first_exc = e
          else:
-             self.log.error("error uploading to %s: %s", path, e)
+             self.log.error("error uploading to %s: %s", label, e)


P2 Dead label entry for analytics in the loop tuple

The "analytics" string in the analytics tuple is logically unreachable in the self.log.error branch. Because analytics is always the first iteration, first_exc is always None when it runs — so a failure there takes the first_exc = e branch, never self.log.error(..., label, e). That log path is only reachable on the second failing iteration, which is always AI. The "analytics" label is therefore dead code in the logging context, and the inconsistency with AI_EVENTS_ENDPOINT (a path constant) adds unnecessary noise. Consider using a consistent format for both labels (e.g. two plain strings, "analytics" and "AI") and accepting that only the second sender's label ever reaches the log.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
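
The reachability argument in this comment can be checked with a small standalone sketch. `run_senders` and `fail` are hypothetical helpers that mirror the loop's `first_exc` logic; instead of logging and re-raising, the function returns the held exception and the labels that would have been logged.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def run_senders(
    senders: Sequence[Tuple[str, Callable[[], None]]],
) -> Tuple[Optional[Exception], List[str]]:
    """Run each sender in order and report which labels would reach the log."""
    first_exc: Optional[Exception] = None
    logged: List[str] = []
    for label, sender in senders:
        try:
            sender()
        except Exception as e:
            if first_exc is None:
                # The first failure is held for the caller and never logged.
                first_exc = e
            else:
                # Only a failure after the first one reaches the log branch.
                logged.append(label)
    return first_exc, logged


def fail(message: str) -> Callable[[], None]:
    """Return a sender that always raises RuntimeError(message)."""

    def _raise() -> None:
        raise RuntimeError(message)

    return _raise
```

When both senders fail, `run_senders([("analytics", fail("a")), ("AI", fail("b"))])` holds the analytics exception and records only `"AI"` as logged, which is the behavior the comment describes.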

Comment thread posthog/test/test_consumer.py Outdated
@github-actions

github-actions Bot commented Jun 27, 2026

Contributor

posthog-python Compliance Report

Date: 2026-06-28 00:55:19 UTC
Duration: 530123ms

✅ All Tests Passed!

45/45 tests passed


Capture Tests

29/29 tests passed

View Details
| Test | Status | Duration |
| --- | --- | --- |
| Format Validation.Event Has Required Fields | passed | 518ms |
| Format Validation.Event Has Uuid | passed | 10007ms |
| Format Validation.Event Has Lib Properties | passed | 10008ms |
| Format Validation.Distinct Id Is String | passed | 10007ms |
| Format Validation.Token Is Present | passed | 10007ms |
| Format Validation.Custom Properties Preserved | passed | 10006ms |
| Format Validation.Event Has Timestamp | passed | 10007ms |
| Retry Behavior.Retries On 503 | passed | 18020ms |
| Retry Behavior.Does Not Retry On 400 | passed | 12004ms |
| Retry Behavior.Does Not Retry On 401 | passed | 10006ms |
| Retry Behavior.Respects Retry After Header | passed | 16014ms |
| Retry Behavior.Implements Backoff | passed | 30017ms |
| Retry Behavior.Retries On 500 | passed | 13012ms |
| Retry Behavior.Retries On 502 | passed | 16010ms |
| Retry Behavior.Retries On 504 | passed | 16011ms |
| Retry Behavior.Max Retries Respected | passed | 30017ms |
| Deduplication.Generates Unique Uuids | passed | 7003ms |
| Deduplication.Preserves Uuid On Retry | passed | 16015ms |
| Deduplication.Preserves Uuid And Timestamp On Retry | passed | 23020ms |
| Deduplication.Preserves Uuid And Timestamp On Batch Retry | passed | 16005ms |
| Deduplication.No Duplicate Events In Batch | passed | 10002ms |
| Deduplication.Different Events Have Different Uuids | passed | 10007ms |
| Compression.Sends Gzip When Enabled | passed | 10007ms |
| Batch Format.Uses Proper Batch Structure | passed | 10007ms |
| Batch Format.Flush With No Events Sends Nothing | passed | 5005ms |
| Batch Format.Multiple Events Batched Together | passed | 10004ms |
| Error Handling.Does Not Retry On 403 | passed | 12010ms |
| Error Handling.Does Not Retry On 413 | passed | 10007ms |
| Error Handling.Retries On 408 | passed | 14013ms |

Feature_Flags Tests

16/16 tests passed

View Details
| Test | Status | Duration |
| --- | --- | --- |
| Request Payload.Request With Person Properties Device Id | passed | 9502ms |
| Request Payload.Flags Request Uses V2 Query Param | passed | 10006ms |
| Request Payload.Flags Request Hits Flags Path Not Decide | passed | 10007ms |
| Request Payload.Flags Request Omits Authorization Header | passed | 10007ms |
| Request Payload.Token In Flags Body Matches Init | passed | 10006ms |
| Request Payload.Groups Round Trip | passed | 10007ms |
| Request Payload.Groups Default To Empty Object | passed | 10007ms |
| Request Payload.Person Properties Distinct Id Auto Populated When Caller Omits It | passed | 10007ms |
| Request Payload.Disable Geoip False Propagates As Geoip Disable False | passed | 10007ms |
| Request Payload.Disable Geoip Omitted Defaults To False | passed | 10007ms |
| Request Payload.Flag Keys To Evaluate Contains Only Requested Key | passed | 10007ms |
| Request Lifecycle.No Flags Request On Init Alone | passed | 5003ms |
| Request Lifecycle.No Flags Request On Normal Capture | passed | 10508ms |
| Request Lifecycle.Two Flag Calls Produce Two Remote Requests | passed | 9511ms |
| Request Lifecycle.Mock Response Value Is Returned To Caller | passed | 10002ms |
| Side Effect Events.Get Feature Flag Captures Feature Flag Called Event | passed | 10509ms |

@eli-r-ph eli-r-ph force-pushed the capture-v1/03-transport branch from a901fdc to 7fd7dcd Compare June 27, 2026 23:16
@eli-r-ph eli-r-ph force-pushed the capture-v1/04-wire branch 2 times, most recently from 3274be1 to 5026185 Compare June 27, 2026 23:58
@eli-r-ph eli-r-ph force-pushed the capture-v1/03-transport branch 2 times, most recently from 32d7b02 to 3677400 Compare June 28, 2026 00:20
@eli-r-ph eli-r-ph force-pushed the capture-v1/04-wire branch from 5026185 to d1875fc Compare June 28, 2026 00:20
@eli-r-ph eli-r-ph self-assigned this Jun 28, 2026
eli-r-ph added 2 commits June 27, 2026 17:39
Wires send_v1_batch into both send paths so capture_mode actually takes
effect end to end. The consumer's async path (Consumer.request) and the
client's sync path (_enqueue) now pick the analytics submitter by
capture_mode: v1 -> the partial-retry send loop, v0 -> the legacy
batch_post. The dedicated AI endpoint has no v1 form, so $ai_* events on it
always use the legacy submitter regardless of capture_mode.

Refactors Consumer.request to route via _send_analytics/_send_ai helpers,
stores Client.max_retries so the sync path can pass it through, and forwards
gzip/timeout/retries/historical_migration to the v1 submitter. Default is
still v0, so existing callers are unaffected.

Adds consumer routing-matrix tests (v0/v1, dedicated-AI split, config
forwarding) and client sync-mode tests (v0 vs v1, dedicated-AI event stays
legacy, analytics event uses v1). ruff/mypy clean.

Resolve capture_compression once on the client (kwarg > env >
legacy gzip flag > none) and thread it to the v1 submitter via the
consumer and the sync path. Parameterize the capture_mode routing
tests and use consistent submitter labels.
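
The precedence in this commit message (kwarg > env > legacy gzip flag > none) could be resolved along the lines of the sketch below. The function name, the `POSTHOG_CAPTURE_COMPRESSION` environment variable, and the `"gzip"`/`"none"` string values are assumptions for illustration only; the commit message does not name them.

```python
import os
from typing import Mapping, Optional


def resolve_capture_compression(
    kwarg: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    legacy_gzip: bool = False,
    env_var: str = "POSTHOG_CAPTURE_COMPRESSION",  # hypothetical variable name
) -> str:
    """Resolve the compression setting once, highest-precedence source first."""
    environ = os.environ if env is None else env
    # 1. An explicit keyword argument always wins.
    if kwarg is not None:
        return kwarg
    # 2. Otherwise a non-empty environment value applies.
    env_value = environ.get(env_var)
    if env_value:
        return env_value
    # 3. Otherwise the legacy boolean gzip flag maps to "gzip".
    if legacy_gzip:
        return "gzip"
    # 4. Nothing was configured.
    return "none"
```

Resolving the value once on the client and passing the result down keeps the consumer and the sync path from reading the environment independently and disagreeing.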
@eli-r-ph eli-r-ph force-pushed the capture-v1/04-wire branch from d1875fc to 73e034f Compare June 28, 2026 00:45
@eli-r-ph eli-r-ph force-pushed the capture-v1/03-transport branch from 3677400 to c698bd5 Compare June 28, 2026 00:45