feat(capture): route analytics through v1 submitter (capture v1, 4/6) #704
eli-r-ph wants to merge 2 commits into
  for events, label, sender in (
      (analytics_events, "analytics", self._send_analytics),
      (ai_events, AI_EVENTS_ENDPOINT, self._send_ai),
  ):
      if not events:
          continue
      try:
-         self._send(events, path)
+         sender(events)
      except Exception as e:
          if first_exc is None:
              first_exc = e
          else:
-             self.log.error("error uploading to %s: %s", path, e)
+             self.log.error("error uploading to %s: %s", label, e)
Dead label entry for analytics in the loop tuple
The "analytics" string in the analytics tuple is logically unreachable in the self.log.error branch. Because analytics is always the first iteration, first_exc is always None when it runs — so a failure there takes the first_exc = e branch, never self.log.error(..., label, e). That log path is only reachable on the second failing iteration, which is always AI. The "analytics" label is therefore dead code in the logging context, and the inconsistency with AI_EVENTS_ENDPOINT (a path constant) adds unnecessary noise. Consider using a consistent format for both labels (e.g. two plain strings, "analytics" and "AI") and accepting that only the second sender's label ever reaches the log.
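The reachability argument above can be checked with a small stand-alone sketch. It is not the PR's code: `upload_all`, `failing`, and `ok` are hypothetical names, and the logger is replaced by a list so the logged labels can be inspected.

```python
def upload_all(batches):
    """Mimic the loop under review.

    batches is a list of (events, label, sender) tuples. Returns the first
    exception seen and the labels that would have reached the error log.
    """
    first_exc = None
    logged_labels = []  # stand-in for self.log.error calls
    for events, label, sender in batches:
        if not events:
            continue
        try:
            sender(events)
        except Exception as e:
            if first_exc is None:
                # The first failure is remembered, not logged.
                first_exc = e
            else:
                # Only a later failure reaches the log.
                logged_labels.append(label)
    return first_exc, logged_labels


def failing(events):
    """Hypothetical sender that always fails."""
    raise RuntimeError("upload failed")


def ok(events):
    """Hypothetical sender that always succeeds."""
    return None
```

With two entries in the tuple, the first label can never appear in `logged_labels`, whichever combination of senders fails.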
posthog-python Compliance Report
Date: 2026-06-28 00:55:19 UTC
✅ All Tests Passed! 45/45 tests passed
Capture Tests: ✅ 29/29 tests passed
Feature_Flags Tests: ✅ 16/16 tests passed
a901fdc to 7fd7dcd
3274be1 to 5026185
32d7b02 to 3677400
5026185 to d1875fc
Wires send_v1_batch into both send paths so capture_mode actually takes effect end to end. The consumer's async path (Consumer.request) and the client's sync path (_enqueue) now pick the analytics submitter by capture_mode: v1 -> the partial-retry send loop, v0 -> the legacy batch_post. The dedicated AI endpoint has no v1 form, so $ai_* events on it always use the legacy submitter regardless of capture_mode. Refactors Consumer.request to route via _send_analytics/_send_ai helpers, stores Client.max_retries so the sync path can pass it through, and forwards gzip/timeout/retries/historical_migration to the v1 submitter. Default is still v0, so existing callers are unaffected. Adds consumer routing-matrix tests (v0/v1, dedicated-AI split, config forwarding) and client sync-mode tests (v0 vs v1, dedicated-AI event stays legacy, analytics event uses v1). ruff/mypy clean.
Resolve capture_compression once on the client (kwarg > env > legacy gzip flag > none) and thread it to the v1 submitter via the consumer and the sync path. Parameterize the capture_mode routing tests and use consistent submitter labels.
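The precedence in the commit message above (kwarg > env > legacy gzip flag > none) could look roughly like the sketch below. Everything here is an assumption for illustration: the function name, the environment variable name `EXAMPLE_CAPTURE_COMPRESSION`, and the string values `"gzip"` and `"none"` are hypothetical and not taken from the PR.

```python
from typing import Mapping, Optional

# Hypothetical environment variable name, chosen only for this example.
ENV_VAR = "EXAMPLE_CAPTURE_COMPRESSION"


def resolve_capture_compression(
    kwarg: Optional[str],
    environ: Mapping[str, str],
    legacy_gzip: bool,
) -> str:
    """Resolve the compression setting once, highest precedence first."""
    if kwarg is not None:
        return kwarg  # explicit keyword argument wins
    env_value = environ.get(ENV_VAR)
    if env_value:
        return env_value  # then the environment
    if legacy_gzip:
        return "gzip"  # then the legacy boolean flag
    return "none"  # otherwise no compression
```

Taking the environment as a parameter keeps the function pure, so the resolved value can be computed once and then passed to both the consumer and the sync path.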
d1875fc to 73e034f
3677400 to c698bd5
💡 Motivation and Context
Fourth PR in the stacked Capture V1 series (stacked on #703). This is the one that makes `capture_mode` actually do something: it wires `send_v1_batch` (from #703) into both send paths, so the previously-inert v1 transport is now selectable end to end. Default is still `v0`, so existing callers are unaffected.

- Async path (`Consumer.request`): refactored to route via `_send_analytics` / `_send_ai`. `_send_analytics` picks the submitter by `capture_mode`: `v1` -> the partial-retry `send_v1_batch`, `v0` -> the legacy `batch_post` loop. `_send_ai` always uses the legacy submitter.
- Sync path (`Client._enqueue`, `sync_mode=True`): same branch. An analytics event in `v1` goes through `send_v1_batch([msg])`; everything else stays on `batch_post`.
- Dedicated AI endpoint: `$ai_*` events routed to it always use the legacy submitter regardless of `capture_mode`. When the dedicated AI endpoint is not enabled, all events (including `$ai_*`) go to the single analytics destination and therefore follow `capture_mode`, exactly mirroring the v0 single-endpoint behavior.

Supporting changes: `Client.max_retries` is now stored so the sync path can forward it, and `gzip` / `timeout` / `retries` / `historical_migration` are forwarded to the v1 submitter so it honors the same config as v0.

💚 How did you test it?

- `test_consumer.py`: new `TestConsumerCaptureModeRouting` covering v0 -> `batch_post`, v1 -> `send_v1_batch`, config forwarding, the dedicated-AI split (analytics -> v1 submitter / `$ai_*` -> legacy AI endpoint), and an AI-only batch skipping the v1 submitter.
- `test_client.py`: new `TestClientSyncCaptureMode` covering v0 vs v1 sync submitter selection, config forwarding, a dedicated-AI `$ai_*` event staying legacy, and a dedicated-AI analytics event using v1.

Full `test_consumer.py` (25) and the new client class green; `ruff format` / `check` clean; `mypy` clean on `client.py` + `consumer.py`; regenerated `references/public_api_snapshot.txt`.

📝 Checklist
🤖 Agent context
Autonomy: Human-driven (agent-assisted)
Authored with Cursor (Claude Opus 4.8) per the agreed plan. The routing split intentionally keeps the dedicated-AI destination on the legacy submitter: there is no v1 AI ingestion endpoint, so `capture_mode` only swaps the analytics submitter. This mirrors posthog-go, where `CaptureMode` likewise governs only the analytics path.
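The routing rule described in this PR can be summarized as a small decision function. This is a sketch of the rule only, not the actual `Consumer` or `Client` code: `pick_submitter` and its parameters are hypothetical, and the returned strings merely name which submitter would be chosen.

```python
def pick_submitter(event_name: str, capture_mode: str, dedicated_ai_enabled: bool) -> str:
    """Return a label for the submitter an event would be routed to.

    Hypothetical helper: the return values are labels for illustration,
    not callables from the library.
    """
    is_ai_event = event_name.startswith("$ai_")
    if dedicated_ai_enabled and is_ai_event:
        # The dedicated AI endpoint has no v1 form, so it stays legacy.
        return "legacy_ai"
    # Everything else goes to the analytics destination and follows capture_mode.
    return "send_v1_batch" if capture_mode == "v1" else "batch_post"
```

Note that with the dedicated AI endpoint disabled, an `$ai_*` event falls through to the analytics branch and therefore follows `capture_mode` like any other event.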