feat: track AI generation outcome (kept vs discarded) and retry depth #2865

Open
ineagu wants to merge 2 commits into development from feat/track-ai-generation-outcome

ineagu commented Jun 16, 2026

Outcome

When a user runs the AI block (content generator), we currently see that a prompt was sent, but we have no signal about what happened next: did the user actually keep the generated content, or throw it away? And how many times did they have to regenerate before they were satisfied (or gave up)?

This PR closes that gap by recording the outcome of each AI generation — kept (inserted/replaced) vs. discarded — plus a coarse retry-depth bucket for how many regenerations a session took.

The product question this answers: Is AI generation producing output people actually use, and how much retrying does it take to get there? It informs where to invest in prompt quality and model tuning — a high discard rate or a fat tail of 4+ retries flags presets that aren't pulling their weight, while a healthy kept rate validates the feature. It also lets us compare outcomes per preset (form / textTransformation / patternsPicker) so improvement effort can be targeted.

What changed

  • src/blocks/blocks/content-generator/edit.js

    • Added an allowedPromptID() allowlist helper that maps the preset id to one of form | textTransformation | patternsPicker, falling back to other — so an arbitrary string can never reach the tracking wire.
    • Introduced a per-generation trackingKey (attributes?.id ?? clientId) used only as the dedup key for the tracking sets (never sent as a value), and a hasAccepted ref so a later discard can't clobber a prior accept.
    • Fires a "kept" outcome event when content is replaced (replaceBlocks) and when content is inserted into the page (insertContentIntoPage).
    • Passes trackingKey and hasAcceptedRef down to PromptPlaceholder.
    • Dropped the now-unused useMemo import.
  • src/blocks/components/prompt/index.tsx

    • Added a matching allowedPromptID() allowlist helper and a retryBucket() helper that buckets the regenerate count into the coarse enum 0 | 1 | 2-3 | 4+.
    • Added a retryCount ref that resets on the first generation and increments on each regenerate; fires a retry-depth event on every generation.
    • Added the trackingKey and hasAcceptedRef props to PromptPlaceholderProps.
    • Fires a "discard" outcome event from the result placeholder's onClose — but only when there was generated output to throw away (resultHistory?.length > 0) and the generation wasn't already accepted (! hasAcceptedRef?.current).
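The two helpers named in the list above could look roughly like the following. This is a minimal sketch, not the code from the PR: only the names allowedPromptID and retryBucket and their output enums come from the description, while the bodies, the KNOWN_PRESETS constant, and the handling of negative or non-finite counts are assumptions.

```typescript
// Sketch only: helper names and enums follow the PR description; bodies are assumed.
type PresetId = 'form' | 'textTransformation' | 'patternsPicker' | 'other';
type RetryBucket = '0' | '1' | '2-3' | '4+';

// Hypothetical constant holding the allowlisted preset ids.
const KNOWN_PRESETS: readonly string[] = ['form', 'textTransformation', 'patternsPicker'];

// Collapse any unknown, missing, or non-string id to 'other' so that
// no freeform string can end up in a tracking payload.
function allowedPromptID(id: unknown): PresetId {
  return typeof id === 'string' && KNOWN_PRESETS.indexOf(id) !== -1
    ? (id as PresetId)
    : 'other';
}

// Bucket the number of regenerations (0 means only the first generation ran).
// Assumption: negative or non-finite input is treated as 0.
function retryBucket(count: number): RetryBucket {
  if (!Number.isFinite(count) || count <= 0) return '0';
  const n = Math.floor(count);
  if (n <= 1) return n === 1 ? '1' : '0';
  if (n <= 3) return '2-3';
  return '4+';
}
```

Returning a closed union type rather than string means the compiler rejects any call site that tries to put a raw preset id or raw count into an event.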

Telemetry event shapes added

All events are emitted via window.oTrk?.set(...). Only the payloads are shown below; the dedup key for set() comes from the per-generation trackingKey and is not part of the payload:

Outcome — kept by replacing the block:

{ feature: 'ai-generation', featureComponent: 'outcome-<form|textTransformation|patternsPicker|other>', featureValue: 'replace' }

Outcome — kept by inserting into the page:

{ feature: 'ai-generation', featureComponent: 'outcome-<form|textTransformation|patternsPicker|other>', featureValue: 'insert' }

Outcome — discarded (closed with output present, not previously accepted):

{ feature: 'ai-generation', featureComponent: 'outcome-<form|textTransformation|patternsPicker|other>', featureValue: 'discard' }

Retry depth (fired on each generation/regeneration):

{ feature: 'ai-generation', featureComponent: 'regenerate-count', featureValue: '0' | '1' | '2-3' | '4+' }
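The four shapes above can be captured as TypeScript types so that a payload outside the fixed enums fails to compile. This is a sketch under assumptions: the type names and the outcomeEvent / retryDepthEvent builders are hypothetical and only the field names and literal values are taken from the shapes listed here.

```typescript
// Sketch only: type and function names are illustrative, not from the PR.
type Preset = 'form' | 'textTransformation' | 'patternsPicker' | 'other';
type OutcomeValue = 'replace' | 'insert' | 'discard';
type RetryBucket = '0' | '1' | '2-3' | '4+';

interface OutcomeEvent {
  feature: 'ai-generation';
  featureComponent: `outcome-${Preset}`;
  featureValue: OutcomeValue;
}

interface RetryDepthEvent {
  feature: 'ai-generation';
  featureComponent: 'regenerate-count';
  featureValue: RetryBucket;
}

type AiGenerationEvent = OutcomeEvent | RetryDepthEvent;

// Build a kept/discarded outcome payload for an already-allowlisted preset.
function outcomeEvent(preset: Preset, value: OutcomeValue): OutcomeEvent {
  return {
    feature: 'ai-generation',
    featureComponent: `outcome-${preset}` as const,
    featureValue: value,
  };
}

// Build a retry-depth payload from an already-bucketed count.
function retryDepthEvent(bucket: RetryBucket): RetryDepthEvent {
  return {
    feature: 'ai-generation',
    featureComponent: 'regenerate-count',
    featureValue: bucket,
  };
}
```

For example, outcomeEvent('patternsPicker', 'replace') yields the second field as 'outcome-patternsPicker', while outcomeEvent('some-raw-id', 'replace') is a type error.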

Compliance

  • Non-PII. No prompt text, generated content, block content, or user identity is sent. Every featureValue is a fixed enum (replace / insert / discard, and the bucketed retry counts), and every featureComponent is built from an allowlisted preset id — any unknown id collapses to other, so freeform strings can't leak onto the wire. The retry count is bucketed (0 / 1 / 2-3 / 4+) rather than reported raw.
  • Consent gate respected. These new events use window.oTrk?.set(...) with no { consent: true } argument, so they flow through the standard otter_blocks_logger_flag opt-in gate (the same flag surfaced in the welcome guide and dashboard "anonymous data tracking" toggle). They are dropped entirely when the user hasn't opted in. This is deliberately different from the pre-existing prompt and ai-toolbar events in this codebase, which pass { consent: true } to bypass the gate — none of the events added here use that bypass.
  • No new identifiers persisted. trackingKey (attributes?.id ?? clientId) is used only locally as the in-memory dedup key for set(); it is never included in any event payload.
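To make the consent behaviour described above concrete, here is a toy model of a tracker with an opt-in gate. It is not the real oTrk: the SketchTracker class, its constructor argument, and the overwrite-on-same-key behaviour are all assumptions used for illustration. Only two details come from this PR description: events sent without { consent: true } are dropped when the user has not opted in, and the key passed to set() is a dedup key that never appears in the payload.

```typescript
// Hypothetical stand-in for the tracker; the real oTrk internals are not shown in this PR.
interface TrackOptions {
  consent?: boolean;
}

interface Payload {
  feature: string;
  featureComponent: string;
  featureValue: string;
}

class SketchTracker {
  private events = new Map<string, Payload>();
  private optedIn: () => boolean;

  // optedIn models the otter_blocks_logger_flag check (assumed shape).
  constructor(optedIn: () => boolean) {
    this.optedIn = optedIn;
  }

  // Returns true when the event was accepted, false when the gate dropped it.
  set(key: string, payload: Payload, options: TrackOptions = {}): boolean {
    if (!options.consent && !this.optedIn()) return false;
    // Assumption: a later set() with the same key overwrites the earlier payload.
    this.events.set(key, payload);
    return true;
  }

  // The key is used only for dedup and is never part of what gets reported.
  payloads(): Payload[] {
    return Array.from(this.events.values());
  }
}
```

Under this model, a later 'discard' with the same key would replace an earlier 'insert', which is the kind of clobbering the hasAccepted guard is described as preventing.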

Test plan

A reviewer can validate in the block editor with the AI (content generator) block, with anonymous data tracking enabled (otter_blocks_logger_flag = yes, e.g. via the welcome guide or the dashboard toggle):

  1. Retry depth: Add the AI block, run a prompt, then click Regenerate a few times. Each generation should fire feature: 'ai-generation', featureComponent: 'regenerate-count' with featureValue advancing 0 → 1 → 2-3 → 4+.
  2. Kept (insert): Generate, then choose to insert the result into the page. Expect an outcome-<preset> event with featureValue: 'insert'.
  3. Kept (replace): Use a flow that replaces a target block (e.g. the patterns/text-transformation path). Expect outcome-<preset> with featureValue: 'replace'.
  4. Discarded: Generate output, then close/dismiss the result without accepting. Expect outcome-<preset> with featureValue: 'discard'. Closing with no generated output should fire nothing.
  5. Accept-wins-over-discard: Accept a generation, then close — confirm no discard event fires after an accept (guarded by hasAccepted).
  6. Preset allowlisting: Confirm featureComponent is one of outcome-form / outcome-textTransformation / outcome-patternsPicker / outcome-other and never contains a raw/arbitrary id.
  7. Consent gate: With tracking disabled, repeat the above and confirm none of the new events are sent.
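Steps 4 and 5 both exercise the same condition from the PR description: a discard is recorded only when there is generated output and the generation was not already accepted. A minimal sketch of that predicate follows; the function name shouldTrackDiscard and the Ref interface are hypothetical, while the two checks mirror resultHistory?.length > 0 and ! hasAcceptedRef?.current.

```typescript
// Minimal ref shape, compatible with the object returned by React's useRef.
interface Ref<T> {
  current: T;
}

// Sketch only: the function name is illustrative, the condition follows the PR description.
function shouldTrackDiscard(
  resultHistory: readonly unknown[] | undefined,
  hasAcceptedRef?: Ref<boolean>
): boolean {
  // Nothing was generated, so there is nothing to discard.
  const hadOutput = (resultHistory?.length ?? 0) > 0;
  // An earlier insert/replace wins over a later close.
  const alreadyAccepted = hasAcceptedRef?.current === true;
  return hadOutput && !alreadyAccepted;
}
```

Step 4 corresponds to shouldTrackDiscard(['text'], { current: false }) being true and shouldTrackDiscard([], { current: false }) being false; step 5 corresponds to shouldTrackDiscard(['text'], { current: true }) being false.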

Observe events on the wire (network requests to the tiTrk tracking endpoint) or by instrumenting window.oTrk.set. Aggregated events surface downstream in the usual telemetry pipeline / Metabase.
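One way to instrument window.oTrk.set during manual testing is to wrap it with a recorder. The sketch below assumes only that the tracker object exposes a set method; the TrackerLike interface and the spyOnSet name are hypothetical, and the argument list is recorded as-is without assuming its shape.

```typescript
// Sketch only: a generic wrapper that records every call to tracker.set.
type SetFn = (...args: unknown[]) => unknown;

interface TrackerLike {
  set: SetFn;
}

function spyOnSet(tracker: TrackerLike): { calls: unknown[][]; restore: () => void } {
  const original = tracker.set;
  const calls: unknown[][] = [];

  tracker.set = (...args: unknown[]): unknown => {
    calls.push(args); // keep a copy of the arguments for inspection
    return original.call(tracker, ...args); // forward to the real implementation
  };

  return {
    calls,
    restore: () => {
      tracker.set = original;
    },
  };
}
```

In the editor this could be used from the browser console after transpiling, for example by passing window.oTrk (when it is defined) to spyOnSet, running the test-plan steps, and then inspecting the recorded calls before calling restore().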

Related

Part of the telemetry-expansion roadmap, following the data-logging pattern established in PR #2862 (block add/remove tracking) and reusing the same oTrk (tiTrk.with('otter')) plumbing and otter_blocks_logger_flag consent gate. Sibling to the existing ai-generation events (prompt, ai-toolbar) — this PR adds the missing outcome and retry-depth dimensions on top of them.

🤖 Generated with Claude Code

Adds kept-vs-discarded outcome and regenerate retry-depth signals to the AI block via the existing oTrk accumulator. No new free-text capture; no consent bypass; preset ids allowlisted.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

pirate-bot commented Jun 16, 2026


Bundle Size Diff

| Package | Old Size | New Size | Diff |
| --- | --- | --- | --- |
| Animations | 178.24 KB | 178.24 KB | 0 B (0.00%) |
| Blocks | 1.5 MB | 1.5 MB | 1.22 KB (0.08%) |
| CSS | 7.87 KB | 7.87 KB | 0 B (0.00%) |
| Dashboard | 108.48 KB | 108.48 KB | 0 B (0.00%) |
| Onboarding | 68.14 KB | 68.14 KB | 0 B (0.00%) |
| Export Import | 4.7 KB | 4.7 KB | 0 B (0.00%) |
| Pro | 320.08 KB | 320.08 KB | 0 B (0.00%) |


pirate-bot commented Jun 16, 2026


Plugin build for 56dd0f2 is ready 🛎️!


pirate-bot commented Jun 16, 2026


E2E Tests

Playwright Test Status: See serial and parallel matrix jobs

Performance Results

  • serverResponse: {"q25":445.6,"q50":453.3,"q75":482.6,"cnt":10}
  • firstPaint: {"q25":519.3,"q50":588.65,"q75":648.6,"cnt":10}
  • domContentLoaded: {"q25":3366.1,"q50":3396.75,"q75":3434.7,"cnt":10}
  • loaded: {"q25":3366.8,"q50":3397.25,"q75":3435.2,"cnt":10}
  • firstContentfulPaint: {"q25":8940.6,"q50":9018.4,"q75":9047.8,"cnt":10}
  • firstBlock: {"q25":13497.7,"q50":13527.65,"q75":13560.7,"cnt":10}
  • type: {"q25":21.35,"q50":22.88,"q75":24.9,"cnt":10}
  • typeWithoutInspector: {"q25":17.31,"q50":18.99,"q75":19.83,"cnt":10}
  • typeWithTopToolbar: {"q25":28.01,"q50":28.75,"q75":30.08,"cnt":10}
  • typeContainer: {"q25":12.5,"q50":13.56,"q75":14.91,"cnt":10}
  • focus: {"q25":98.44,"q50":102.13,"q75":105.2,"cnt":10}
  • inserterOpen: {"q25":35.13,"q50":36.08,"q75":38.39,"cnt":10}
  • inserterSearch: {"q25":11.75,"q50":12.04,"q75":12.84,"cnt":10}
  • inserterHover: {"q25":4.35,"q50":4.59,"q75":4.7,"cnt":20}
  • loadPatterns: {"q25":1464.87,"q50":1505.17,"q75":1560.56,"cnt":10}
  • listViewOpen: {"q25":203.76,"q50":206.44,"q75":213.36,"cnt":10}

- emit retry depth on the success path with a high-water mark (no clobber by a later reset; failed/aborted regenerations no longer inflate it)
- track 'discard' on real block removal (covers toolbar/Backspace, not just the in-panel X) using a live output ref synced from the prompt component
- reset accepted state on each new generation; record 'insert'/'replace' only after the action actually runs

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
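
The high-water-mark idea in the first bullet of this commit message can be sketched as a small counter. This is an illustration under assumptions, not the committed code: the RetryDepth class and its method names are hypothetical. The two properties it models come from the commit message: a later reset must not lower the reported depth, and failed or aborted regenerations must not raise it.

```typescript
// Sketch only: class and method names are illustrative.
class RetryDepth {
  private count = 0; // regenerations in the current run
  private highWater = 0; // deepest retry count that actually succeeded

  // Call when a fresh (non-regenerate) generation starts.
  startFresh(): void {
    this.count = 0;
  }

  // Call only on the success path; failed or aborted requests never reach this.
  // Returns the raw depth to bucket and report.
  recordSuccess(isRegenerate: boolean): number {
    if (isRegenerate) this.count += 1;
    this.highWater = Math.max(this.highWater, this.count);
    return this.highWater;
  }
}
```

With this shape, a run of one generation plus two successful regenerations reports 2, and starting a fresh generation afterwards still reports 2 rather than falling back to 0.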