Skip to content

Agent Separation & Parallel Slide Generation#71

Draft
ShotaroKataoka wants to merge 96 commits intomainfrom
feat/agent-separation-spec-composer
Draft

Agent Separation & Parallel Slide Generation#71
ShotaroKataoka wants to merge 96 commits intomainfrom
feat/agent-separation-spec-composer

Conversation

@ShotaroKataoka
Copy link
Copy Markdown
Contributor

Agent Separation & Parallel Slide Generation

🚧 Draft — implementation in progress, will be ready for review when parallel generation is complete.

Summary

Split the single monolithic agent into a SPEC agent (Phase 1: user dialogue & specification) and composer agents (Phase 2+3: slide generation & polish), enabling parallel slide generation to reduce total compose time from ~10 min to ~1-2 min.

Motivation

  • Single-agent architecture causes context bloat — Phase 1 dialogue context is carried into Phase 2, increasing token cost and degrading output quality
  • Serial slide generation (1+ min/slide × 10 slides) is unacceptable for web UI UX
  • presentation.json as a single file prevents concurrent writes

Architecture

SPEC Agent (Phase 1) Composer Agent(s) (Phase 2+3)
User dialogue Slide JSON generation
Brief / Outline / Art Dir Measure → Preview → Polish
│ ▲
└── compose_slides(instruction, deck_id) ──┘
(Agents as Tools pattern)

Changes

File splitpresentation.jsondeck.json + slides/{slug}.json

  • Enables per-slide concurrent writes
  • Outline uses slug-based ordering

Agent separation — SPEC agent + composer via compose_slides tool

  • SPEC agent: lightweight model, handles user interaction only
  • Composer: heavier model, pre-loaded with all Phase 2/3 references
  • COMPOSER_MODEL_ID env var for independent model selection

Composer optimization

  • Prefetch all references + deck specs + slides listing into system prompt
  • Explicit "skip MUST run/read" instructions to prevent redundant tool calls
  • deck_id embedded in prompt to prevent init_presentation calls

Progress streamingcompose_slides streams sub-tool progress to WebUI via invoke_async + callback + Queue

Remaining (in this branch)

  • Parallel composer agents (asyncio.gather + semaphore)
  • Group splitting logic for slide batches
  • Local validation of Engine layer changes

…eview

SPEC: 20260413-0806_svg-composition-animation
Phase 1-3: compose module, API integration, WebUI animation

- mcp-server/tools/compose.py: extract_optimized_defs + split_slide_components
  (font strip, PNG→WebP, component split with bbox/class/text metadata)
- server.py: compose after _run_measure (sync, ~300ms for 8 slides)
- api/index.py: defsUrl + composeUrl presigned URLs in deck detail
- AnimatedSlidePreview.tsx: SVG build + diff animation (class+bbox key)
  cursor fly-in → wireframe drag → materialize + typewriter
- SlideCarousel: composeUrl → animated preview, fallback to WebP thumbnail
- DOMPurify sanitization, prefers-reduced-motion, version check fallback
- useWorkspace: presigned URL stabilization for compose/defs URLs
…build cache

- Disable DOMPurify sanitization (strips SVG clip-path, namespaces, fills)
- Fix slidesWithPreview filter to include composeUrl-only slides
- Fix hasSlides to check composeUrl in addition to previewUrl

SPEC: 20260413-0806_svg-composition-animation
… webp separation

- Fix compose.py: use slide-specific SlideBackground over master page
- Epoch-keyed compose S3 keys for proper update detection
- API: _latest_compose_key() to resolve latest epoch
- AnimatedSlidePreview: skip animation on initial load
- Remove WebP generation from run_python (kept in generate_pptx)
- Generate compose for all slides (frontend diff handles animation)
- count_slides() helper in compose.py

SPEC: 20260413-0806_svg-composition-animation
…ose on save-only

- Compose runs on save=True even without measure_slides
- SVG export shared between measure and compose (single conversion)
- Remove WebP from run_python (kept in generate_pptx)
- Auto-scroll to first changed slide before animation
- AnimatedSlidePreview accepts slideId for scroll targeting

SPEC: 20260413-0806_svg-composition-animation
…ration

- Extract _export_svg() from _run_measure for reuse by compose
- Compose generates SVG when measure_slides not present (save-only)
- Add Dockerfile cache-bust for forced image rebuild

SPEC: 20260413-0806_svg-composition-animation
…sition fix

- Add initialLoad prop: page-load compose = instant, session-first = animate all
- Scroll to top of changed slide with 24px padding instead of center

SPEC: 20260413-0806_svg-composition-animation
…a onAnimate callback

SPEC: 20260413-0806_svg-composition-animation
…ewriter fast

SPEC: 20260413-0806_svg-composition-animation
…hange detected

SPEC: 20260413-0806_svg-composition-animation
…p by composeUrl

- Diff uses text+class instead of raw SVG (stable across LibreOffice re-renders)
- Skip re-render when composeUrl base path unchanged (ignore defs-only changes)

SPEC: 20260413-0806_svg-composition-animation
…_1 matched slide_10/11)

SPEC: 20260413-0806_svg-composition-animation
…pose cleanup

- server.py: sourceHash (slide JSON md5) for cross-slide matching
- server.py: 2-level diff (slide-level sourceHash + component-level class+bbox)
- server.py: cleanup old epoch compose files after upload
- AnimatedSlidePreview: simplified to use backend changed flag only
- SlideCarousel: removed initialComposeIds, kept scroll via onAnimate

SPEC: 20260413-0806_svg-composition-animation
…nimation

- server.py: fallback to slot-number diff when sourceHash mismatches
- AnimatedSlidePreview: first render = instant, defer composeUrl changes during animation

SPEC: 20260413-0806_svg-composition-animation
…ction + defer animation

- AnimatedSlidePreview: interval-based check (no useEffect dep on composeUrl),
  skipAnimation prop for instant render on page load
- SlideCarousel: deckReadyRef tracks initial load, detects new slides for scroll
- Remove debug logs

SPEC: 20260413-0806_svg-composition-animation
…t B303

SPEC: 20260413-0806_svg-composition-animation
- AnimatedSlidePreview: reset lastComposeUrlRef when skipAnimation transitions true→false
- generate.py: fallback template blank-dark instead of non-existent default

SPEC: 20260413-0806_svg-composition-animation
…ecision

- Existing deck (slides on mount) → first compose instant, subsequent animate
- New deck (no slides on mount) → all composes animate
- No localStorage, no complex state — just mount-time fact

SPEC: 20260413-0806_svg-composition-animation
- Lint/bias block: 2-space → 4-space indent consistency
- Move 'import re' outside for-loop

SPEC: 20260413-0806_svg-composition-animation
SPEC: 20260414-0111_agent-separation-spec-composer
Progress: Phase C tasks 1-12 (file split block) completed
- Layer 1: _resolve_config supports directory input (deck.json + slides/*.json)
- Storage ABC: 4 abstract methods added (get/put_deck_json, get/put_slide_json)
- AwsStorage: deck.json + slides/{slug}.json read/write implemented
- generate.py: _prepare_workspace/_assemble_slides separated, new/legacy format auto-detect
- sandbox.py: prefix-based scan save, updated _WORKSPACE_PREFIXES
- init.py: creates deck.json instead of presentation.json
- server.py: measure_slides changed from list[int] to list[str] (slug), lint uses _assemble_slides
Next: outline slug化 + WebUI対応
SPEC: 20260414-0111_agent-separation-spec-composer
Progress: Phase C tasks 13-16 (outline slug化 block) completed
- Outline workflow: [N: label] → [slug] kebab-case format
- Art-direction workflow: writes deck.json instead of presentation.json
- outlineParser.ts: slug regex, OutlineSlide uses slug instead of num/title
- OutlineView.tsx: node circle shows index+1, slug as bold label, message below
Next: エージェント分離
…resolution

- _resolve_template: handle directory input correctly (use dir as base_dir, not dir.parent)
- _prepare_workspace: catch only ValueError for deck.json fallback (not all exceptions)
- _prepare_workspace: add warning log for missing outline.md instead of silent pass
- _resolve_config: simplify _resolve_template call
Progress: 16/16 tasks checked off (file split + outline slug blocks)
Notes: implementation details + self-review findings (3 fixes)
…ides tool

SPEC: 20260414-0111_agent-separation-spec-composer
Progress: Phase C tasks 17-21 (agent separation block)
- SPEC agent prompt: Phase 1 only, delegates to compose_slides
- Composer agent prompt: Phase 2+3, no user interaction, deck.json read-only
- compose_slides @tool: Strands Agents as Tools pattern, creates composer Agent per call
- Briefing workflow: add Constraints & Requests + Materials sections
- create_agent: wire compose_slides, rename to SdpmSpecAgent
Next: Phase C verification
Progress: 21/21 Phase C tasks checked off (file split + outline slug + agent separation)
Notes: Strands Agents as Tools investigation + implementation details
Reverts 212502b and 2ebb2eb — .kiro/specs is gitignored and should not be committed
…ser agents

SPEC: composer sub-agents were missing cache_config, causing full prompt
reprocessing on every Bedrock turn. Added CacheConfig(strategy="auto")
to composer_model to match the main agent's configuration.
@okamoto-aws okamoto-aws added blog:pending ブログ記事にする labels Apr 15, 2026
ShotaroKataoka and others added 26 commits April 15, 2026 20:14
…omposer agents

- Split prefetch into system (workflow/spec) and refs (guides/examples)
- System prompt: template + workflow only, with cachePoint for cross-turn caching
- User message: reference data (grid, components, patterns) + deck specs + instruction
- Enables Bedrock prompt cache hits on system prompt across composer LLM turns
- Cross-group cache sharing possible when routed to same region

SPEC: 20260415-1915_composer-parallel-perf
…oser model

Cuts tail latency by failing fast when Bedrock takes too long to return
the first streaming chunk. Default retry strategy will handle retries.
…mpose progress

Agent now sends tool input alongside tool name in progress events.
UI displays detail (purpose, slide_id, etc.) next to each sub-tool label.
…progress via hooks

Use BeforeToolCallEvent/AfterToolCallEvent hooks instead of callback_handler
to get parsed tool input (purpose, slide_id, etc.) for sub-tool display.
callback_handler only has empty input during streaming.
…ail display

Hooks implementation broke sub-tool display in compose progress.
Reverting to working callback_handler-only approach.
Tool input detail display needs proper local testing before re-implementing.
… (verified)

- BeforeToolCallEvent hook: sends input (purpose, slide_id, etc.) confirmed working via local test
- AfterToolCallEvent hook: sends tool completion status
- ChatPanel: updates existing tool entry with input from hook event
- ToolCard: displays detail from getDetail() next to sub-tool label
- Fix: removed !ev.toolUseId filter that was blocking status events
…versation resume

Replace outer serial retry (full restart) with inner retry loop that
reuses the same composer Agent instance. On failure, invoke_async is
called with prompt=None to resume from existing conversation history,
preserving completed tool call results.

SPEC: 20250415-2220_agent-separation-spec-composer
Progress: composer retry with conversation resume
…ompose failure

Wrap compose_slides main logic in try/except so that partial results
(successfully generated slides) are always included in the final report,
even when prefetch or some groups fail unexpectedly.
…nable info

Include failed slugs, failure phase (prefetch/compose), and retryable
flag so the SPEC agent can decide whether to retry or inform the user.
…per composer

Create per-composer MCPClient instances via factories instead of sharing
a single set across all parallel composers. Each composer gets its own
MCP connections, preventing one group's connection failure from affecting
others. Connections are cleaned up in finally block after each group.
On retry, stop old MCP connections, create fresh ones via factories,
and rebuild the composer Agent with conversation history preserved.
This handles cases where MCP connection died mid-execution.
- Clear tool list on retrying event (hide stale tools from failed attempt)
- Show RefreshCw icon with spin animation during retry
- Display retry attempt number in group header
…s in compose UI

- Merge toolResult events into tool entries by toolUseId
- Show Check icon for successful tools, AlertCircle (red) for failed
- Spinner only on the last tool when still executing
- Error tools use ERR color scheme matching the main ToolCard pattern
…ep simple retry

Remove per-composer MCP connections and reconnection logic (caused
initialization failures with too many simultaneous connections).
Revert to shared MCP servers. Keep simple retry with prompt=None
for conversation resume on Bedrock timeouts.
…io to ThreadPoolExecutor

Root cause: asyncio event loop interference caused boto3 streaming
Read timed out when using invoke_async within main agent's tool
execution context.

Fix: Use ThreadPoolExecutor + synchronous composer() calls so each
composer runs in its own thread with independent event loop, matching
the stable path used by the main agent.

Also fix ChatPanel.tsx to propagate attempt/error fields in group
status events, and widen error display in ToolCard.tsx.

SPEC: 20260415-1915_composer-parallel-perf
…switching

- Composer system prompt: explicit prohibition on writing non-assigned slides with reason (parallel data race)
- Composer user message: inject assigned slug list with write restriction reminder
- SPEC agent: add spec files checklist before compose_slides
- WebUI: detect compose_slides tool call for slides tab auto-switch

SPEC: agent-separation-spec-composer
- SPEC agent: enforce all Phase 1 steps in order, require all content in spec files
- AnimatedSlidePreview: reset error on new composeUrl to recover from transient defs 404

SPEC: agent-separation-spec-composer
…cTab declaration

settled useEffect referenced specTab before its useState declaration,
causing 'Cannot access before initialization' at runtime after minification.

SPEC: agent-separation-spec-composer
- Storage: presentation.json → deck.json + slides/{slug}.json
- Deck Workspace: update file listing and outline format
- Restore compose/ in Storage section (still generated, now slug-based)
- Restore LibreOffice in JA Layer 3 diagram (still used for preview/measure)
- Preview Generation: update step 1 description
…entation-maker into feat/agent-separation-spec-composer

# Conflicts:
#	.kiro/steering/tech-public.md
#	api/index.py
#	docs/ja/architecture.md
#	mcp-server/server.py
#	web-ui/package-lock.json
#	web-ui/src/components/deck/AnimatedSlidePreview.tsx
#	web-ui/src/components/deck/SlideCarousel.tsx
… main merge

Port main's sourceHash-based content diff into slug-keyed compose:
- Use hashlib.md5(usedforsecurity=False) for bandit B303 compliance
- Short-circuit: if sourceHash unchanged, mark all components unchanged
- Fallback to bbox/text component-level diff otherwise

This preserves main's intent from commits c1866c1 (sourceHash diff) and
dceede6 (B303) while keeping the branch's slug-keyed compose format.
LLM was putting all 12 slides in 1 group because none had structural
relationships (no core groups). Add explicit rule: split independent
slides into groups of 2-3 for parallel execution.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

blog:pending ブログ記事にする

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants