aws-samples · isadeks · Jun 23, 2026
@@ -1,139 +1,152 @@
 ---
 name: onboard-repo
 description: >-
-  Onboard a new GitHub repository to the ABCA platform by adding a Blueprint CDK
-  construct. Use when the user says "onboard a repo", "add a repository",
-  "register a repo", "new repo", "Blueprint construct", "REPO_NOT_ONBOARDED error",
-  or gets a 422 error about an unregistered repository.
+  Onboard a new GitHub repository to the ABCA platform so the agent can target it.
+  Use when the user says "onboard a repo", "add a repository", "register a repo",
+  "new repo", or gets a `REPO_NOT_ONBOARDED` / 422 error about an unregistered
+  repository.
 ---
 
 # Repository Onboarding
 
-You are guiding the user through onboarding a new GitHub repository to ABCA. Repositories must be registered as `Blueprint` constructs in the CDK stack before tasks can target them.
+You are helping an **operator** register a GitHub repository with their running ABCA
+deployment so tasks can target it.
 
-## Step 1: Gather Repository Details
+There are two paths.
 
-Use AskUserQuestion to collect:
-- **Repository**: GitHub `owner/repo` format
-- **Compute type**: `agentcore` (default) or `ecs`
-- **Model preference**: Claude Sonnet 4 (default), Claude Opus 4 (complex repos), or Claude Haiku (lightweight). **Important:** Models must be specified using their cross-region inference profile ID (e.g. `us.anthropic.claude-opus-4-20250514-v1:0`), not the raw foundation model ID. On-demand invocation of raw model IDs is not supported for most models.
-- **Max turns**: Default 100 (range: 1-500)
-- **Max budget**: USD cost ceiling per task (optional)
-- **Custom GitHub PAT**: If this repo needs a different token than the platform default
+**Prefer the CLI operator path (Path A)** when the repo can run on the
+**platform/default-blueprint** setup — the default GitHub token secret, a model
+already granted to the runtime, and the default egress allowlist. It's a single
+runtime command against the deployed stack: no code change, no redeploy.
 
-## Step 2: Read the Current Stack
+**Use the CDK Blueprint path (Path B)** when the repo needs its **own** config that
+the CLI can't provision at runtime — a per-repo GitHub token, a model not yet
+granted to the runtime, custom egress domains, Cedar HITL policies, or
+system-prompt overrides. These are baked into infrastructure, so they require a
+redeploy by someone with permission to deploy the stack. When in doubt, start with Path A; if a task later
+fails on a missing token / model grant / blocked egress, promote the repo to a
+Blueprint.
 
-Read the CDK stack file to understand existing Blueprint definitions:
+> **This is an operation, not a contribution.** Onboarding a repo into your own
+> deployment writes a record to the platform's RepoTable — it is **not** a change to
+> the `aws-samples` codebase, so the ADR-003 contribution flow (GitHub issue →
+> approval → feature branch) does **not** apply. Only invoke ADR-003 if the user is
+> actually changing the platform source (e.g. wiring a brand-new Bedrock model into
+> the stack — see "Model not yet wired into the runtime" below).
 
+## Path A — CLI operator onboarding (default)
+
+`bgagent repo onboard` writes (or re-activates) the repository's `RepoConfig` row in
+the deployed RepoTable directly. It takes effect immediately — **no `agent.ts` edit,
+no `cdk deploy`.**
+
+```bash
+bgagent repo onboard <owner/repo>
+# common overrides:
+#   --model <inference-profile-id>     e.g. us.anthropic.claude-sonnet-4-6
+#   --compute-type <agentcore|ecs>
+#   --max-turns <n>
+#   --token-secret-arn <arn>           per-repo GitHub token (else platform default)
 ```
-Read cdk/src/stacks/agent.ts
+
+Then confirm it landed:
+
+```bash
+bgagent repo list                 # status should be "active"
+bgagent repo show <owner/repo>    # full resolved config (secret ARNs redacted)
 ```
 
-Identify:
-- Where existing Blueprint constructs are defined
-- The `repoTable` reference used
-- Any patterns for compute/model overrides
+That's it — the repo is onboarded. Submit a task with the `submit-task` skill.
 
-### Sample blueprint repo without a code change
+**Pick a model that is already wired into the runtime.** With no `--model`, the repo
+uses the platform default (Sonnet 4.6). If you pass `--model`, use a cross-Region
+**inference profile ID** (e.g. `us.anthropic.claude-sonnet-4-6`), not a raw
+`anthropic.*` foundation-model ID. Only models the stack has granted the runtime can
+be invoked — see "Model not yet wired into the runtime" before choosing a model the
+deployment doesn't already support.
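The difference between the two ID shapes can be checked mechanically before onboarding. This is only a sketch: `isInferenceProfileId` and `GEO_PREFIXES` are hypothetical names, not part of `bgagent` or the stack, and the prefix list is an assumption that may be incomplete for your Regions.

```typescript
// Hypothetical pre-flight check; not part of bgagent or the ABCA stack.
// Assumption: inference profile IDs look like "<geo>.<provider>.<model>",
// while raw foundation-model IDs look like "<provider>.<model>".
const GEO_PREFIXES: readonly string[] = ["us", "eu", "apac", "global"];

function isInferenceProfileId(modelId: string): boolean {
  const parts = modelId.split(".");
  const prefix = parts[0] ?? "";
  // Require a known geography prefix plus provider and model segments.
  return parts.length >= 3 && GEO_PREFIXES.indexOf(prefix) !== -1;
}
```

A check like this only catches the shape of the ID. It cannot tell whether the runtime has been granted the model, which is covered in "Model not yet wired into the runtime".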
 
-The stack’s **AgentPlugins** blueprint uses a `repo` value resolved in this order: **`BLUEPRINT_REPO`** (environment variable) → CDK context **`blueprintRepo`** → default `awslabs/agent-plugins` (see `blueprintRepo` in `cdk/src/stacks/agent.ts`). If the user only needs to target a fork of the sample repo, they can set `export BLUEPRINT_REPO=owner/repo` or pass `-c blueprintRepo=owner/repo` (or set `"context": { "blueprintRepo": "..." }` in `cdk/cdk.json`) and redeploy, instead of adding a new `Blueprint` construct.
+## Path B — CDK Blueprint (declarative / canonical)
 
-## Step 3: Add the Blueprint Construct
+Use this when the operator wants the repo committed to infrastructure-as-code (so a
+fresh deploy re-creates it) rather than set as a runtime record. This **does** require
+editing the stack and redeploying.
 
-Add a new `Blueprint` construct instance to the stack. Follow the existing pattern. Example:
+1. Read `cdk/src/stacks/agent.ts` to find where `Blueprint` constructs are defined and
+   the `repoTable` reference.
+2. Add a construct following the existing pattern:
 
-```typescript
-new Blueprint(this, 'MyRepoBlueprint', {
-  repo: 'owner/repo',
-  repoTable: repoTable.table,
-  // Optional overrides:
-  // computeType: 'agentcore',
-  // modelId: 'us.anthropic.claude-sonnet-4-20250514-v1:0',
-  // maxTurns: 100,
-  // maxBudgetUsd: 50,
-  // runtimeArn: runtime.runtimeArn,
-  // githubTokenSecretArn: 'arn:aws:secretsmanager:...',
-});
-```
+   ```typescript
+   new Blueprint(this, 'MyRepoBlueprint', {
+     repo: 'owner/repo',
+     repoTable: repoTable.table,
+     // Optional overrides:
+     // computeType: 'agentcore',
+     // modelId: 'us.anthropic.claude-sonnet-4-6',
+     // maxTurns: 100,
+     // maxBudgetUsd: 50,
+     // githubTokenSecretArn: 'arn:aws:secretsmanager:...',
+   });
+   ```
 
-Use a descriptive construct ID derived from the repo name.
+3. Redeploy: `mise //cdk:compile` → `mise //cdk:diff` (show the diff) → `mise //cdk:deploy -- --require-approval never`.
 
-### Model ID and IAM Permissions
+> **Sample-repo shortcut:** the stack's AgentPlugins blueprint resolves its `repo` from
+> `BLUEPRINT_REPO` (env) → CDK context `blueprintRepo` → default `awslabs/agent-plugins`.
+> To target a fork of the sample without adding a construct, set
+> `export BLUEPRINT_REPO=owner/repo` (or `cdk.json` context) and redeploy.
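The lookup order in the shortcut above can be restated as a small function. This is a sketch under assumptions: `resolveBlueprintRepo` and the `Lookup` type are hypothetical, the real logic lives in `cdk/src/stacks/agent.ts` and may differ, and treating an empty string as "not set" is a guess.

```typescript
// Hypothetical restatement of the lookup order; not the stack's actual code.
type Lookup = { [key: string]: string | undefined };

function resolveBlueprintRepo(env: Lookup, context: Lookup): string {
  // Assumption: empty strings count as "not set", hence || rather than ??.
  return (
    env["BLUEPRINT_REPO"] ||
    context["blueprintRepo"] ||
    "awslabs/agent-plugins"
  );
}
```

Passing the environment and context in as plain objects keeps the sketch independent of Node and CDK types.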
 
-When specifying a non-default model via `agent.modelId`, three things are required:
+## Model not yet wired into the runtime (the one real code change)
 
-1. **Use the inference profile ID, not the raw model ID, when Bedrock requires it.** For `InvokeModel` / streaming, specify the cross-Region **inference profile** identifier (or ARN) where the Bedrock User Guide calls for it — not only the bare `anthropic.*` foundation model ID. Examples:
-   - Sonnet 4.6 (US geography profile): `us.anthropic.claude-sonnet-4-6`
-   - Sonnet 4: `us.anthropic.claude-sonnet-4-20250514-v1:0`
-   - Opus 4: `us.anthropic.claude-opus-4-20250514-v1:0`
-   - Haiku 4.5: `us.anthropic.claude-haiku-4-5-20251001-v1:0`
+A repo can only use a model that the **runtime IAM role has been granted via `grantInvoke`**.
+The stack currently wires **Sonnet 4.6, Opus 4 (`claude-opus-4-20250514`), and Haiku 4.5** (see
+the `grantInvoke` block in `agent.ts`). A repo pinned to any **other** model
+(e.g. Opus 4.8 / `us.anthropic.claude-opus-4-8`) onboards successfully through the CLI,
+but its tasks fail at invoke time with a 403.
 
-   See [Use an inference profile in model invocation](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-use.html).
+Adding a new model **is** a platform source change, so it follows ADR-003 (issue →
+approval → feature branch) and requires:
 
-2. **Grant the runtime IAM permissions for the model.** The Blueprint construct does not automatically grant `bedrock:InvokeModel*` — this is by design (least privilege). You must add a `grantInvoke` block in the stack for each model used:
+1. **Wire the model + inference profile and grant the runtime**, in `agent.ts`:
    ```typescript
-   const opusModel = new bedrock.BedrockFoundationModel('anthropic.claude-opus-4-20250514-v1:0', {
+   const model = new bedrock.BedrockFoundationModel('anthropic.claude-opus-4-8', {
      supportsAgents: true,
      supportsCrossRegion: true,
    });
-   opusModel.grantInvoke(runtime);
-
-   const opusProfile = bedrock.CrossRegionInferenceProfile.fromConfig({
+   model.grantInvoke(runtime);
+   const profile = bedrock.CrossRegionInferenceProfile.fromConfig({
      geoRegion: bedrock.CrossRegionInferenceProfileRegion.US,
-     model: opusModel,
+     model,
    });
-   opusProfile.grantInvoke(runtime);
+   profile.grantInvoke(runtime);
    ```
+   then redeploy.
+2. **Account-level Bedrock model access** (separate from IAM): the account must have the
+   model enabled for the Region — complete [model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html)
+   prerequisites (Marketplace actions / Anthropic first-time use where applicable). For
+   cross-Region profiles, IAM and SCPs must allow Bedrock in source **and** destination
+   Regions.
 
-3. **Account-level Bedrock model access (separate from IAM).** The runtime role must be allowed to invoke the model, and the **AWS account** must be able to use that model in Bedrock: complete [model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) prerequisites (AWS Marketplace actions on first serverless use where applicable, Anthropic first-time use / `PutUseCaseForModelAccess` for Anthropic models, valid payment method for Marketplace-backed models). For geographic cross-Region inference profiles, IAM and SCPs must allow Bedrock in **source and destination** Regions per [Supported Regions and models for inference profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html).
-
-## Step 4: Deploy
-
-After adding the Blueprint, the stack must be redeployed:
-
-```bash
-export MISE_EXPERIMENTAL=1
-mise //cdk:compile   # Verify TypeScript compiles
-mise //cdk:test      # Run tests
-mise //cdk:diff      # Preview changes
-```
-
-Show the diff to the user. If it looks correct, ask if they want to deploy now.
-
-```bash
-mise //cdk:deploy
-```
-
-## Step 5: Verify
-
-After deployment, verify the repo config was written to DynamoDB:
-
-```bash
-aws dynamodb scan --table-name <RepoTableName> \
-  --filter-expression "repo = :r" \
-  --expression-attribute-values '{":r":{"S":"owner/repo"}}' \
-  --output json
-```
+If the user just wants the agent working now, steer them to a wired model (Sonnet 4.6)
+via Path A and treat "add model X" as a separate, later change.
 
-## Per-Repository Configuration Reference
+## Per-repository configuration reference
 
 | Setting | Purpose | Default |
 |---------|---------|---------|
 | `compute_type` | Execution strategy | `agentcore` |
-| `runtime_arn` | AgentCore runtime override | Platform default |
-| `model_id` | AI model for tasks | Platform default (Sonnet 4) |
+| `model_id` | AI model for tasks (inference profile ID) | Platform default (Sonnet 4.6) |
 | `max_turns` | Turn limit per task | 100 |
 | `max_budget_usd` | Cost ceiling per task | Unlimited |
 | `system_prompt_overrides` | Custom system instructions | None |
 | `github_token_secret_arn` | Repo-specific GitHub token | Platform default |
 | `poll_interval_ms` | Completion polling frequency | 30000ms |
 
-Task-level parameters override Blueprint defaults. If neither specifies a value, platform defaults apply.
+Task-level parameters override per-repo defaults; if neither specifies a value, platform defaults apply.
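The precedence rule (task, then repo, then platform) can be sketched as follows. The `RepoSettings` type and `resolveSettings` function are hypothetical and the platform's real schema may differ. The defaults are copied from the table above, and the default model ID string is an assumption based on the examples in this document.

```typescript
// Hypothetical types for illustration; not the platform's real schema.
interface RepoSettings {
  compute_type: string;
  model_id: string;
  max_turns: number;
  max_budget_usd: number | undefined; // undefined means no cost ceiling
}
type Overrides = Partial<RepoSettings>;

// Defaults taken from the reference table; the model ID is an assumption.
const PLATFORM_DEFAULTS: RepoSettings = {
  compute_type: "agentcore",
  model_id: "us.anthropic.claude-sonnet-4-6",
  max_turns: 100,
  max_budget_usd: undefined,
};

// Task-level values win, then per-repo values, then platform defaults.
function resolveSettings(task: Overrides, repo: Overrides): RepoSettings {
  return {
    compute_type: task.compute_type ?? repo.compute_type ?? PLATFORM_DEFAULTS.compute_type,
    model_id: task.model_id ?? repo.model_id ?? PLATFORM_DEFAULTS.model_id,
    max_turns: task.max_turns ?? repo.max_turns ?? PLATFORM_DEFAULTS.max_turns,
    max_budget_usd: task.max_budget_usd ?? repo.max_budget_usd ?? PLATFORM_DEFAULTS.max_budget_usd,
  };
}
```

For example, a task that sets only `max_turns` still inherits the repo's `model_id` and the platform's `compute_type`.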
 
-## Common Issues
+## Common issues
 
-- **422 "Repository not onboarded"** — Blueprint hasn't been deployed yet. Add the construct and redeploy.
-- **Preflight failures after onboarding** — GitHub PAT may lack permissions for the new repo. Check the PAT's fine-grained access includes the target repository with Contents (read/write) and Pull requests (read/write) permissions.
-- **400 "Invocation with on-demand throughput isn't supported"** — The Blueprint `modelId` is using a raw foundation model ID instead of an inference profile ID. Change e.g. `anthropic.claude-opus-4-20250514-v1:0` to `us.anthropic.claude-opus-4-20250514-v1:0`.
-- **403 "not authorized to perform bedrock:InvokeModelWithResponseStream"** — The runtime IAM role lacks permissions for the model specified in the Blueprint. Add `grantInvoke` for both the model and its cross-region inference profile in `agent.ts`.
-- **Model not available / "not available on your Bedrock deployment"** — IAM is not the whole story: the account must meet [Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) for that model family and Region, and `modelId` should be an **enabled** inference profile ID (for example `us.anthropic.claude-sonnet-4-6`) where Bedrock requires it. After fixing access in the console, align Blueprint / DynamoDB `model_id` and redeploy if you change IAM grants.
+- **`REPO_NOT_ONBOARDED` / 422** — the repo isn't registered. Run `bgagent repo onboard <owner/repo>` (Path A). Confirm the `owner/repo` matches exactly what you pass to `bgagent submit --repo`.
+- **Preflight failure after onboarding** — the GitHub PAT lacks access to the new repo. Ensure the token has Contents (read/write) + Pull requests (read/write) on it, or onboard with a repo-specific `--token-secret-arn`.
+- **400 "Invocation with on-demand throughput isn't supported"** — `model_id` is a raw foundation-model ID; use the inference-profile ID (e.g. `us.anthropic.claude-sonnet-4-6`).
+- **403 "not authorized to perform bedrock:InvokeModelWithResponseStream"** — the repo's model isn't wired into the runtime. See "Model not yet wired into the runtime."
+- **Model not available / "not available on your Bedrock deployment"** — account-level Bedrock access isn't enabled for that model/Region (separate from IAM); complete [model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html), then use an enabled inference-profile ID.
@@ -48,28 +48,33 @@ Run these steps in order, verifying each:
 6. `mise run install` — Install all workspace dependencies
 7. `mise run build` — Full monorepo build (agent quality + CDK + CLI + docs)
 
-If `mise run install` fails with "yarn: command not found", Corepack wasn't activated. If `prek install` fails about `core.hooksPath`, another hook manager owns hooks — suggest `git config --unset-all core.hooksPath`.
+Common Phase 2 snags to pre-empt (don't let these read as a broken environment):
+- "yarn: command not found" → Corepack wasn't activated (step 3).
+- `prek install` fails about `core.hooksPath` → another hook manager owns hooks; suggest `git config --unset-all core.hooksPath`.
+- Node, Yarn, AND CDK all "not found" at once → expected before `mise install` finishes; mise provisions them.
+- `mise install` fails GPG verification while installing Node (headless/EC2, no gpg-agent) → run `mise settings set node.gpg_verify false` (downloads are still checksum-verified), then retry.
+- "config not trusted" for `~/.config/mise/config.toml` → run `mise trust` on the user-global config too, not just the project one.
+- In a non-interactive/spawned shell, `mise` may not be on `PATH` → call it by its full path (`~/.local/bin/mise`), for example `~/.local/bin/mise exec -- <command>`.
 
-## Phase 3: One-Time AWS Setup
+## Phase 3: One-Time Host Setup (build architecture)
 
-On a fresh AWS account, X-Ray needs a CloudWatch Logs resource policy before it can write spans. Run both commands — the first creates the policy, the second sets the destination:
+The agent image is built for **linux/arm64** (AgentCore runs on Graviton). On an **x86_64** build host this is the most common first-deploy blocker — the image build dies with `exec /bin/sh: exec format error`. Register QEMU emulation once per host:
 
 ```bash
-ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
-aws logs put-resource-policy \
-  --policy-name xray-spans-policy \
-  --policy-document "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Sid\":\"XRaySpansAccess\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"xray.amazonaws.com\"},\"Action\":[\"logs:PutLogEvents\",\"logs:CreateLogGroup\",\"logs:CreateLogStream\"],\"Resource\":[\"arn:aws:logs:*:${ACCOUNT_ID}:log-group:aws/spans\",\"arn:aws:logs:*:${ACCOUNT_ID}:log-group:aws/spans:*\"]}]}"
-aws xray update-trace-segment-destination --destination CloudWatchLogs
+docker run --privileged --rm tonistiigi/binfmt --install arm64
 ```
 
-These must be run once per AWS account before first deployment. If the `put-resource-policy` step is skipped, the `update-trace-segment-destination` command fails with `AccessDeniedException`.
+If `docker run --privileged` is blocked (security-managed hosts), deploy from a **native arm64 host** (Graviton EC2 / Apple Silicon) instead. On Apple Silicon / arm64 hosts, skip this phase.
+
+**X-Ray tracing is OPTIONAL — do not gate deployment on it.** The stack ships with X-Ray→CloudWatch-Logs export disabled (`tracingEnabled` in `agent.ts`), so it deploys and runs fully without any X-Ray setup. Do NOT run `aws xray update-trace-segment-destination` as a prerequisite — on a security-managed AWS Org account an SCP can make that call fail with `AccessDeniedException` no matter what, dead-ending the user on a step the platform doesn't use. Mention tracing only as an opt-in extra.
 
 ## Phase 4: First Deployment
 
 Guide through:
 
 1. `mise //cdk:bootstrap` — Bootstrap CDK (if not already done for this account/region)
-2. `mise //cdk:deploy` — Deploy the stack (~9.5 minutes)
+2. `mise //cdk:deploy -- --require-approval never` — Deploy the stack (~9.5 minutes). The flag avoids the approval prompt hanging in a non-interactive shell.
+   - If the deploy rolls back on a missing IAM permission and lands in `ROLLBACK_COMPLETE`, the stack can't be updated — `mise //cdk:destroy` then redeploy. Teardown can stall in `DELETE_FAILED` for ~20–40 min while AgentCore's service-managed (Hyperplane) ENIs are reclaimed; wait, then retry destroy. Never force-delete past stuck VPC resources (orphans the VPC; VPCs are quota-capped per Region).
 3. Retrieve stack outputs:
    ```bash
    aws cloudformation describe-stacks --stack-name backgroundagent-dev \