Mirrowel · claw-io · Apr 5, 2026
@@ -0,0 +1 @@
+/home/b3nw/.gemini/config/projects/22148748-73e2-4bca-aff2-c0bfb511dd11.json
@@ -22,10 +22,13 @@ ENV/
 
 # Build
 *.egg-info/
-dist/
 build/
 .eggs/
 
+# Web UI (only package.json, package-lock.json, and src/ are needed)
+webui/node_modules
+webui/dist
+
 # Logs (will be mounted as volume)
 logs/
 

@@ -234,6 +234,33 @@
 # Examples:
 # QUOTA_GROUPS_GEMINI_CLI_PRO="gemini-2.5-pro,gemini-3-pro-preview"
 
+# --- Model Fallback / Spillover ---
+# Configure fallback providers for specific models. When a prefixed request
+# (e.g., google/gemma-4-31b-it) exhausts all credentials on the primary
+# provider due to scaling issues or errors, the proxy will automatically
+# try the listed fallback providers in order.
+#
+# This does NOT affect unprefixed requests (those use MODEL_ALIAS instead).
+# Fallback only triggers on transient provider failures (5xx, rate limits,
+# connection errors). Request-level errors (400, 401, 403) are never retried.
+#
+# Format: MODEL_FALLBACK_<MODEL_NAME>=provider1[:model1],provider2[:model2][|retry_mode]
+#
+# Model name: dashes → underscores, dots → underscores, uppercased
+#   gemma-4-31b-it → GEMMA_4_31B_IT
+#
+# If no :model is specified after a provider, the original model name is used.
+#
+# Retry mode (appended after |):
+#   exhaust      - (Default) Try all credentials on each fallback provider
+#                  before moving to the next. Gives each provider a full shot.
+#   round_robin  - Try one credential per provider, cycling through.
+#
+# Examples:
+# MODEL_FALLBACK_GEMMA_4_31B_IT="nvidia_nim,ollama_cloud"
+# MODEL_FALLBACK_GEMMA_4_31B_IT="nvidia_nim:google/gemma-4-31b-it,ollama_cloud:gemma-4-31b-it|exhaust"
+# MODEL_FALLBACK_DEEPSEEK_V3="nvidia_nim,google|round_robin"
+
 # ------------------------------------------------------------------------------
 # | [ADVANCED] Fair Cycle Rotation                                              |
 # ------------------------------------------------------------------------------
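
The `MODEL_FALLBACK_*` format described in the hunk above can be sketched as a small parser. This is a minimal illustration only, not the proxy's implementation: `fallback_env_name`, `parse_fallback`, and `FallbackTarget` are hypothetical names, and rejecting an unknown retry mode with `ValueError` is an assumption. The variable naming rule, the `provider[:model]` entries, and the `exhaust` default come from the comments.

```python
from __future__ import annotations

from dataclasses import dataclass

# Retry modes named in the .env comments.
VALID_RETRY_MODES = {"exhaust", "round_robin"}


@dataclass(frozen=True)
class FallbackTarget:
    """One fallback hop (hypothetical helper type)."""
    provider: str
    model: str


def fallback_env_name(model_name: str) -> str:
    """Dashes and dots become underscores, then the name is uppercased."""
    return "MODEL_FALLBACK_" + model_name.replace("-", "_").replace(".", "_").upper()


def parse_fallback(value: str, original_model: str) -> tuple[list[FallbackTarget], str]:
    """Split 'provider1[:model1],provider2[:model2][|retry_mode]'."""
    spec, sep, mode = value.partition("|")
    mode = mode.strip() if sep else "exhaust"  # documented default
    if mode not in VALID_RETRY_MODES:
        # Assumption: unknown modes are rejected rather than ignored.
        raise ValueError(f"unknown retry mode: {mode!r}")

    targets: list[FallbackTarget] = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        provider, _, model = entry.partition(":")
        # No ':model' after the provider means the original model name is reused.
        targets.append(FallbackTarget(provider.strip(), model.strip() or original_model))
    return targets, mode


if __name__ == "__main__":
    print(fallback_env_name("gemma-4-31b-it"))
    print(parse_fallback("nvidia_nim,google|round_robin", "deepseek-v3"))
```

Splitting on `|` before splitting on `,` keeps the retry mode from being read as part of the last provider's model name.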
@@ -369,6 +396,99 @@
 # Default: 8085
 # GEMINI_CLI_OAUTH_PORT=8085
 
+# ------------------------------------------------------------------------------
+# | [CODEX] OpenAI Codex Provider Configuration                                |
+# ------------------------------------------------------------------------------
+#
+# Codex provider uses OAuth authentication with OpenAI's ChatGPT backend API.
+# Credentials are stored in oauth_creds/ directory as codex_oauth_*.json files.
+#
+
+# --- Reasoning Effort ---
+# Controls how much "thinking" the model does before responding.
+# Higher effort = more thorough reasoning but slower responses.
+#
+# Available levels (model-dependent):
+#   - low: Minimal reasoning, fastest responses
+#   - medium: Balanced (default)
+#   - high: More thorough reasoning
+#   - xhigh: Maximum reasoning (gpt-5.2, gpt-5.2-codex, gpt-5.3-codex, gpt-5.1-codex-max only)
+#
+# Can also be controlled per-request via:
+#   1. Model suffix: codex/gpt-5.2:high
+#   2. Request param: "reasoning_effort": "high"
+#
+# CODEX_REASONING_EFFORT=medium
+
+# --- Reasoning Summary ---
+# Controls how reasoning is summarized in responses.
+# Options: auto, concise, detailed, none
+# CODEX_REASONING_SUMMARY=auto
+
+# --- Reasoning Output Format ---
+# How reasoning/thinking is presented in responses.
+# Options:
+#   - think-tags: Wrap in <think>...</think> tags (default, matches other providers)
+#   - raw: Include reasoning as-is
+#   - none: Don't include reasoning in output
+# CODEX_REASONING_COMPAT=think-tags
+
+# --- Identity Override ---
+# When true, injects an override that tells the model to prioritize
+# user-provided system prompts over the required opencode instructions.
+# CODEX_INJECT_IDENTITY_OVERRIDE=true
+
+# --- Instruction Injection ---
+# When true, injects the required opencode system instruction.
+# Only disable if you know what you're doing (API may reject requests).
+# CODEX_INJECT_INSTRUCTION=true
+
+# --- Empty Response Handling ---
+# Number of retry attempts when receiving empty responses.
+# CODEX_EMPTY_RESPONSE_ATTEMPTS=3
+
+# Delay (seconds) between empty response retries.
+# CODEX_EMPTY_RESPONSE_RETRY_DELAY=2
+
+# --- OAuth Configuration ---
+# OAuth callback port for Codex interactive authentication.
+# Default: 8086
+# CODEX_OAUTH_PORT=8086
+
+
+# --- GitHub Copilot ---
+# GitHub Copilot provider uses Device Flow OAuth.
+# The GitHub OAuth token (long-lived) is used to derive short-lived
+# Copilot API tokens (~30 min expiry, refreshed automatically).
+#
+# Numbered credential format (recommended for multiple accounts):
+#   COPILOT_1_GITHUB_TOKEN=gho_xxxxx  (first GitHub account)
+#   COPILOT_2_GITHUB_TOKEN=gho_yyyyy  (second GitHub account)
+#
+# Legacy single-credential format:
+#   COPILOT_GITHUB_TOKEN=gho_xxxxx
+#
+# Optional: override the default model list
+#   COPILOT_MODELS=gpt-4o,claude-sonnet-4,gemini-2.5-pro
+#
+# To obtain a GitHub OAuth token, run the proxy with --add-credential
+# and select the Copilot provider, or use the interactive Device Flow
+# by starting the proxy without any COPILOT env vars.
+
+# --- KiloCode ---
+# KiloCode is configured as a custom OpenAI-compatible provider.
+# API key and base URL follow the standard pattern:
+#   KILO_API_BASE=https://api.kilo.ai/api/openrouter/
+#   KILO_API_KEY_1="your-kilo-api-key"
+#
+# Optional: credit balance monitoring via the Kilo web dashboard.
+# Obtain this value from the browser cookie __Secure-next-auth.session-token
+# after logging in to https://app.kilo.ai/profile
+# The token auto-refreshes (~30-day TTL) and the proxy keeps it alive.
+# If absent or expired, requests still work — quota simply shows as unknown.
+#KILO_SESSION_TOKEN=""
+#KILO_QUOTA_REFRESH_INTERVAL=600
+
 # ------------------------------------------------------------------------------
 # | [ADVANCED] Debugging / Logging                                              |
 # ------------------------------------------------------------------------------
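
The Codex reasoning-effort comments describe three sources for the effort level: the `CODEX_REASONING_EFFORT` default, a model suffix such as `codex/gpt-5.2:high`, and a `reasoning_effort` request parameter. A rough sketch of how they could be combined follows. The precedence used here (suffix, then request parameter, then default) and the `ValueError` for `xhigh` on an unlisted model are assumptions; the diff does not state either. `split_effort_suffix` and `resolve_effort` are hypothetical names.

```python
from __future__ import annotations

# Levels and the xhigh-capable models as listed in the .env comments.
EFFORT_LEVELS = ("low", "medium", "high", "xhigh")
XHIGH_MODELS = {"gpt-5.2", "gpt-5.2-codex", "gpt-5.3-codex", "gpt-5.1-codex-max"}


def split_effort_suffix(model: str) -> tuple[str, str | None]:
    """Split 'codex/gpt-5.2:high' into ('codex/gpt-5.2', 'high')."""
    base, sep, suffix = model.rpartition(":")
    if sep and suffix in EFFORT_LEVELS:
        return base, suffix
    return model, None


def resolve_effort(
    model: str,
    request_effort: str | None = None,
    env_default: str = "medium",
) -> tuple[str, str]:
    """Return (model without suffix, effective effort level)."""
    base, suffix_effort = split_effort_suffix(model)
    # Assumed precedence: model suffix, then request param, then env default.
    effort = suffix_effort or request_effort or env_default
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    bare_model = base.split("/", 1)[-1]  # drop the 'codex/' provider prefix
    if effort == "xhigh" and bare_model not in XHIGH_MODELS:
        # Assumption: unsupported combinations are rejected, not downgraded.
        raise ValueError(f"{bare_model} does not support xhigh")
    return base, effort


if __name__ == "__main__":
    print(resolve_effort("codex/gpt-5.2:high"))
```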

@@ -0,0 +1,172 @@
+#!/usr/bin/env python3
+"""Validate the LLM-API-Key-Proxy fork stack metadata.
+
+This script intentionally uses only the Python standard library so it can run in
+fresh workspaces without installing project dependencies.
+"""
+
+from __future__ import annotations
+
+import re
+import subprocess
+import sys
+from pathlib import Path
+
+ROOT = Path(__file__).resolve().parents[1]
+STACK = ROOT / ".fork" / "stack.yml"
+FEATURES = ROOT / ".fork" / "features"
+AGENTS = ROOT / "AGENTS.md"
+
+SUBJECT_RE = re.compile(r"^\s*subject:\s+\"(?P<subject>.+)\"\s*$")
+ID_RE = re.compile(r"^\s*- id:\s+(?P<id>[A-Za-z0-9_.-]+)\s*$")
+DUP_FEATURE_RE = re.compile(r"^\s{4}(?P<feature>[A-Za-z0-9_.-]+):\s*$")
+DUP_SUBJECT_RE = re.compile(r"^\s{6}-\s+\"(?P<subject>.+)\"\s*$")
+PREFIX_RE = re.compile(r"^(?P<kind>feat|fix)\((?P<feature>[^)]+)\):")
+
+
+def git(*args: str) -> str:
+    return subprocess.check_output(["git", *args], cwd=ROOT, text=True)
+
+
+def parse_manifest() -> tuple[dict[str, str], dict[str, str], dict[str, set[str]]]:
+    text = STACK.read_text()
+    ids: dict[str, str] = {}
+    subjects: dict[str, str] = {}
+    allowed_duplicates: dict[str, set[str]] = {}
+    current_id: str | None = None
+    in_allowed = False
+    current_allowed: str | None = None
+
+    for line in text.splitlines():
+        if line.strip() == "allowed_duplicate_features:":
+            in_allowed = True
+            current_allowed = None
+            continue
+        if line.startswith("features:"):
+            in_allowed = False
+            current_allowed = None
+            continue
+        if in_allowed:
+            m = DUP_FEATURE_RE.match(line)
+            if m:
+                current_allowed = m.group("feature")
+                allowed_duplicates.setdefault(current_allowed, set())
+                continue
+            m = DUP_SUBJECT_RE.match(line)
+            if m and current_allowed is not None:
+                allowed_duplicates.setdefault(current_allowed, set()).add(m.group("subject"))
+            continue
+
+        m = ID_RE.match(line)
+        if m:
+            current_id = m.group("id")
+            ids[current_id] = ""
+            continue
+        m = SUBJECT_RE.match(line)
+        if m and current_id is not None:
+            subjects[m.group("subject")] = current_id
+            ids[current_id] = m.group("subject")
+            current_id = None
+
+    return ids, subjects, allowed_duplicates
+
+
+def stack_subjects() -> list[str]:
+    output = git("log", "--format=%s", "--reverse", "upstream/dev..HEAD")
+    return [line for line in output.splitlines() if line]
+
+
+def check_agents(errors: list[str]) -> None:
+    text = AGENTS.read_text()
+    release_notes = sum(1 for line in text.splitlines() if line.strip() == "### Release Notes")
+    if release_notes != 1:
+        errors.append(f"AGENTS.md must contain exactly one '### Release Notes' heading (found {release_notes})")
+    if text.count("```") % 2:
+        errors.append("AGENTS.md has unbalanced fenced code blocks")
+    if any(line.strip() == "git add -A" for line in text.splitlines()):
+        errors.append("AGENTS.md contains an executable `git add -A` example")
+    for marker in ("<<<<<<<", ">>>>>>>"):
+        if marker in text:
+            errors.append(f"AGENTS.md contains conflict marker {marker}")
+    if ".fork/features" not in text:
+        errors.append("AGENTS.md must document .fork/features as canonical feature history")
+    if "local workspace state" not in text.lower():
+        errors.append("AGENTS.md must state that local workspace state is non-canonical")
+
+
+def check_stack(errors: list[str]) -> None:
+    ids, manifest_subjects, allowed_duplicates = parse_manifest()
+    subjects = stack_subjects()
+    stack_set = set(subjects)
+
+    for subject in manifest_subjects:
+        if subject not in stack_set:
+            errors.append(f"manifest subject not found in stack: {subject}")
+
+    for subject in subjects:
+        if subject not in manifest_subjects:
+            m = PREFIX_RE.match(subject)
+            if not m:
+                errors.append(f"stack commit lacks known manifest subject and feature prefix: {subject}")
+                continue
+            feature = m.group("feature")
+            allowed = allowed_duplicates.get(feature, set())
+            if subject not in allowed:
+                errors.append(f"stack commit is not in manifest or allowed exceptions: {subject}")
+
+    by_feature: dict[str, list[str]] = {}
+    for subject in subjects:
+        m = PREFIX_RE.match(subject)
+        if not m:
+            continue
+        by_feature.setdefault(m.group("feature"), []).append(subject)
+
+    for feature, feature_subjects in sorted(by_feature.items()):
+        if len(feature_subjects) <= 1:
+            continue
+        allowed = allowed_duplicates.get(feature, set())
+        manifest_for_feature = [s for s, fid in manifest_subjects.items() if fid == feature]
+        # Multiple commits are allowed only when every commit is either the canonical
+        # manifest subject for that feature or an explicitly documented exception.
+        permitted = set(allowed) | set(manifest_for_feature)
+        if any(s not in permitted for s in feature_subjects):
+            errors.append(f"feature {feature!r} has unexpected duplicate stack commits: {feature_subjects}")
+
+    for feature_id in ids:
+        feature_file = FEATURES / f"{feature_id}.md"
+        if not feature_file.exists():
+            # Only require detailed histories for features that have a feature file
+            # once they change under the new workflow. Keep stack-wide adoption
+            # incremental instead of forcing 20+ stub docs on day one.
+            continue
+        text = feature_file.read_text()
+        subject = ids[feature_id]
+        if subject and subject not in text:
+            errors.append(f"{feature_file} does not mention its stack subject")
+
+
+def main() -> int:
+    errors: list[str] = []
+    if not STACK.exists():
+        errors.append("missing .fork/stack.yml")
+    if not FEATURES.exists():
+        errors.append("missing .fork/features/")
+    if not AGENTS.exists():
+        errors.append("missing AGENTS.md")
+    if not errors:
+        check_agents(errors)
+        check_stack(errors)
+
+    if errors:
+        print("fork stack validation failed:", file=sys.stderr)
+        for err in errors:
+            print(f"- {err}", file=sys.stderr)
+        return 1
+
+    print("fork stack validation passed")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
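
The real `.fork/stack.yml` is not part of this diff, so the manifest layout the script expects has to be read off its regular expressions. The sketch below applies the same four patterns to a hypothetical manifest fragment to show which line shapes `parse_manifest()` recognizes. The `SAMPLE` contents (including the `fix(ci)` subject) and the `classify` helper are invented for illustration; only the regexes and the `feat(ci)` subject are copied from the PR.

```python
import re

# Patterns copied from the validation script above.
SUBJECT_RE = re.compile(r"^\s*subject:\s+\"(?P<subject>.+)\"\s*$")
ID_RE = re.compile(r"^\s*- id:\s+(?P<id>[A-Za-z0-9_.-]+)\s*$")
DUP_FEATURE_RE = re.compile(r"^\s{4}(?P<feature>[A-Za-z0-9_.-]+):\s*$")
DUP_SUBJECT_RE = re.compile(r"^\s{6}-\s+\"(?P<subject>.+)\"\s*$")

# Hypothetical manifest fragment; the indentation matters because the
# duplicate patterns require exactly 4 and 6 leading whitespace characters.
SAMPLE = """\
allowed_duplicate_features:
    ci:
      - "fix(ci): hypothetical follow-up commit"
features:
  - id: ci
    subject: "feat(ci): fork-aware release notes with incremental topic diff"
"""

PATTERNS = (
    ("dup_feature", DUP_FEATURE_RE),
    ("dup_subject", DUP_SUBJECT_RE),
    ("id", ID_RE),
    ("subject", SUBJECT_RE),
)


def classify(line: str) -> str:
    """Name the first pattern that matches the line, or 'other'."""
    for name, pattern in PATTERNS:
        if pattern.match(line):
            return name
    return "other"


if __name__ == "__main__":
    for line in SAMPLE.splitlines():
        print(f"{classify(line):12} {line}")
```

Because the parser is line-based rather than a YAML loader, re-indenting the `allowed_duplicate_features` block would silently drop its entries instead of raising an error.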
@@ -0,0 +1,51 @@
+## 2026-06-21 — Fix release job failing when short SHA length differs between runners
+
+Target: `feat(ci): fork-aware release notes with incremental topic diff` (`ea5f239`)
+
+Files:
+- `.github/workflows/build.yml`
+
+Working commit before autosquash:
+- TBD — created via `fixup! feat(ci): ...`
+
+Final stack commit after autosquash:
+- TBD — folded into `feat(ci): ...`
+
+### Why
+
+Run 27859339250 / job 82452676947 failed in **Generate Build Metadata**
+with `find: 'release-assets': No such file or directory`.
+
+Root cause: `git rev-parse --short HEAD` returns the minimum length
+needed for SHA uniqueness in the local object DB — and that length is
+not deterministic across runners. For run 27859339250 the build jobs
+uploaded artifacts named `proxy-app-build-{Linux,macOS,Windows}-afec625`
+(7 chars) while the release job filtered with
+`proxy-app-build-*-afec6255` (8 chars). Zero artifacts matched, the
+download step exited 0 anyway, and the next bash step (`set -e -o pipefail`)
+crashed on the missing directory.
+
+### Fix
+
+1. Pin both `Get short SHA` steps (build job and release job) to
+   `git rev-parse --short=7 HEAD` so they always agree.
+2. Add a defensive `Verify downloaded artifacts` step right after the
+   download that fails with a clear error and lists the available
+   artifacts when the download silently matched zero items.
+
+### Verification
+
+- `python3 -c "import yaml; yaml.safe_load(open('.github/workflows/build.yml'))"` — OK
+- 7-char SHA matches the length already in use for artifact names, so
+  no re-upload of historical artifacts is required.
+- Recommended: re-run the failed workflow after the fix is folded into
+  the `feat(ci)` stack commit and pushed.
+
+### Notes / risks
+
+- A fully orthogonal future fix is to pin everything to the full
+  40-char SHA — that decouples the artifact name from git's notion of
+  "short" entirely.
+- Another option is to drop `pattern:` on `download-artifact@v4` and
+  filter by an explicit list (artifact IDs or full names) — `pattern:`
+  glob matching across multi-runner SHA lengths is a recurring foot-gun.
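
The mismatch described in the note can be reproduced offline with the artifact names it quotes. This is a rough sketch: it uses Python's `fnmatch.fnmatchcase` as a stand-in for the glob matching done by `download-artifact`, which is an approximation that only holds for simple `*` patterns like these. `matching` and `verify_downloaded` are hypothetical helpers that mirror the idea of the `Verify downloaded artifacts` step, not its actual contents.

```python
from fnmatch import fnmatchcase

# Artifact names uploaded by the build jobs in the failed run (7-char SHA).
UPLOADED = [f"proxy-app-build-{os_name}-afec625" for os_name in ("Linux", "macOS", "Windows")]


def matching(pattern: str, names: list[str]) -> list[str]:
    """Return the artifact names that match the glob pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


def verify_downloaded(pattern: str, names: list[str]) -> list[str]:
    """Fail loudly when a pattern matches nothing, listing what is available."""
    matched = matching(pattern, names)
    if not matched:
        raise RuntimeError(f"no artifacts matched {pattern!r}; available: {names}")
    return matched


if __name__ == "__main__":
    # The release job's 8-char pattern matches none of the 7-char names.
    print(matching("proxy-app-build-*-afec6255", UPLOADED))
    # With both sides pinned to 7 characters, all three artifacts match.
    print(verify_downloaded("proxy-app-build-*-afec625", UPLOADED))
```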