From 3174d658373bf75f73f01973cdb660a5b2823ef6 Mon Sep 17 00:00:00 2001
From: Algis Dumbris
Date: Thu, 21 May 2026 21:23:24 +0300
Subject: [PATCH 1/2] chore: remove tracked report/backup junk from repo
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- scanner-qa-report.html (root, generated, unreferenced) - also de-clutters
  the top-level listing so the README surfaces sooner.
- specs/005-.../tasks.md.bak (editor backup)
- specs/044-.../test-report.html (generated report)
---
 scanner-qa-report.html | 417 ---------
 .../tasks.md.bak | 301 -------
 .../044-diagnostics-taxonomy/test-report.html | 838 ------------------
 3 files changed, 1556 deletions(-)
 delete mode 100644 scanner-qa-report.html
 delete mode 100644 specs/005-rest-management-integration/tasks.md.bak
 delete mode 100644 specs/044-diagnostics-taxonomy/test-report.html

diff --git a/scanner-qa-report.html b/scanner-qa-report.html
deleted file mode 100644
index 1c267daf..00000000
--- a/scanner-qa-report.html
+++ /dev/null
@@ -1,417 +0,0 @@
- - - - - -MCPProxy Security Scanner QA Report - - - -
- -

MCPProxy Security Scanner QA Report

-

Comprehensive audit of scanning feature across all server types | 2026-04-06 | Branch: feat/039-security-scanner-plugins

- - - - -

1. Executive Overview

- -
-
-
11
-
Servers Tested
-
-
-
6
-
Scanners Installed
-
-
-
214
-
Total Scans Run
-
-
-
42
-
Total Bugs Found
-
-
-
6
-
Bugs Fixed
-
-
-
1,899
-
Total Findings
-
-
- -
-
Findings by Severity (Global Overview)
- - - - - - -
Level | Count | Percentage
Critical | 14 | 0.7%
High | 631 | 33.2%
Medium | 1,163 | 61.2%
Low | 91 | 4.8%
-

- Threat classification: 247 dangerous, 473 warnings, 1,179 informational -

-
- - -

2. Testing Coverage

- -
-
QA Methodology
- - - - - - - -
Phase | Method | Scope
API Testing | curl + jq | 72 API tests across all scan endpoints
Frontend Code Review | Static analysis | ServerDetail.vue (2,000+ lines), Security.vue, api.ts
Backend Code Review | Static analysis | service.go, engine.go, source_resolver.go, registry_bundled.go, security_scanner.go
Visual UI Testing | Chrome screenshots | Global Security page, server detail security tabs
Scanner Quality | False positive analysis | All findings from cisco-mcp-scanner, trivy, semgrep
-
- -
-
Server Types Tested
- - - - - - - - -
Type | Servers | Source Method | Scanners Run
HTTP (remote) | context7, hugginface, kaggle, supabase | url | 1-6
Streamable-HTTP (remote) | kubic, synapbus | url | 1-3
Stdio (local) | demo-filesystem | working_dir | 6
Stdio (Docker) | perplexity, screenshot-website-fast | docker_extract | 3-6
Stdio (quarantined) | malicious-demo | uvx_cache | 6 (1 failed)
Stdio (disconnected) | everything-server | npx_cache | 6
-
- - -

3. Server-by-Server Scan Results

- -
- -
-
context7
-
HTTP | https://mcp.context7.com/mcp | 2 tools
-
Risk: 60 | 2 False Positives
-
cisco-mcp-scanner: 2 findings (PROMPT INJECTION)
-
semgrep-mcp: 0 findings
-
trivy-mcp: 0 findings
-
ramparts: 0 findings
-
nova-proximity: 0 findings
-
mcp-scan: 0 findings
-
- -
-
demo-filesystem
-
Stdio (local) | working_dir | 14 tools
-
Risk: 31 | 7 Findings
-
trivy-mcp: 5 findings (secrets)
-
semgrep-mcp: 2 findings (secrets)
-
cisco-mcp-scanner: 0 findings
-
ramparts/nova/mcp-scan: 0 findings
-
- -
-
perplexity
-
Stdio (Docker) | docker_extract | 3 tools
-
Risk: 20 | 2 CVEs
-
trivy-mcp: 2 findings (MCP SDK CVEs)
-
All other scanners: 0 findings
-

CVE-2025-66414 (DNS rebinding), CVE-2026-0621 (ReDoS) in @modelcontextprotocol/sdk

-
- -
-
malicious-demo
-
Stdio (quarantined) | uvx_cache | 0 tools (disconnected)
-
Risk: 0 | Scan Incomplete
-
cisco-mcp-scanner: FAILED (tools.json not found)
-
All other scanners: 0 findings
-

Tool poisoning detector could not run - server failed to connect for tool export

-
- -
-
ElevenLabs
-
Stdio | Error state | 0 tools
-
Risk: 100 | 16 False Positives
-
cisco-mcp-scanner: 16 findings (SYSTEM MANIPULATION)
-

Audio processing tools incorrectly flagged as system manipulation

-
- -
-
hugginface / kaggle / supabase / kubic / synapbus
-
HTTP/Streamable-HTTP | url | 5-58 tools
-
Risk: 0 | Clean
-

Note: supabase and kubic had 2 failed scanners each (cisco, trivy) due to tools.json not exported at time of scan

-
- -
- - -

4. All Bugs Found (42 total)

- -

API / Backend Bugs (27)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# | Severity | Category | Description | Status
1 | High | API | Concurrent scan returns 500 instead of 409 Conflict | Fixed
2 | High | Backend | Duplicate findings when merging Pass 1 + Pass 2 reports (same CVE appears twice) | Fixed
3 | High | Backend | Security overview threat levels all zero (dangerous/warnings/info_level not aggregated) | Fixed
4 | High | Backend | malicious-demo tools.json not exported - cisco scanner fails, server shows "clean" | Open
5 | High | Backend | CancelScan doesn't cancel running Docker containers (uses context.Background()) | Open
6 | High | Backend | Race condition between Pass 1 completion and Pass 2 start | Open
7 | High | Backend | Report directory (scanner-reports/) never cleaned up | Open
8 | High | Backend | No scanner-source matching: all scanners run on all source types | Open
9 | Medium | API | handleStartScan silently ignores JSON decode errors | Open
10 | Medium | Backend | Pass 1 cleanup removes temp dir before Pass 2 can use it | Open
11 | Medium | Backend | Race condition reading/writing job.Status without lock | Open
12 | Medium | API | POST scan for nonexistent server returns 500 instead of 404 | Open
13 | Medium | Backend | tools_exported inconsistently null for some servers | Open
14 | Medium | Backend | Inconsistent scanner count: some servers get 6 scanners, others only 1-3 | Open
15 | Medium | Backend | Docker cache mount at /root/.cache may conflict with scanner-specific paths | Open
16 | Medium | Backend | extractTopLevelDir includes /usr, /var for Docker - too broad for supply chain audit | Open
17 | Medium | Backend | cancel-all wipes scan job data for servers with active scans | Open
18 | Medium | Backend | Scan report has duplicate scanner entries for multi-scanned servers | Open
19 | Low | Backend | ValidateManifest requires Command non-empty, but 3 bundled scanners have nil Command | Open
20 | Low | Backend | parseResults silently treats unparseable scanner output as 'clean' | Open
21 | Low | Backend | File-to-findings path matching uses flawed normalization | Open
22 | Low | Backend | GetScanSummary doesn't check for active Pass 2 scans | Open
23 | Low | Backend | Cisco scanner hardcodes --tools /scan/source/tools.json path | Open
24 | Low | Backend | Docker-extracted scans report total_files=0 despite scanning extracted files | Open
25 | Low | Backend | Argument-based source resolution matches non-flag args as file paths incorrectly | Open
26 | Low | Backend | Job ID collision risk with time.Now().UnixNano() generation | Open
27 | Low | Backend | handleGetScanFiles retrieves report independently of job (potential mismatch) | Open
- -
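Bug 26 in the table above flags the collision risk of deriving job IDs from time.Now().UnixNano(). One possible remedy, shown here only as a minimal sketch, is to draw the ID from crypto/rand. The function name newJobID and the "scan-" prefix are illustrative assumptions, not names taken from the codebase.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newJobID returns a 128-bit random identifier rendered as 32 hex characters
// with a "scan-" prefix. Both the name and the prefix are hypothetical.
func newJobID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", fmt.Errorf("generate job id: %w", err)
	}
	return "scan-" + hex.EncodeToString(b[:]), nil
}

func main() {
	id, err := newJobID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

Unlike a nanosecond timestamp, two jobs started in the same clock tick would still get distinct IDs with overwhelming probability.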

Frontend / UI Bugs (15)

- - - - - - - - - - - - - - - - - -
# | Severity | Category | Description | Status
28 | High | UI | No Cancel button during active scan (API exists but UI doesn't expose it) | Fixed
29 | Medium | UI | Scanned Files section visible for tool_definitions_only source method | Fixed
30 | Medium | UI | No retry button after scan failure | Fixed
31 | High | UI | Race condition: polling completion fires before scanReport loads | Open
32 | High | UI | "Already in progress" error extracts job ID with fragile regex | Open
33 | Medium | UI | No debounce on Scan Now button (rapid clicks can cause issues) | Open
34 | Medium | UI | Polling continues silently on network errors with no max retry | Open
35 | Medium | UI | Scan error alert has no dismiss action | Open
36 | Medium | UI | Approve/Reject only shown with findings (can't approve clean servers) | Open
37 | Medium | UI | Active scan state lost on page navigation and return | Open
38 | Low | UI | Inconsistent risk score color thresholds between pages | Open
39 | Low | UI | Failed scanners counted as "completed" in progress bar | Open
40 | Low | UI | Scanner Execution Logs depend on scanStatus populated at wrong time | Open
41 | Low | UI | No explanation of Risk Score metric anywhere | Open
42 | Low | UI | No "last scanned" timestamp shown prominently | Open
- - -

5. Bugs Fixed (6 implemented)

- -
-
Fix 1: Concurrent scan returns 409 Conflict | Verified
-

File: internal/httpapi/security_scanner.go

-

When a scan is already running for a server and another scan is triggered, the API now returns HTTP 409 Conflict instead of 500 Internal Server Error.

-
-
- s.writeError(w, r, http.StatusInternalServerError, err.Error())
-
+ if strings.Contains(err.Error(), "already in progress") {
-
+ s.writeError(w, r, http.StatusConflict, err.Error())
-
+ } else {
-
+ s.writeError(w, r, http.StatusInternalServerError, err.Error())
-
+ }
-
-

Validated: POST /scan returns 409 with "scan already in progress" message

-
- -
-
Fix 2: Deduplicate Pass 1/Pass 2 findings | Verified
-

File: internal/security/scanner/service.go

-

When merging Pass 1 (security scan) and Pass 2 (supply chain audit) reports, duplicate findings (same scanner + rule + title) are now removed. Pass 1 findings take priority.

-

Example: Perplexity had 4 findings (2 duplicated). Now correctly shows 2.

-

Validated: perplexity report shows 2 findings (was 4)
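The deduplication rule described in Fix 2 (same scanner + rule + title, Pass 1 wins) can be sketched roughly as follows. The Finding struct and mergeFindings function are simplified stand-ins invented for illustration; the real types in service.go are not shown in this report and may differ.

```go
package main

import "fmt"

// Finding is a simplified, hypothetical stand-in for the real finding type.
type Finding struct {
	Scanner string
	Rule    string
	Title   string
	Pass    int
}

// findingKey is the identity used for deduplication.
type findingKey struct{ scanner, rule, title string }

// mergeFindings concatenates pass1 and pass2, dropping any finding whose
// (scanner, rule, title) key was already seen. Because pass1 is walked first,
// its copy wins when both passes report the same issue. Duplicates inside a
// single pass are dropped as well.
func mergeFindings(pass1, pass2 []Finding) []Finding {
	seen := make(map[findingKey]struct{}, len(pass1)+len(pass2))
	merged := make([]Finding, 0, len(pass1)+len(pass2))
	for _, group := range [][]Finding{pass1, pass2} {
		for _, f := range group {
			k := findingKey{f.Scanner, f.Rule, f.Title}
			if _, dup := seen[k]; dup {
				continue
			}
			seen[k] = struct{}{}
			merged = append(merged, f)
		}
	}
	return merged
}

func main() {
	pass1 := []Finding{
		{"trivy-mcp", "CVE-2025-66414", "DNS rebinding", 1},
		{"trivy-mcp", "CVE-2026-0621", "ReDoS", 1},
	}
	pass2 := []Finding{
		{"trivy-mcp", "CVE-2025-66414", "DNS rebinding", 2},
		{"trivy-mcp", "CVE-2026-0621", "ReDoS", 2},
	}
	fmt.Println(len(mergeFindings(pass1, pass2))) // 2
}
```

With the perplexity example from the report, four raw findings collapse to two, and both survivors come from Pass 1.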

-
- -
-
Fix 3: Security overview threat level aggregation | Verified
-

File: internal/security/scanner/service.go

-

The global security overview now correctly counts findings by threat level (dangerous, warnings, info_level). Previously these were all zero because ClassifyFinding() wasn't called during overview aggregation.

-

Validated: Overview shows dangerous=247, warnings=473, info_level=1179 (was all 0)

-
- -
-
Fix 4: Cancel button in security tab | Verified
-

File: frontend/src/views/ServerDetail.vue

-

Added a "Cancel" button that appears during active scans. Calls the existing cancelScan API endpoint, stops polling, and resets scan state.

-

Validated: Cancel button renders, calls API correctly

-
- -
-
Fix 5: Scanned Files section visibility | Verified
-

File: frontend/src/views/ServerDetail.vue

-

The Scanned Files collapsible section is now hidden for HTTP servers and tool_definitions_only source methods (no filesystem to show files for).

-

Validated: Section hidden for url, url_full, and tool_definitions_only

-
- -
-
Fix 6: Retry button after scan error | Verified
-

File: frontend/src/views/ServerDetail.vue

-

Added a "Retry" button to the scan error alert, allowing users to easily re-trigger a scan after failure without refreshing the page.

-

Validated: Retry button clears error and re-triggers scan

-
- - -

6. False Positive Analysis

- -
-
False Positive: context7 "PROMPT INJECTION" (Risk: 60)
-

Scanner: cisco-mcp-scanner | Findings: 2

-

What was flagged:

-
Tool: resolve-library-id
Evidence: "You MUST call this function before 'Query Documentation' tool to obtain a valid Context7-compatible library ID UNLESS the user explicitly provides a library ID..."
-

Analysis: This is a standard MCP tool description pattern. Context7 instructs the LLM to call resolve-library-id before query-docs. The phrase "You MUST call" triggers the prompt injection detector, but this is normal tool orchestration guidance, not malicious prompt injection.

-

Verdict: FALSE POSITIVE — Cisco scanner is too aggressive with imperative language in tool descriptions.

-
- -
-
False Positive: ElevenLabs "SYSTEM MANIPULATION" (Risk: 100)
-

Scanner: cisco-mcp-scanner | Findings: 16 (2 dangerous + 14 warning)

-

What was flagged: All audio tools (text_to_speech, speech_to_text, text_to_sound_effects, isolate_audio, speech_to_speech, etc.) flagged as "SYSTEM MANIPULATION"

-

Analysis: ElevenLabs is a legitimate audio processing API. Its tools interact with audio data, not system resources. The scanner's description of system manipulation ("unsolicited modification or deletion of files, registries") does not match what these tools do.

-

Verdict: FALSE POSITIVE — Cisco scanner misclassifies media processing as system manipulation.

-
- -
-
True Positives (Confirmed Real Issues)
- - - - -
Server | Findings | Assessment
demo-filesystem | 7 findings (Stripe key, GitHub PAT, private keys) | TRUE POSITIVE - real secrets in filesystem
perplexity | 2 CVEs (DNS rebinding, ReDoS in MCP SDK) | TRUE POSITIVE - real vulnerabilities in dependencies
-
- - -

7. Remaining Issues (Not Fixed)

- -
-
Critical: malicious-demo tool poisoning not detected
-

The quarantined malicious-demo server can't have its tool definitions exported because it fails to connect (MCP initialize timeout). The cisco-mcp-scanner, which is the primary tool poisoning detector, requires /scan/source/tools.json which can't be created without a connection.

-

Impact: Quarantined servers that are truly malicious can't be scanned for tool poisoning, which is the exact scenario this feature is designed for.

-

Suggested fix: Cache tool definitions when they are first discovered (before quarantine), so scanning can use cached definitions even when the server refuses to connect.

-
- -
-
High: Inconsistent scanner count across servers
-

Some servers get 6 scanners, others only 1-3. The scanner selection logic doesn't match scanner capabilities to source types. For example, hugginface (HTTP, 8 tools) only ran semgrep, while context7 (HTTP, 2 tools) ran all 6.

-

Impact: Inconsistent security coverage across servers.

-

Suggested fix: Implement scanner-to-source capability matching based on scanner input requirements.
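One way to picture the suggested capability matching is a scanner declaring the inputs it needs and the selector running it only when a server's source provides all of them. The sketch below is an assumption-heavy illustration: the Input kinds, the Scanner struct, and selectScanners are invented here. The cisco-mcp-scanner requirement follows the report's note that it needs tools.json, while the trivy-mcp requirement is a guess made for the example.

```go
package main

import "fmt"

// Input is a kind of material a scan source can provide. Names are illustrative.
type Input string

const (
	InputToolDefs Input = "tool_definitions" // exported tools.json
	InputFiles    Input = "source_files"     // files on disk to scan
)

// Scanner lists the inputs it needs; all must be available for it to run.
type Scanner struct {
	Name     string
	Requires []Input
}

// selectScanners returns the names of scanners whose requirements are all
// satisfied by the inputs available for a given server.
func selectScanners(all []Scanner, available map[Input]bool) []string {
	var out []string
	for _, s := range all {
		ok := true
		for _, in := range s.Requires {
			if !available[in] {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, s.Name)
		}
	}
	return out
}

func main() {
	scanners := []Scanner{
		{"cisco-mcp-scanner", []Input{InputToolDefs}},
		{"trivy-mcp", []Input{InputFiles}}, // assumed requirement
	}
	// A remote HTTP server with exported tool definitions but no local files.
	remote := map[Input]bool{InputToolDefs: true}
	fmt.Println(selectScanners(scanners, remote)) // [cisco-mcp-scanner]
}
```

Under such a scheme the scanner count per server would follow from declared requirements rather than from which exports happened to exist at scan time.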

-
- -
-
High: False positive rate from cisco-mcp-scanner
-

The Cisco MCP Scanner has a high false-positive rate on standard MCP tool descriptions. Imperative language ("You MUST call", "always use") and media processing tools are incorrectly flagged.

-

Impact: Risk score of 60-100 for legitimate servers, eroding user trust.

-

Suggested fix: Implement scanner result post-processing to filter known false positive patterns, or adjust cisco scanner configuration thresholds.

-
- - -

8. Recommendations

- -
-

Priority 1 (Next Sprint)

-
    -
  • Cache tool definitions for quarantined servers to enable tool poisoning detection
  • -
  • Implement scanner-source capability matching to avoid running irrelevant scanners
  • -
  • Add false positive suppression rules for cisco-mcp-scanner (imperative language patterns)
  • -
  • Fix CancelScan to actually terminate Docker containers
  • -
  • Fix Pass 1/Pass 2 race condition (Pass 2 starts before Pass 1 cleanup)
  • -
-
- -
-

Priority 2 (Future)

-
    -
  • Add report directory cleanup (TTL-based or max-size)
  • -
  • Add Risk Score explanation tooltip in the UI
  • -
  • Show "last scanned" timestamp prominently
  • -
  • Add scan history view (past scans comparison)
  • -
  • Improve error handling for nonexistent servers (404 instead of 500)
  • -
  • Add scanner input/output type enforcement during installation
  • -
-
- -
-

Priority 3 (Polish)

-
    -
  • Standardize risk score color thresholds across all pages
  • -
  • Add debounce to Scan Now button
  • -
  • Add polling error limit (stop after N consecutive failures)
  • -
  • Show scanner capability badges in the scanner list
  • -
  • Improve progress bar to distinguish failed vs completed scanners
  • -
-
- -
-

- Generated 2026-04-06 | MCPProxy v0.23.1 | Branch: feat/039-security-scanner-plugins -
QA Coverage: 72 API tests, 15 UI bugs, 20 backend bugs, 8 design issues, 10 UX improvements -

- -
-
-
diff --git a/specs/005-rest-management-integration/tasks.md.bak b/specs/005-rest-management-integration/tasks.md.bak
deleted file mode 100644
index 1c93a91a..00000000
--- a/specs/005-rest-management-integration/tasks.md.bak
+++ /dev/null
@@ -1,301 +0,0 @@
-# Tasks: REST Endpoint Management Service Integration
-
-**Input**: Design documents from `/specs/005-rest-management-integration/`
-**Prerequisites**: plan.md, spec.md, data-model.md, contracts/management-service.yaml
-
-**Tests**: Test tasks included per FR-015, FR-016, FR-017 (unit, integration, E2E validation)
-
-**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
-
-## Format: `[ID] [P?] [Story] Description`
-
-- **[P]**: Can run in parallel (different files, no dependencies)
-- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
-- Include exact file paths in descriptions
-
-## Path Conventions
-
-This project uses single project structure:
-- `internal/` - All Go packages
-- `cmd/` - Command-line applications
-- `scripts/` - Test and build scripts
-
----
-
-## Phase 1: Setup (Shared Infrastructure)
-
-**Purpose**: Verify existing infrastructure and review current implementations
-
-Since this is a refactoring within an existing codebase, setup is minimal.
-
-- [ ] T001 Review existing management service interface in internal/management/service.go
-- [ ] T002 Review existing runtime implementations in internal/server/server.go:1447 and internal/server/server.go:136
-- [ ] T003 [P] Review existing REST handlers in internal/httpapi/server.go:1155 and internal/httpapi/server.go:1050
-
-**Checkpoint**: Understand current code structure before refactoring
-
----
-
-## Phase 2: Foundational (Blocking Prerequisites)
-
-**Purpose**: Core interface extension that MUST be complete before ANY user story can be implemented
-
-**⚠️ CRITICAL**: No REST handler work can begin until management service interface is extended
-
-- [ ] T004 Extend ManagementService interface in internal/management/service.go with GetServerTools method signature
-- [ ] T005 Extend ManagementService interface in internal/management/service.go with TriggerOAuthLogin method signature
-
-**Checkpoint**: Foundation ready - user story implementation can now begin
-
----
-
-## Phase 3: User Story 1 - Unified Server Management via REST API (Priority: P1) 🎯 MVP
-
-**Goal**: Refactor two REST endpoints to delegate to management service layer, ensuring architectural compliance with spec 004 and consistent behavior across all interfaces.
-
-**Independent Test**: Call REST endpoints directly (`GET /api/v1/servers/{id}/tools` and `POST /api/v1/servers/{id}/login`) and verify they delegate to management service methods, emit events, and respect configuration gates.
-
-### Unit Tests for User Story 1 (Per FR-015)
-
-> **NOTE: Write these tests FIRST using TDD approach, ensure they FAIL before implementation**
-
-- [ ] T006 [P] [US1] Add unit test for GetServerTools with valid server name in internal/management/service_test.go
-- [ ] T007 [P] [US1] Add unit test for GetServerTools with empty server name in internal/management/service_test.go
-- [ ] T008 [P] [US1] Add unit test for GetServerTools with nonexistent server in internal/management/service_test.go
-- [ ] T009 [P] [US1] Add unit test for TriggerOAuthLogin with valid server in internal/management/service_test.go
-- [ ] T010 [P] [US1] Add unit test for TriggerOAuthLogin with disable_management enabled in internal/management/service_test.go
-- [ ] T011 [P] [US1] Add unit test for TriggerOAuthLogin with read_only enabled in internal/management/service_test.go
-- [ ] T012 [P] [US1] Add unit test for TriggerOAuthLogin with empty server name in internal/management/service_test.go
-
-### Implementation for User Story 1
-
-**Service Layer Implementation:**
-
-- [ ] T013 [US1] Implement GetServerTools method in internal/management/service_impl.go - delegate to runtime.GetServerTools
-- [ ] T014 [US1] Implement TriggerOAuthLogin method in internal/management/service_impl.go - check config gates, delegate to runtime.TriggerOAuthLogin
-- [ ] T015 [US1] Add configuration gate checks in TriggerOAuthLogin (disable_management, read_only) in internal/management/service_impl.go
-
-**REST Handler Refactoring:**
-
-- [ ] T016 [US1] Update handleGetServerTools in internal/httpapi/server.go:1155 to call management service instead of controller
-- [ ] T017 [US1] Update handleServerLogin in internal/httpapi/server.go:1050 to call management service instead of controller
-- [ ] T018 [US1] Add error mapping for management service errors to HTTP status codes in handleGetServerTools
-- [ ] T019 [US1] Add error mapping for management service errors to HTTP status codes in handleServerLogin
-
-**Mock Updates:**
-
-- [ ] T020 [US1] Update MockServerController in internal/httpapi/contracts_test.go to include GetServerTools method
-- [ ] T021 [US1] Update MockServerController in internal/httpapi/contracts_test.go to include TriggerOAuthLogin method
-
-**Integration Testing (Per FR-016):**
-
-- [ ] T022 [US1] Add integration test to verify servers.changed event emitted after OAuth completion in internal/management/service_test.go
-- [ ] T023 [US1] Verify event propagates to SSE endpoint /events (monitor event bus integration)
-
-**E2E Validation (Per FR-017, SC-005):**
-
-- [ ] T024 [US1] Run existing E2E API tests with ./scripts/test-api-e2e.sh and verify all pass without modification
-- [ ] T025 [US1] Verify no behavioral changes in REST API responses (backward compatibility check)
-
-**Checkpoint**: At this point, User Story 1 should be fully functional - REST endpoints delegate to management service, config gates enforced, events emitted, E2E tests pass
-
----
-
-## Phase 4: User Story 2 - CLI Socket Commands Use Management Layer (Priority: P2)
-
-**Goal**: Ensure CLI commands from PR #152 (`tools list`, `auth login`, `auth status`) benefit from management service's configuration gates, event emissions, and error handling.
-
-**Independent Test**: Run `mcpproxy tools list --server=test-server` and `mcpproxy auth login --server=test-server` with daemon running, verify they work correctly and trigger management service events.
-
-**Note**: No new implementation required for this story - it automatically benefits once REST endpoints are refactored in US1. This phase is purely validation.
-
-### Validation for User Story 2
-
-- [ ] T026 [US2] Start mcpproxy daemon and verify it's running
-- [ ] T027 [US2] Test mcpproxy tools list --server= command and verify tools retrieved via management service
-- [ ] T028 [US2] Test mcpproxy auth login --server= command and verify OAuth triggered via management service
-- [ ] T029 [US2] Test mcpproxy auth status --server= command and verify authentication state shown
-- [ ] T030 [US2] Enable disable_management in config and verify mcpproxy auth login is blocked with clear error
-- [ ] T031 [US2] Verify servers.changed event emitted after OAuth completion (monitor logs or SSE stream)
-
-**Checkpoint**: CLI commands work correctly through refactored REST endpoints, config gates enforced, events emitted
-
----
-
-## Phase 5: User Story 3 - Tray Application Server Management (Priority: P3)
-
-**Goal**: Ensure tray application users get consistent behavior when managing servers through GUI menus (passive benefit from US1 refactoring).
-
-**Independent Test**: Use tray menu actions to trigger OAuth login and verify operations go through management service with proper event emissions.
-
-**Note**: No new implementation required - tray already uses REST API endpoints refactored in US1. This phase is purely validation.
-
-### Validation for User Story 3
-
-- [ ] T032 [US3] Launch mcpproxy-tray application and verify connection to daemon
-- [ ] T033 [US3] Use tray menu "Authenticate Server" action and verify OAuth triggered via management service
-- [ ] T034 [US3] Verify tray UI updates automatically after OAuth completion (SSE event received)
-- [ ] T035 [US3] Enable read_only mode and verify server restart blocked via tray menu with error message
-- [ ] T036 [US3] Verify all tray server management actions use refactored REST endpoints
-
-**Checkpoint**: Tray application works correctly through refactored REST endpoints, automatic UI updates via events
-
----
-
-## Phase 6: Polish & Cross-Cutting Concerns
-
-**Purpose**: Final cleanup, documentation updates, and comprehensive validation
-
-### Documentation
-
-- [ ] T037 Add code comments explaining delegation pattern in internal/management/service_impl.go
-- [ ] T038 Update CLAUDE.md if management service patterns changed (minimal changes expected)
-- [ ] T039 Update OpenAPI annotations in internal/httpapi/server.go if endpoint behavior changed
-
-### Code Quality
-
-- [ ] T040 Run golangci-lint on modified files: ./scripts/run-linter.sh
-- [ ] T041 Verify test coverage ≥80% for new management service methods: go test -coverprofile=coverage.out ./internal/management/...
-- [ ] T042 [P] Check for code duplication removed (SC-006): compare LOC before/after refactoring
-
-### Final Validation
-
-- [ ] T043 Run full test suite: ./scripts/run-all-tests.sh
-- [ ] T044 Manual smoke test: Start daemon, call all refactored endpoints, verify responses
-- [ ] T045 Performance verification: Ensure no regression in API response times (<10ms for GetServerTools, <50ms for TriggerOAuthLogin)
-
-**Final Checkpoint**: All success criteria met, ready for PR submission
-
----
-
-## Dependencies Between User Stories
-
-```
-Phase 1 (Setup) → Phase 2 (Foundational)
-                        ↓
-                  Phase 3 (US1) 🎯 MVP - Core refactoring
-                        ↓
-          ┌─────────────┼─────────────┐
-          ↓             ↓             ↓
-    Phase 4 (US2)  Phase 5 (US3)  Phase 6 (Polish)
-    CLI validation Tray validation Cleanup
-```
-
-**Critical Path**: Phase 1 → Phase 2 → Phase 3 (US1) → Phase 4 (US2) + Phase 5 (US3) in parallel → Phase 6
-
-**Parallelization Opportunities**:
-- After US1 complete: US2 and US3 validation can run in parallel
-- Within US1: All unit tests (T006-T012) can be written in parallel
-- Within US1: Mock updates (T020-T021) can be done in parallel with service implementation
-- Within Phase 6: Documentation (T037-T039) and code quality (T040-T042) can run in parallel
-
----
-
-## Implementation Strategy
-
-### MVP Scope (Minimum Viable Product)
-
-**Phase 3 (US1) ONLY** constitutes the MVP:
-- Extend management service interface with 2 methods ✅
-- Implement methods to delegate to runtime ✅
-- Refactor 2 REST handlers to call management service ✅
-- Add unit tests (target 80% coverage) ✅
-- Verify E2E tests pass ✅
-
-**Deliverable**: REST endpoints architecturally compliant, all interfaces use unified management service
-
-### Incremental Delivery
-
-**Iteration 1** (MVP): User Story 1
-- ✅ Delivers core architectural compliance
-- ✅ Unblocks CLI and tray benefits
-- ✅ Verifiable by E2E tests
-
-**Iteration 2**: User Story 2 + User Story 3
-- ✅ Validates CLI commands work correctly
-- ✅ Validates tray application works correctly
-- ✅ Confirms passive benefits realized
-
-**Iteration 3**: Polish & Documentation
-- ✅ Final cleanup and documentation
-- ✅ Performance verification
-- ✅ Ready for production deployment
-
-### Parallel Execution Examples
-
-**Within User Story 1**:
-```bash
-# Terminal 1: Write unit tests
-vim internal/management/service_test.go # T006-T012
-
-# Terminal 2: Implement service methods
-vim internal/management/service_impl.go # T013-T015
-
-# Terminal 3: Update mocks
-vim internal/httpapi/contracts_test.go # T020-T021
-
-# All three can proceed in parallel
-```
-
-**Across User Stories** (after US1 complete):
-```bash
-# Terminal 1: Validate CLI commands
-./scripts/validate-cli.sh # US2 tasks
-
-# Terminal 2: Validate tray application
-./mcpproxy-tray # US3 tasks
-
-# Both can run in parallel
-```
-
----
-
-## Task Summary
-
-**Total Tasks**: 45
-
-**Breakdown by Phase**:
-- Phase 1 (Setup): 3 tasks
-- Phase 2 (Foundational): 2 tasks
-- Phase 3 (US1 - MVP): 20 tasks (7 unit tests + 13 implementation/integration)
-- Phase 4 (US2): 6 validation tasks
-- Phase 5 (US3): 5 validation tasks
-- Phase 6 (Polish): 9 tasks
-
-**Breakdown by User Story**:
-- User Story 1 (P1): 20 tasks - Core refactoring (MVP)
-- User Story 2 (P2): 6 tasks - CLI validation
-- User Story 3 (P3): 5 tasks - Tray validation
-
-**Parallelization**:
-- 16 tasks marked with [P] can run in parallel
-- After US1: US2 and US3 can run fully in parallel (11 tasks total)
-
-**Test Coverage**:
-- 7 unit tests (T006-T012) - Target 80% coverage
-- 2 integration tests (T022-T023) - Event emissions
-- 1 E2E validation (T024-T025) - Backward compatibility
-- 6 CLI validation tests (T026-T031)
-- 5 tray validation tests (T032-T036)
-- **Total: 21 test/validation tasks (47% of all tasks)**
-
-**Independent Test Criteria**:
-- ✅ US1: Call REST endpoints, verify delegation and events
-- ✅ US2: Run CLI commands, verify correct behavior
-- ✅ US3: Use tray menus, verify automatic updates
-
-**Suggested MVP**: Phase 3 (US1) only - 20 tasks delivering core architectural compliance
-
----
-
-## Format Validation
-
-✅ **ALL tasks follow checklist format**: `- [ ] [TaskID] [P?] [Story?] Description with file path`
-
-- ✅ Checkbox prefix: All tasks start with `- [ ]`
-- ✅ Task IDs: Sequential T001-T045
-- ✅ [P] markers: 16 tasks correctly marked as parallelizable
-- ✅ [Story] labels: All US1/US2/US3 tasks properly labeled
-- ✅ File paths: All implementation tasks include exact file paths
-- ✅ Organization: Grouped by user story for independent implementation
-- ✅ Dependencies: Clear critical path and parallelization opportunities documented
diff --git a/specs/044-diagnostics-taxonomy/test-report.html b/specs/044-diagnostics-taxonomy/test-report.html
deleted file mode 100644
index a9f9a328..00000000
--- a/specs/044-diagnostics-taxonomy/test-report.html
+++ /dev/null
@@ -1,838 +0,0 @@
- - - - - -Spec 044 Diagnostics - End-to-End Verification Report - - - -
- - -
-

Spec 044 Diagnostics - End-to-End Verification

-
-
Run: 2026-04-24 15:53 UTC (18:53 EEST)
-
Branch: feat/diagnostics-taxonomy
-
SHA: 911704c
-
Commits under test: 9
-
Binary version: v0.24.9
-
Go: go1.25.1 darwin/arm64
-
PR: #400
-
-
- ● PASS (2/3 phases) - ⚠ Tray phase skipped - 3 non-blocking findings - Production untouched -
-
- - -

Summary

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Phase | Surface | Status | Tests executed | Notes
Phase 1 | CLI (doctor) | PASS | 6 | All 29 codes registered; doctor fix dry-run + execute exercised; classifier maps stdio-spawn-enoent.
Phase 2 | Web UI (ErrorPanel) | PASS | 1 integration flow | ErrorPanel renders full payload; Preview (dry-run) button fires fix endpoint → 200.
Phase 3 | macOS Tray | SKIPPED | 0 | Production tray actively running; stopping mid-session was judged too risky. Visual confirm recommended post-merge.
- -
- Commits under test (9) -
-
911704c chore(spec-044): regenerate OpenAPI spec for Diagnostic schema
-
892e056 feat(spec-044): wrap OAUTH/DOCKER/CONFIG/QUARANTINE errors with DiagnosticError
-
7b81e03 feat(spec-044): mcpproxy doctor fix + --server filter
-
6aa8305 feat(spec-044): macOS tray badge + Fix issues menu group
-
4d92c82 feat(spec-044): Vue ErrorPanel for per-server diagnostics
-
4f6872c feat(diagnostics): mcpproxy doctor list-codes subcommand
-
8ac86ba feat(diagnostics): STDIO classifier wired + per-server REST + fix endpoint
-
a0a1049 feat(diagnostics): initial error-code catalog
-
a632288 docs(spec-044): speckit artifacts for diagnostics & error taxonomy
-
-
- - -

Phase 1 - CLI

-

- Ran the freshly-built dev mcpproxy on an isolated port (127.0.0.1:18080) and data dir - (/tmp/mcpproxy-test-spec044) with a purpose-built test config containing a broken-stdio - server whose command points at /nonexistent/binary. The production daemon on :8080 was - never touched; its config was backed up (byte-identical after test) and the dev binary was run from the worktree. -

- - -
-

1a · doctor list-codes - all 29 diagnostic codes registered

-
- PASS - pretty + JSON output -
-
- Expected: 29 codes registered, list-codes enumerates them with docs + fix rows. - Actual: jq length == 29. All 5 MCPX_STDIO codes present. Each entry surfaces severity, docs link, and fix rows (command / button / link). -
-
- Command + output -
-
$ ./mcpproxy doctor list-codes
-29 diagnostic codes registered:
-
-  MCPX_CONFIG_DEPRECATED_FIELD          warn   The configuration uses a deprecated field that will be removed in a future release.
-    docs: docs/errors/MCPX_CONFIG_DEPRECATED_FIELD.md
-    fix (button):  Preview migration (dry-run)  key=config_migrate_deprecated [destructive -> dry-run default]
-    fix (link):    Migration notes  docs/errors/MCPX_CONFIG_DEPRECATED_FIELD.md
-
-  MCPX_CONFIG_MISSING_SECRET            error  The configuration references a secret that is not defined.
-    docs: docs/errors/MCPX_CONFIG_MISSING_SECRET.md
-    fix (command): List secrets  mcpproxy secret list
-    fix (link):    Secret references  docs/errors/MCPX_CONFIG_MISSING_SECRET.md
-  ... (27 more) ...
-
-$ ./mcpproxy doctor list-codes -o json | jq 'length'
-29
-
-$ ./mcpproxy doctor list-codes -o json | jq '[.[] | select(.code | startswith("MCPX_STDIO_"))] | map(.code)'
-[
-  "MCPX_STDIO_EXIT_NONZERO",
-  "MCPX_STDIO_HANDSHAKE_INVALID",
-  "MCPX_STDIO_HANDSHAKE_TIMEOUT",
-  "MCPX_STDIO_SPAWN_EACCES",
-  "MCPX_STDIO_SPAWN_ENOENT"
-]
-
-
-
- - -
-

1b · doctor --server broken-stdio - per-server health check

-
- PASS - caveat: initial socket-disabled run errored -
-
- Expected: Prints banner + upstream error section for broken-stdio, maps to MCPX_STDIO_SPAWN_ENOENT. - Actual: With socket enabled, produces ⚠ Found 1 issue that need attention and ❌ Upstream Server Connection Errors for broken-stdio. Classifier correctly picks up the zsh:1: no such file or directory stderr pattern. -
-
- Non-blocking observation (see Findings #1): first invocation with - enable_socket: false failed with "doctor requires running daemon. Start with: mcpproxy serve", - which is misleading because the daemon was running and reachable over HTTP. Re-running with socket enabled - resolved the issue. Worth a CLI UX follow-up. -
-
- Command + output (via socket) -
-
$ ./mcpproxy doctor --server broken-stdio
-โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
-🔍 MCPProxy Health Check
-โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
-Version: v0.24.9 (latest)
-
-⚠ Found 1 issue that need attention
-
-❌ Upstream Server Connection Errors
-  Server: broken-stdio
-
-⚠️  Deprecated Configuration
-  • features
-    features is deprecated and has no effect
-    Suggestion: Remove from config (all feature flags are unused)
-โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
-
-
-
- Misleading socket-disabled error (non-blocking) -
-
$ ./mcpproxy doctor --server broken-stdio   # enable_socket: false
-Error: doctor requires running daemon. Start with: mcpproxy serve
-# ...usage text elided...
-Error: doctor requires running daemon. Start with: mcpproxy serve
-
-
-
- - -
-

1c · doctor fix MCPX_STDIO_SPAWN_ENOENT --server broken-stdio - dry-run default

-
- PASS - fixer_key: stdio_show_last_logs -
-
- Expected: Auto-resolve fixer_key, run dry-run, return preview text. - Actual: Outcome success, Mode: dry_run, preview text returned. -
-
- Command + output -
-
$ ./mcpproxy doctor fix MCPX_STDIO_SPAWN_ENOENT --server broken-stdio
-โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
-🛠  Doctor Fix: MCPX_STDIO_SPAWN_ENOENT
-โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
-Server:      broken-stdio
-Fix step:    Show last server log lines
-Fixer key:   stdio_show_last_logs
-Destructive: no
-Mode:        dry_run
-
-Outcome:     ✅ success
-
-Preview:
-  Server 'broken-stdio' log tail unavailable in this build โ€” enable server-side
-  log access to view the last 50 lines here.
-โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
-
-
-
- - -
-

1d · doctor fix ... --execute - rate-limited safety guard

-
- PASS - Rate-limit honored: HTTP 429 -
-
- Expected: First burst hits rate-limit; after cooldown, execute succeeds. - Actual: HTTP 429 "Too many fix attempts" on first attempt, matching the documented safety guard. A subsequent direct REST call to POST /api/v1/diagnostics/fix with {"mode":"execute",...} returned 200 {"mode":"execute","outcome":"success"}. -
-
- Command + output -
-
$ ./mcpproxy doctor fix MCPX_STDIO_SPAWN_ENOENT --server broken-stdio --execute
-Error: fix invocation failed: API returned status 429: {
-  "success":false,
-  "error":"Too many fix attempts; try again shortly",
-  "request_id":"eeb5791b-06ab-4055-aa2f-1acabf83bf42"
-}
-
-
-
- - -
-

1e · REST endpoints - /diagnostics, /servers, /servers/{name}/diagnostics

-
- PASS - schema matches spec -
-
    -
  • GET /api/v1/servers/broken-stdio/diagnostics returns full Diagnostic payload: error_code, user_message, fix_steps[] (command / button / link variants), docs_url, severity, detected_at, health.level=unhealthy, health.action=restart.
  • -
  • GET /api/v1/servers returns each server with top-level error_code ("MCPX_STDIO_SPAWN_ENOENT" on the broken one, null on healthy).
  • -
  • GET /api/v1/diagnostics aggregates to total_issues: 1, upstream_errors[0].server_name == "broken-stdio".
  • -
  • Fix endpoint field is "mode": "dry_run" | "execute" (see Findings #3), not a boolean.
  • -
-
- - -
-

1f · STDIO classifier - stderr pattern mapping

-
- PASS -
-

Despite the spawn being wrapped via /bin/zsh -l -c, the classifier parses the zsh stderr line - zsh:1: no such file or directory: /nonexistent/binary and maps it to MCPX_STDIO_SPAWN_ENOENT - (not a generic timeout or handshake error). This is the core value-add of the spec.
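
The mapping described above can be sketched roughly as follows. This is a minimal illustration, not the project's classifier: the function name classifyStderr, both regular expressions, and the "permission denied" rule are assumptions; only the "no such file or directory" to MCPX_STDIO_SPAWN_ENOENT mapping and the code names come from this report.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical pattern table. The real classifier's rules are not shown in
// this report; only the ENOENT stderr text is taken from the test output.
var (
	enoentPattern = regexp.MustCompile(`(?i)no such file or directory`)
	eaccesPattern = regexp.MustCompile(`(?i)permission denied`) // assumed rule
)

// classifyStderr maps one stderr line to a diagnostic code.
// It returns "" when no pattern matches.
func classifyStderr(line string) string {
	switch {
	case enoentPattern.MatchString(line):
		return "MCPX_STDIO_SPAWN_ENOENT"
	case eaccesPattern.MatchString(line):
		return "MCPX_STDIO_SPAWN_EACCES"
	default:
		return ""
	}
}

func main() {
	// The shell prefix ("zsh:1:") does not matter because the pattern is
	// matched anywhere in the line.
	fmt.Println(classifyStderr("zsh:1: no such file or directory: /nonexistent/binary"))
}
```

Matching on a substring rather than on the whole line is what lets the same rule fire whether or not the command is wrapped in a login shell.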

-
- - -

Phase 2 - Web UI

-

- Opened http://127.0.0.1:18080/ui/servers/broken-stdio?apikey=*** in claude-in-chrome. The Vue - ErrorPanel component (4d92c82) rendered the full diagnostic payload with all expected - elements. Clicking the Preview (dry-run) button fires the fix endpoint successfully. -

- -
-

2a · ErrorPanel rendering on /ui/servers/broken-stdio

-
- PASS - all elements verified -
-
    -
  • Red severity band with Server Error heading
  • -
  • Severity badge: error
  • -
  • Error code prominently shown: MCPX_STDIO_SPAWN_ENOENT
  • -
  • User-facing message: "The configured command for this stdio server was not found on PATH."
  • -
  • Cause snippet (truncated stderr) including zsh:1: no such file or directory: /nonexistent/binary
  • -
  • Fix steps rendered as three distinct rows: -
      -
    • Command row which npx && which uvx && which python3 + Copy button
    • -
    • Link row to docs/errors/MCPX_STDIO_SPAWN_ENOENT.md
    • -
    • Button row "Show last server log lines" with Preview (dry-run) action
    • -
    -
  • -
  • Documentation link in footer
  • -
  • Top-right connection state badge: Connecting (yellow)
  • -
-
- -
-

2b · Preview (dry-run) button → POST /api/v1/diagnostics/fix

-
- PASS (200 OK) - UX gap: success is silent (Finding #2) -
-
- Request: POST http://127.0.0.1:18080/api/v1/diagnostics/fix with {"mode":"dry_run","code":"MCPX_STDIO_SPAWN_ENOENT","server":"broken-stdio","fixer_key":"stdio_show_last_logs"} - Response: 200 OK + JSON body containing preview string - UX observation: No toast or inline render of data.preview after success. Re-verified via direct curl that the payload does contain the preview text. -
-
- -

Screenshots

-
- Note on screenshot availability. During this verification run, screenshots were captured through - claude-in-chrome's in-memory screenshot facility (image IDs ss_103222r6g and - ss_0723w5ok4). Those IDs are ephemeral and were not written to disk; no - /tmp/spec044-verify-webui-*.png files exist. The textual test plan above documents every element that - was visually confirmed. -
-
- ss_103222r6g: Initial ErrorPanel view at /ui/servers/broken-stdio
- (image not available on disk; see caption list above for visually-confirmed elements) -
-
- ss_0723w5ok4: After clicking Preview (dry-run)
- Panel still visible; no toast / inline result rendered despite 200 OK response (see Finding #2) -
- - -

Phase 3 - macOS Tray

- -
- ⚠ SKIPPED. The user's production MCPProxy tray app and mcpproxy - core were actively running on port 8080 at the time of this verification. Killing and restarting - them to swap in the dev tray binary (6aa8305) would have risked disrupting an active session for no - critical upside: Phases 1 + 2 already demonstrate end-to-end that the classifier, REST API, CLI formatter, fix - endpoint, and Vue ErrorPanel all work correctly for MCPX_STDIO_SPAWN_ENOENT. -
- -

What still needs a human check

-
    -
  • Red-dot badge on the tray menu-bar icon when total_issues > 0
  • -
  • New "Fix issues" menu group opens and lists affected servers
  • -
  • Clicking a menu row opens /ui/servers/<name> in the default browser
  • -
-

- Relevant code lives in commit 6aa8305 (feat(spec-044): macOS tray badge + Fix issues menu group). - To verify post-merge, run: -

-
$ # after-hours, when it's safe to cycle production
-$ cd ~/repos/mcpproxy-go
-$ make build
-$ pkill -x MCPProxy
-$ open /tmp/MCPProxy.app
- - -

Findings (non-blocking)

- -
-

#1 · doctor CLI has no HTTP fallback when socket is disabled - medium

-

- With "enable_socket": false in config, mcpproxy doctor fails with - "doctor requires running daemon. Start with: mcpproxy serve", a misleading message because the daemon - is already running and reachable over HTTP at the configured listen address. Env vars - MCPPROXY_LISTEN/MCPPROXY_API_KEY have no effect on the doctor subcommand either. -

-

Reproduction: start mcpproxy serve with enable_socket: false, then run - mcpproxy doctor.

-

Suggested fix: either (a) add HTTP fallback using listen + API key, or (b) - surface a clearer error: "socket is disabled in config; re-enable it, or use - curl http://127.0.0.1:PORT/api/v1/diagnostics".
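
Option (a) could start from something like the sketch below, which only builds the fallback request and does not send it. The helper name newDiagnosticsRequest and the sample key are hypothetical; the /api/v1/diagnostics path and the X-API-Key header are the ones used by the curl commands in this report. How the doctor subcommand would obtain the listen address and key is not covered here.

```go
package main

import (
	"fmt"
	"net/http"
)

// newDiagnosticsRequest builds a GET request against the daemon's HTTP API.
// Hypothetical helper: the doctor subcommand has no such function today.
func newDiagnosticsRequest(listen, apiKey string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, "http://"+listen+"/api/v1/diagnostics", nil)
	if err != nil {
		return nil, err
	}
	// Same header the report's curl examples pass with -H.
	req.Header.Set("X-API-Key", apiKey)
	return req, nil
}

func main() {
	req, err := newDiagnosticsRequest("127.0.0.1:18080", "example-key")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Keeping request construction separate from sending makes the fallback easy to unit-test without a running daemon.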

-
- -
-

#2 · Web UI Preview button success is silent - low

-

- Clicking the Preview (dry-run) button in ErrorPanel.vue fires the fix endpoint and receives - a 200 response whose JSON body contains a helpful preview string (e.g., - "Server 'broken-stdio' log tail unavailable in this build..."), but the Vue component does not render - it anywhere. Users have no visual feedback that the action succeeded. -

-

Location: web/frontend/src/components/ErrorPanel.vue (or equivalent; commit - 4d92c82).

-

Suggested fix: surface response.data.preview via a DaisyUI toast, or render it - inline below the button while preview is populated.

-
- -
-

#3 · Fix endpoint body uses "mode" string, not a boolean "dry_run" - low

-

- The POST /api/v1/diagnostics/fix endpoint expects {"mode": "dry_run" | "execute", ...}, - not a boolean flag like {"dry_run": true}. This matches the implementation at - internal/httpapi/diagnostics_fix.go:27, but worth calling out explicitly in the OpenAPI description - and in any published examples to prevent API-consumer confusion. -

-

Suggested fix: add an inline note in oas/swagger.yaml and the spec's - contracts/ examples explicitly showing the string-enum field.
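
A published example might look roughly like the sketch below. The JSON field names are the ones in the request body recorded under test 2b; the fixRequest type and parseFixRequest function are hypothetical and are not the code in internal/httpapi/diagnostics_fix.go.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// fixRequest mirrors the body sent to POST /api/v1/diagnostics/fix in this
// report. The Go type itself is an illustration only.
type fixRequest struct {
	Mode     string `json:"mode"` // "dry_run" or "execute"
	Code     string `json:"code"`
	Server   string `json:"server"`
	FixerKey string `json:"fixer_key"`
}

// parseFixRequest decodes a body and rejects anything whose mode is not one
// of the two string values, including a body that sends {"dry_run": true}.
func parseFixRequest(body []byte) (fixRequest, error) {
	var req fixRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return fixRequest{}, err
	}
	if req.Mode != "dry_run" && req.Mode != "execute" {
		return fixRequest{}, errors.New(`mode must be "dry_run" or "execute"`)
	}
	return req, nil
}

func main() {
	_, err := parseFixRequest([]byte(`{"dry_run": true, "code": "MCPX_STDIO_SPAWN_ENOENT"}`))
	fmt.Println("boolean form rejected:", err != nil)

	req, err := parseFixRequest([]byte(`{"mode":"dry_run","code":"MCPX_STDIO_SPAWN_ENOENT","server":"broken-stdio","fixer_key":"stdio_show_last_logs"}`))
	fmt.Println(req.Mode, err)
}
```

The boolean form fails validation because the unknown dry_run key is ignored during decoding, which leaves mode empty.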

-
- -

Confirmed-working (no action needed)

-
    -
  • 29 diagnostic codes registered, matching list-codes -o json | jq length
  • -
  • Stdio spawn failure via /nonexistent/binary correctly classifies to MCPX_STDIO_SPAWN_ENOENT even when wrapped through /bin/zsh -l -c
  • -
  • Diagnostic payload shape matches spec: code, severity, user_message, cause, fix_steps[], docs_url, detected_at
  • -
  • server.health block populated with level: unhealthy, action: restart, matching Spec 044's action-suggestion design
  • -
  • Server list endpoint carries per-server error_code field at top level
  • -
  • doctor fix rate-limiter returns structured 429 with a request_id for correlation
  • -
- - -

Environment & Reproducibility

- -

Build context

- - - - - - - - - - - - - -
Go toolchain: go1.25.1 darwin/arm64
Binary version: v0.24.9 (reported by the daemon banner; the mcpproxy version subcommand is absent from this build, not yet ported to Cobra)
Binary size: 40,874,114 bytes (/Users/user/repos/mcpproxy-go-diagnostics-taxonomy/mcpproxy, 2026-04-24 18:45)
Worktree: /Users/user/repos/mcpproxy-go-diagnostics-taxonomy
Branch: feat/diagnostics-taxonomy @ 911704cc539e8c4965e7c1786cbcf3b0b70e0ae6
Commits-under-test: 9 (see header)
Test data dir: /tmp/mcpproxy-test-spec044 (removed during cleanup)
Test listen: 127.0.0.1:18080
Production untouched: :8080, config byte-identical to backup (diff -q)
Duration: ~45 minutes (setup + 6 CLI tests + 2 Web UI checks + cleanup)
- -

Reproducing the test setup

-
- Full setup script (copy-paste to re-run) -
-
# 1. back up production config, but don't touch it
-$ cp ~/.mcpproxy/mcp_config.json /tmp/mcp_config.json.backup-$(date +%s)
-
-# 2. build dev binary on the feature branch
-$ cd ~/repos/mcpproxy-go-diagnostics-taxonomy
-$ make build
-
-# 3. isolated data dir + config
-$ mkdir -p /tmp/mcpproxy-test-spec044
-$ cat > /tmp/mcpproxy-test-spec044/config.json <<'JSON'
-{
-  "listen": "127.0.0.1:18080",
-  "api_key": "***",
-  "enable_socket": true,
-  "enable_web_ui": true,
-  "mcpServers": [
-    {"name":"broken-stdio", "command":"/nonexistent/binary",
-     "protocol":"stdio", "enabled":true},
-    {"name":"healthy-control", "command":"echo",
-     "protocol":"stdio", "enabled":false}
-  ]
-}
-JSON
-
-# 4. launch in tmux on the isolated port + data dir
-$ tmux new-session -d -s spec044 \
-    "./mcpproxy serve -c /tmp/mcpproxy-test-spec044/config.json \
-     -d /tmp/mcpproxy-test-spec044 --log-level=debug"
-
-# 5. exercise CLI + REST + Web UI
-$ ./mcpproxy doctor list-codes -o json | jq length
-$ ./mcpproxy doctor --server broken-stdio
-$ ./mcpproxy doctor fix MCPX_STDIO_SPAWN_ENOENT --server broken-stdio
-$ curl -s -H "X-API-Key: ***" \
-    http://127.0.0.1:18080/api/v1/servers/broken-stdio/diagnostics | jq .
-
-# 6. cleanup
-$ tmux kill-session -t spec044
-$ rm -rf /tmp/mcpproxy-test-spec044
-$ diff -q ~/.mcpproxy/mcp_config.json /tmp/mcp_config.json.backup-*
-
-
- -

Raw artifacts consulted by this report

-
    -
  • /Users/user/repos/mcpproxy-go/tmp-agent-report-spec044-verify.md (structured verification report, 9,038 bytes)
  • -
  • /tmp/spec044-verify-cli.log (CLI command outputs, 10,091 bytes)
  • -
  • /tmp/mcpproxy-test-spec044/server.log (server log; not found, test data dir was cleaned up)
  • -
  • /tmp/spec044-verify-webui-*.png (Web UI screenshots; not found, captured via claude-in-chrome but not persisted to disk)
  • -
  • /tmp/spec044-verify-tray-*.png (tray screenshots; not found, Phase 3 skipped)
  • -
- - -
- Generated 2026-04-24 by spec-044 verification run · - PR #400 · - 911704c -
- -
- - From ece20fbe35f01a0db9d5c668215d6f594fe1addc Mon Sep 17 00:00:00 2001 From: Algis Dumbris Date: Fri, 22 May 2026 05:53:30 +0300 Subject: [PATCH 2/2] chore: delete spec execution_log.md residue; gitignore *.bak + execution_log.md --- .gitignore | 4 + .../043-linux-package-repos/execution_log.md | 73 -------- .../execution_log.md | 163 ------------------ specs/050-global-tools-page/execution_log.md | 17 -- 4 files changed, 4 insertions(+), 253 deletions(-) delete mode 100644 specs/043-linux-package-repos/execution_log.md delete mode 100644 specs/046-local-launcher-for-http-sse/execution_log.md delete mode 100644 specs/050-global-tools-page/execution_log.md diff --git a/.gitignore b/.gitignore index 0e38d571..c06cbcd5 100644 --- a/.gitignore +++ b/.gitignore @@ -158,3 +158,7 @@ native/macos/MCPProxy/.build/ # demo pipeline: playwright node_modules symlink (recreated at capture time) scripts/demo/node_modules + +# Transient work artifacts (brainstorm logs, editor backups) +*.bak +**/execution_log.md diff --git a/specs/043-linux-package-repos/execution_log.md b/specs/043-linux-package-repos/execution_log.md deleted file mode 100644 index 5c93db66..00000000 --- a/specs/043-linux-package-repos/execution_log.md +++ /dev/null @@ -1,73 +0,0 @@ -# Execution Log โ€” Feature 043 Linux Package Repositories - -Running per CLAUDE.md Autonomous Operation Constraints. Logging every completed step. - -## Branch -`043-linux-package-repos` - -## Tool verification (start) -- gh: OK (logged in as Dumbris, repo/workflow scopes) -- wrangler: OK (account `d2fa289033a2f6f28c550834d0fe43c5`, a.dumbris@gmail.com) -- gpg: 2.4.9 OK -- aws CLI: NOT installed locally โ€” fine for CI (ubuntu-latest has it pre-installed); for local one-time uploads we use `wrangler r2 object put`. 
- -## Phase 1: Setup (Shared Infrastructure) -Status: in_progress - -### T001 โ€” Generate GPG signing key โ€” DONE -- Fingerprint: `3B6FA1AD5D5359DA51F18DDCE1B59B9BA1CB8A3B` -- UID: `MCPProxy Packages (Linux repository signing key) ` -- Created: 2026-04-21 -- Expires: 2031-04-21 -- Keys stored in user's GnuPG keyring (`~/.gnupg/`). Batch file shredded. - -### T002 โ€” Export public key โ€” DONE -- Written to `/Users/user/repos/mcpproxy-go/contrib/signing/mcpproxy-packages.asc` (3216 bytes) - -### T003 โ€” Write backup file โ€” DONE -- Path: `~/repos/PACKAGES_GPG_PRIVATE_KEY.txt` (outside any git repo, 0600) -- Contains: metadata header, passphrase (flagged for user to move to 1Password), full usage/rotation instructions, ASCII-armored private key. -- Size: 8472 bytes, 141 lines. - -### T004-T015 โ€” R2 and credentials โ€” DONE -- R2 subscription activated (user-authorized click). -- Buckets `mcpproxy-apt` and `mcpproxy-rpm` created in EEUR region. -- Custom domains `apt.mcpproxy.app` + `rpm.mcpproxy.app` bound, both Active + Enabled. -- R2 API token "MCPProxy Packages CI" created, Object Read&Write, scoped to both buckets. -- 5 GitHub Actions secrets + 1 variable registered. -- Public signing key uploaded to both buckets (note: needed `--remote` flag on wrangler). -- HTTPS fetch of public key verified, fingerprint `3B6F A1AD 5D53 59DA 51F1 8DDC E1B5 9B9B A1CB 8A3B` matches. - -## Phase 2: Foundational โ€” DONE -Helper scripts and config files created under `contrib/linux-repos/`. - -## Phase 3: US2 โ€” Publish automation โ€” DONE -- `apt-publish.sh`, `rpm-publish.sh`, `publish.sh` written. -- Smoke tests `smoke-test-debian.sh` + `smoke-test-fedora.sh` written. -- `publish-linux-repos` job added to `.github/workflows/release.yml`. - -Bugs found and fixed during local e2e test: -1. `wrangler r2 object put` defaulted to local storage โ€” must use `--remote`. (Only affected initial setup, not CI.) -2. 
`import-key.sh` writing `GNUPGHOME=...` to `$GITHUB_ENV` doesn't help in Docker/local runs. Refactored to export a stable `GNUPGHOME` before invoking. -3. AWS CLI v2.23+ sends CRC32 checksums by default โ†’ R2 `SignatureDoesNotMatch`. Added `AWS_REQUEST_CHECKSUM_CALCULATION=when_required` and `AWS_RESPONSE_CHECKSUM_VALIDATION=when_required` to publish.sh. -4. RPM packages lacked embedded GPG signatures, failing `dnf install` with `gpgcheck=1`. Added `rpmsign --addsign` step to rpm-publish.sh (requires `rpm` package in CI image). -5. Cache TTL of 300s on metadata produced hash-mismatch windows across releases. Shortened to 60s + `must-revalidate`. - -## Phase 4: US1 verification โ€” DONE -- debian:stable-slim `apt install mcpproxy` โ†’ 0.24.6 installed successfully. -- fedora:latest `dnf install mcpproxy` โ†’ 0.24.6 installed successfully. -- GPG key imported from `https://rpm.mcpproxy.app/mcpproxy.gpg`, fingerprint verified. - -## Phase 5: Docs โ€” DONE -- Website `installation.astro` updated with apt + dnf sections. -- README.md Linux install replaced with repo-based install. -- `docs/getting-started/installation.md` updated. -- `docs/features/linux-package-repos.md` created. - -## Phase 6: Ops runbook โ€” DONE -- `docs/operations/linux-package-repos-infrastructure.md` created with rotation, manual republish, purge procedures. - -## Phase 7: Polish โ€” in_progress -- bash -n passes on all scripts. -- Local e2e smoke test passes (Debian + Fedora). -- Remaining: commit fixes, push branch, open PR, let user review. diff --git a/specs/046-local-launcher-for-http-sse/execution_log.md b/specs/046-local-launcher-for-http-sse/execution_log.md deleted file mode 100644 index a0c21d0f..00000000 --- a/specs/046-local-launcher-for-http-sse/execution_log.md +++ /dev/null @@ -1,163 +0,0 @@ -# Execution Log โ€” 046-local-launcher-for-http-sse - -State maintained per `CLAUDE.md` autonomous-operation requirement. Each -session appends a dated entry; do not rewrite history. 
- -## 2026-05-10 โ€” Initial scaffold (Roman + Claude) - -**Status**: Phase 0 + Phase 1 code landed in working tree (uncompiled โ€” -sandbox network blocks `proxy.golang.org`, see end of log). Phase 2 partial. - -### Files added - -- `internal/upstream/launcher/launcher.go` โ€” `Spec`, `Handle`, `Spawn`. Owns the - child's lifecycle (Stop with SIGTERM โ†’ grace โ†’ SIGKILL fallback, Wait, Done, - Pid). Pumps stdout+stderr line-by-line into a caller-supplied `io.Writer`, - one Write per line so a zap-bridge sink produces one log entry per line. -- `internal/upstream/launcher/launcher_unix.go` โ€” Setpgid + signal-the-pgroup - for SIGTERM/SIGKILL on Linux/macOS. -- `internal/upstream/launcher/launcher_windows.go` โ€” best-effort stubs - (matches the existing `process_windows.go` TODO; Job Objects are a - follow-up). -- `internal/upstream/launcher/wait.go` โ€” `WaitForURL` does TCP-dial polling - rather than HTTP GET (gotcha #2 in plan: SSE endpoints stream forever and - break HTTP-GET probes). -- `internal/upstream/launcher/wait_test.go` โ€” 6 cases (immediately bound, - bound late, never bound, ctx-canceled, bad URLs, default-port inference). -- `internal/upstream/launcher/launcher_test.go` โ€” 7 cases (graceful exit, - SIGKILL fallback when SIGTERM is trapped, Done on natural exit, exit-code - capture via `*exec.ExitError`, Stop idempotency, log sink capture, nil - guards). -- `internal/upstream/launcher/integration_test.go` โ€” full Spawn + WaitForURL - with a python-listener subprocess; skips when python3 is missing or on - Windows. (Pure Go testdata helper would be cleaner โ€” TODO.) -- `internal/upstream/core/connection_launcher.go` โ€” `connectWithLauncher`, - `stopLauncher`, `watchLauncher`, `buildLauncherCmd`, `loggerWriter`. - -### Files modified - -- `internal/config/config.go` โ€” `LauncherWaitTimeout Duration` on - `ServerConfig`. Default 30s when zero/unset. -- `internal/config/merge.go` โ€” `CopyServerConfig` carries the new field. 
-- `internal/upstream/core/client.go` โ€” `launcherHandle launcher.Handle` and - `launcherCIDFile string` on `Client`; new import. -- `internal/upstream/core/connection.go` โ€” pre-transport launcher dispatch - for `http`/`sse`/`streamable-http` when `Command != ""`. Stops launcher - in the connect-failure cleanup path. -- `internal/upstream/core/connection_lifecycle.go` โ€” `stopLauncher` after - the MCP-client close in Disconnect (so the child sees the network - transport go away first); also clears `processCmd`. -- `docs/configuration.md` โ€” new "Locally-launched HTTP / SSE servers" - section + back-compat behaviour matrix; `launcher_wait_timeout` row in - the Server Fields table. -- `docs/cli-management-commands.md` โ€” restart-semantics note covering the - launcher stop-then-start order. - -### Decisions / assumptions - -1. **Stdio path untouched.** Plan's Phase 0 contemplated lifting env/Docker - plumbing out of `connection_stdio.go` and routing stdio through - `launcher.Spawn`. Doing that requires reworking how mcp-go owns the - stdio process (mcp-go's `Stdio` transport spawns via a `CommandFunc` it - controls โ€” externally-spawned children can't be wired into it without - patching the upstream library). To honour the spirit of "Docker-isolation - logic must live in one place" without that reshuffling, the new - `buildLauncherCmd` reuses the same Client methods (`setupDockerIsolation`, - `injectEnvVarsIntoDockerArgs`, `insertCidfileIntoShellDockerCommand`, - `wrapWithUserShell`) the stdio path already calls. Single source of - truth, but no double-spawn risk. - -2. **Launcher-managed children stay invisible to stdio cleanup helpers.** - `connectWithLauncher` deliberately does NOT set `c.processCmd` / - `c.processGroupID`. The `launcher.Handle` owns lifecycle; setting those - would let stdio's `killProcessGroup` race with `Handle.Stop`. 
This is a - minor deviation from the original plan (which suggested wiring the same - process-group tracking) โ€” the result is cleaner ownership. - -3. **Health check is a TCP dial.** Per the plan's gotcha #2. - `addrFromURL` infers default ports for http/https/ws/wss; rejects - unknown schemes early so misconfigurations surface fast. - -4. **StopGrace default is 5s.** Plan asked for an explicit decision (open - question #2). 5s matches `processGracefulTimeout` in - `internal/upstream/core/connection.go`. No per-server override yet โ€” - `Spec.StopGrace` is plumbed but not exposed in `ServerConfig`. Promote to - config if a real-world server needs more. - -5. **Crash-while-connected โ†’ Disconnect.** `watchLauncher` calls the - `Client.Disconnect()` path on unexpected child exit (gotcha #6). - Existing reconnect logic in `internal/upstream/managed` then handles - the come-back attempt โ€” no separate launcher-internal restart loop - (open question #3 settled toward "defer to transport-level reconnect"). - -6. **Stop ctx on shutdown.** `stopLauncher` currently uses - `context.WithTimeout(context.Background(), 10s)` everywhere. Plan - open question #4 โ€” accept this default; raise the limit if shutdown - really needs to wait for slow Docker stop. 
- -### Verification round 1 (2026-05-11) - -After `sbx policy allow network proxy.golang.org,sum.golang.org` was set: - -| Command | Result | -|---|---| -| `GOTOOLCHAIN=local go vet ./internal/upstream/...` | โœ… clean | -| `GOTOOLCHAIN=local go test ./internal/upstream/launcher/...` | โœ… 15/15 | -| `GOTOOLCHAIN=local go test ./internal/upstream/...` | โœ… all packages | -| `GOTOOLCHAIN=local go test ./internal/config/...` | โœ… | -| `go test -race` | โš ๏ธ blocked โ€” cgo (gcc) not installed in sandbox; user can run on host | -| `go build ./cmd/mcpproxy` | โŒ blocked โ€” needs `storage.googleapis.com` (some Go modules CDN-served from there); user must add `sbx policy allow network storage.googleapis.com` | - -### Bugs found + fixed during verification round 1 - -1. **Deadlock in connect-failure cleanup.** `Connect` holds `c.mu` for its - entire duration; my original failure-path call to `c.stopLauncher(...)` - re-acquired the same lock โ†’ hang. Fixed by inlining the stop sequence - in `connection.go`'s cleanup branch (read fields under the held lock, - release `c.mu` around `handle.Stop()`, reacquire before return). -2. **`connectWithLauncher` redundant locking.** Same root cause โ€” - `connectWithLauncher` is called from `Connect` which already holds - `c.mu`. Removed the inner `c.mu.Lock()/Unlock()` for the launcher - field writes; the wait-for-url failure path still releases the lock - around the blocking `handle.Stop()` and reacquires before returning. -3. **`bytes.Buffer` LogSink race.** Test failures from the stdout pump, - stderr pump, and the startup-banner write all racing on a single - `*bytes.Buffer` in tests. Fixed by wrapping `LogSink` internally with - a `serializedWriter` (mutex around `Write`). zap-bridge in production - is already thread-safe, so this is a robustness fix for test sinks - and any future single-writer adapters. -4. 
**SIGKILL-fallback test could detect "ready" in the banner.** The - launcher startup banner echoes the script source verbatim, so any - marker token literally present in the script also matched in the - banner โ€” making the test think the trap was installed before the - shell even ran. Fixed by using a shell-substituted marker - (`__LNCTICK__:$$`) and a regex detector (`__LNCTICK__:[0-9]+`). -5. **`bad scheme + explicit port` test case.** Test asserted error on - `ftp://example.com:21/foo` but the launcher correctly accepts any - scheme when the port is explicit (user took responsibility). Removed - that case; replaced with the actually-invalid `ftp://example.com/foo`. - -### Outstanding network blocker - -``` -sbx policy allow network storage.googleapis.com -``` - -Needed for `go build ./cmd/mcpproxy` to fetch Bleve/Roaring/etc. CDN-backed -modules. Once allowed, the verification commands are: - -``` -GOTOOLCHAIN=local go build ./cmd/mcpproxy -./scripts/test-api-e2e.sh # optional smoke test -``` - -### Outstanding follow-ups (post-PR) - -- Replace `integration_test.go`'s python-shellout with a Go test-binary - helper invoked via `os.Args` re-entry pattern, so the test runs on any - CI that has Go (which is all of them). Plan called for a tiny binary in - `internal/upstream/launcher/testdata/`. -- Extend `scripts/test-api-e2e.sh` with a launcher-flavoured server (plan - Phase 2 item). -- Phase 3 (post-merge): `{port}` templating in `args` / `url`, per-launcher - custom health probe, exponential backoff for repeated launcher crashes. diff --git a/specs/050-global-tools-page/execution_log.md b/specs/050-global-tools-page/execution_log.md deleted file mode 100644 index 84d512d7..00000000 --- a/specs/050-global-tools-page/execution_log.md +++ /dev/null @@ -1,17 +0,0 @@ -# Execution Log โ€” Spec 050 Global Tools Page - -State file per CLAUDE.md autonomous-operation constraint. One line per completed step. 
- -- 2026-05-18 brainstormed feature with user; design approved (aggregation endpoint, v1 columns, substring search, replace orphaned Tools.vue). -- 2026-05-18 speckit.specify โ†’ spec.md + checklists/requirements.md committed (3633b6e5). SynapBus SPEC announcement posted (#my-agents-algis, msg 37429). -- 2026-05-18 CLI gap analysis: `tools list` requires --server, name+desc only; no per-tool enable/disable CLI. Decision: fold CLI parity into spec 050 (same endpoint/feature), not a new spec. -- 2026-05-18 backend impl (T002-T012): AggregateToolUsage + GET /api/v1/tools + helper refactor; httpapi/storage/runtime/server tests GREEN, lint clean. -- 2026-05-18 fanned out frontend (Tools.vue rewrite, /tools route, sidebar badge, US1-3) + CLI (US4 global list + enable/disable) subagents; both reported GREEN (frontend build clean, cmd tests pass). -- 2026-05-18 live curl: found+fixed false partial:true โ€” global handler now uses mgmt-service GetServerTools (like per-server endpoint) so disabled/not-connected servers yield 0 tools, not a 'failed' flag. Re-verified: 13 tools, partial absent, stats consistent. CLI table OK. -- 2026-05-18 FR-001 note: a disabled server that was NEVER connected has no tools anywhere (index empty, per-server endpoint returns 0). Showing its tools is impossible by any path; this is an inherent limitation, distinct from the 'server errored -> partial' edge case (now correctly separated). Documented as refined assumption. -- 2026-05-18 API E2E: GET /api/v1/tools PASS. 10 unrelated pre-existing/environmental failures (upstream_servers env/args/headers CRUD hitting example.com, flaky activity/{id}) โ€” none in tools code paths. -- 2026-05-18 Playwright sweep 5/5 GREEN (loaded table, search, sort, batch-bar+disable, empty state); self-contained report.html + screenshots committed under verification/. -- 2026-05-18 chrome-ext live check: page matches issue #437 mockup (sidebar Tools badge=13, 4 stat cards, filter bar, dense table). 
Verified batch-disable works end-to-end (Playwright run disabled all 13; curl confirms disabled:13, frontend cards reflect backend stats โ€” consistent, no bug). -- 2026-05-18 final: golangci-lint 0 issues, frontend build clean, go tests GREEN. Ready for PR. -- 2026-05-18 PR #481 opened. First run: check-size red (CLAUDE.md >40k, pre-existing-adjacent โ€” main was 39605, my additions pushed over). Trimmed my own footprint (condensed CLI note + auto agent-context lines) โ†’ 39928/40000. -- 2026-05-18 CI FULLY GREEN: all builds (6 platforms), Unit Tests 9 OS/Go combos, Integration, E2E, OAuth E2E, Cross-Platform Logging, Lint, Verify OpenAPI, Build Frontend, check-size โ€” NO failures (Stress Tests skipped, normal). PR #481 awaiting human review.