Overview
Semantic function-clustering pass over the 702 non-test .go files in pkg/ (391 pkg/workflow, 311 pkg/cli, plus util packages). Thematic finder agents fanned out over file clusters (compiler, safe-outputs, engines, validation, MCP, codemods, audit/logs, cross-util); every duplicate candidate was then adversarially verified by reading both function bodies. Line references below are real and current.
Headline: the repo is already very well organized — feature-per-file is applied consistently, validation is split per concern, entity handlers are one-per-file, codemods share a factory, engine log-parsing shares metrics helpers. Findings are a focused set of genuine duplicates plus a few cross-util reimplementations, not a structural problem.
Summary
- Confirmed exact/near duplicates: 3 (P1)
- Cross-util reimplementations: 6 sites (P2)
- Misplaced/outlier helper: 1 (P2)
- Helper-scatter groups: 3 (P2/P3)
- Candidates investigated and rejected as legitimate: 5
- Status: ⚠️ minor, low-risk cleanups only
High-Confidence Findings (P1)
1. Two byte-identical YAML single-quote escapers
Both are literally strings.ReplaceAll(x, "'", "''"):
escapeSingleQuotedYAMLString — pkg/workflow/central_slash_command_workflow.go:500
escapeYAMLSingleQuotedScalar — pkg/workflow/observability_otlp.go:24
The same one-liner is inlined at many sites (engine_helpers.go:274, expression_nodes.go:172, cache.go, compiler_experiments.go, compiler_yaml.go, shell.go).
Fix: keep one canonical escaper in yaml_env_helpers.go, delete the duplicate, route inline uses through it. ⚠️ Do not fold in escapeSingleQuote (redact_secrets.go:22) — verified different (backslash-style '→\' for another context); keep separate.
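A minimal sketch of what the single canonical escaper could look like. The function names `escapeYAMLSingleQuoted` and `quoteYAMLSingle` are hypothetical; the report only says that one escaper should survive in yaml_env_helpers.go, and the body is the `strings.ReplaceAll` one-liner quoted above.

```go
package main

import "strings"

// escapeYAMLSingleQuoted doubles every single quote so the result is safe
// inside a single-quoted YAML scalar. The name is illustrative only.
func escapeYAMLSingleQuoted(s string) string {
	return strings.ReplaceAll(s, "'", "''")
}

// quoteYAMLSingle is a hypothetical convenience wrapper (not named in the
// report) that returns the complete quoted scalar, so inline call sites do
// not have to add the surrounding quotes themselves.
func quoteYAMLSingle(s string) string {
	return "'" + escapeYAMLSingleQuoted(s) + "'"
}
```

Inline sites that currently call `strings.ReplaceAll(x, "'", "''")` directly would call the helper instead. The backslash-style escapeSingleQuote in redact_secrets.go stays untouched.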
2. RenderMCPConfig near-identical across 5 engines
Verified: claude/gemini/crush/opencode/antigravity each = log line + delegate to renderDefaultJSONMCPConfig(...), differing only in the path string.
claude_mcp.go:12, gemini_mcp.go:12, antigravity_mcp.go:12 → ${RUNNER_TEMP}/gh-aw/mcp-config/mcp-servers.json
crush_mcp.go:12, opencode_mcp.go:12 → /tmp/gh-aw/mcp-config/mcp-servers.json
(Codex/Copilot are legitimately different — TOML / Copilot fields — and excluded.)
Fix: a shared base method/helper taking just the path, collapsing the 5 bodies to one-line delegations.
3. call_workflow_validation.go ≈ dispatch_workflow_validation.go (~90% parallel)
Two ~200-line files with byte-identical error templates (only call-workflow/workflow_call vs dispatch-workflow/workflow_dispatch differ) and the same read→parse-YAML→check-on→check-trigger logic:
- "not found" error: call_...:78 ≡ dispatch_...:63
- "does not support trigger": call_...:113,145,163 ≡ dispatch_...:104,133
Fix: extract formatWorkflowNotFoundError, formatMissingTriggerError, and a shared readAndValidateWorkflowTrigger into validation_helpers.go.
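A sketch of how the two formatters could be parameterized by the only strings that differ between the files. The `workflowKind` type and the exact message wording are assumptions for illustration; the real templates live in the two validation files and are not reproduced in this report.

```go
package main

import "fmt"

// workflowKind carries the two strings that differ between the call and
// dispatch validators. The type itself is hypothetical.
type workflowKind struct {
	Feature string // "call-workflow" or "dispatch-workflow"
	Trigger string // "workflow_call" or "workflow_dispatch"
}

var (
	callWorkflowKind     = workflowKind{Feature: "call-workflow", Trigger: "workflow_call"}
	dispatchWorkflowKind = workflowKind{Feature: "dispatch-workflow", Trigger: "workflow_dispatch"}
)

// formatWorkflowNotFoundError builds the shared "not found" error.
// The message text here is placeholder wording, not the repo's template.
func formatWorkflowNotFoundError(kind workflowKind, name string) error {
	return fmt.Errorf("%s: workflow %q not found", kind.Feature, name)
}

// formatMissingTriggerError builds the shared "does not support trigger"
// error. Again, the wording is a placeholder.
func formatMissingTriggerError(kind workflowKind, name string) error {
	return fmt.Errorf("%s: workflow %q does not support the %s trigger",
		kind.Feature, name, kind.Trigger)
}
```

A shared readAndValidateWorkflowTrigger could take the same `workflowKind` value, so each validation file shrinks to choosing a kind and calling the helper.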
Cross-Util Reimplementations & Outlier (P2)
4. stringutil.Truncate reimplemented inline (exact match)
stringutil.go:20 — Truncate(s,100) == s[:97]+"...". Reimplemented verbatim in pkg/workflow/workflow_errors.go:66-68 and :173-175 (if len(v) > 100 { v = v[:97]+"..." }). Fix: stringutil.Truncate(e.Value, 100).
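For reference, a stand-in that reproduces the behavior described above (`Truncate(s,100) == s[:97]+"..."`). This is not the actual stringutil source; the lowercase name `truncate` and the handling of very small limits are assumptions.

```go
package main

// truncate returns s unchanged when it fits in maxLen bytes; otherwise it
// keeps the first maxLen-3 bytes and appends "...". It slices by byte, like
// the inline code it replaces, so a multi-byte UTF-8 rune at the cut point
// can be split.
func truncate(s string, maxLen int) string {
	if maxLen <= 0 {
		return ""
	}
	if len(s) <= maxLen {
		return s
	}
	if maxLen <= 3 {
		// Not enough room for an ellipsis (assumed behavior).
		return s[:maxLen]
	}
	return s[:maxLen-3] + "..."
}
```

With a helper of this shape, each `if len(v) > 100 { v = v[:97]+"..." }` block in workflow_errors.go becomes a single call.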
5. sliceutil.Deduplicate/MergeUnique reimplemented with ad-hoc seen maps
pkg/sliceutil exports Deduplicate, MergeUnique, Filter, Map. Hand-rolled dedup loops in: pkg/cli/compile_schedule_calendar.go:43-52 & :133-142; pkg/cli/codemod_workflow_run_branches.go:168-188; pkg/workflow/awf_helpers.go:687-692; pkg/cli/logs_report_errors.go:21-27 (borderline — already a 3-line slices.Contains guard). Fix: use the util fns (compose Map/Filter for trim+drop-empty).
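The trim + drop-empty + dedup composition might look like the sketch below. The three generic helpers are local stand-ins with assumed signatures, since the real pkg/sliceutil signatures are not shown in this report, and `normalizeBranches` is a hypothetical call site.

```go
package main

import "strings"

// mapSlice applies f to every element (stand-in for sliceutil.Map).
func mapSlice[T, U any](in []T, f func(T) U) []U {
	out := make([]U, 0, len(in))
	for _, v := range in {
		out = append(out, f(v))
	}
	return out
}

// filterSlice keeps the elements for which keep returns true
// (stand-in for sliceutil.Filter).
func filterSlice[T any](in []T, keep func(T) bool) []T {
	out := make([]T, 0, len(in))
	for _, v := range in {
		if keep(v) {
			out = append(out, v)
		}
	}
	return out
}

// deduplicate removes repeats while preserving first-seen order
// (stand-in for sliceutil.Deduplicate).
func deduplicate[T comparable](in []T) []T {
	seen := make(map[T]struct{}, len(in))
	out := make([]T, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

// normalizeBranches replaces a hand-rolled loop that trims, skips empties,
// and tracks a seen map.
func normalizeBranches(raw []string) []string {
	trimmed := mapSlice(raw, strings.TrimSpace)
	nonEmpty := filterSlice(trimmed, func(s string) bool { return s != "" })
	return deduplicate(nonEmpty)
}
```

The composition allocates two intermediate slices, which is normally negligible for short configuration lists like these.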
6. Outlier: isDescendant in a feature file but shared by 8+ codemods
Defined pkg/cli/codemod_github_app.go:186 (pure indentation util, no github-app logic), used from codemod_github_repos.go, codemod_mount_as_clis.go, codemod_user_rate_limit.go, codemod_github_app_client_id.go, and the three codemod_safe_output_* files. Fix: move to yaml_frontmatter_utils.go beside getIndentation/hasExitedBlock.
Scattered Helpers (P2 / P3)
7. JSONL scan skeleton duplicated → generic ParseJSONLFile[T]
parseEventsJSONL (gateway_logs_timeline.go:393) and parseEventsJSONLFile (copilot_events_jsonl.go:174) both do filepath.Clean → bufio.NewScanner → Buffer(maxScannerBufferSize) → per-line json.Unmarshal; only element type/return differ (skeleton recurs again at gateway_logs_timeline.go:322). A generic ParseJSONLFile[T](path) ([]T, error) absorbs the boilerplate.
8. pkg/cli formatting & severity helper scatter (P3)
- Percent: formatPercent (audit_math_helpers.go:17) vs formatForecastPercent (forecast.go:1020).
- Severity→icon: renderSeverityIcon (audit_cross_run_render.go:336) vs getSeverityIcon (deps_security.go:211).
- Severity/rank→int: severityWeight (deps_security.go:227), severityRank (logs_episode.go:591), seedKindRank/seedConfidenceRank (logs_episode.go:351,364).
Fix (optional): parameterized format_helpers.go/severity_helpers.go. Output styles intentionally vary, so consolidate only if each call site can select its existing style through a preset.
9. Codemod nested-block walking boilerplate
The 3-level YAML state-tracking walk (safe-outputs: → handler → field) repeats across codemod_safe_output_add_reviewer_allowlists.go, codemod_safe_output_merge_pr_constraints.go, codemod_safe_output_require_title_prefix.go; the insertion variant across codemod_cli_proxy_mode.go/codemod_difc_proxy.go. Fix: renameFieldInNestedBlock(...) + insertFieldIntoNestedBlock(...) near codemod_factory.go (~60–80 lines/file). Field-removal/top-level codemods already share the factory well — only nested-handler cases are missing.
Verified-and-Rejected (not defects)
- escapeSingleQuote: different escaping scheme; keep separate.
- computeGeminiToolsCore ≈ computeAntigravityToolsCore: identical because both CLIs genuinely share tool naming; extracting adds indirection only.
- Per-provider *_installer.go differences: Copilot has expression handling Antigravity lacks.
- Engine ParseLogMetrics: already share metrics.go; remaining differences reflect real log formats.
- sortedMapKeys/inline sort.Strings: idiomatic Go; not worth a helper.
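Implementation Checklist
- Shared RenderMCPConfig base for the 5 JSON-MCP engines
- Use stringutil.Truncate in workflow_errors.go
- Use sliceutil.Deduplicate/MergeUnique at the 4 dedup sites
- Move isDescendant to shared utils
- Add ParseJSONLFile[T]
- Run go build ./... + package tests after each change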
Metadata
702 files analyzed · method: thematic fan-out finders + adversarial per-duplicate verification (both bodies read) + synthesis · confirmed duplicates 3 · cross-util sites 6 · outliers 1 · helper-scatter groups 3 · rejected 5 · 2026-05-31. Supersedes closed [refactor] #35822.
Generated by 🔧 Semantic Function Refactoring · opus48 2.5M