[refactor] Semantic function clustering: verified duplicates, scattered helpers & cross-util reimplementations #36022

@github-actions

Overview

Semantic function-clustering pass over the 702 non-test .go files in pkg/ (391 pkg/workflow, 311 pkg/cli, plus util packages). Thematic finder agents fanned out over file clusters (compiler, safe-outputs, engines, validation, MCP, codemods, audit/logs, cross-util); every duplicate candidate was then adversarially verified by reading both function bodies. Line references below are real and current.

Headline: the repo is already very well organized — feature-per-file is applied consistently, validation is split per concern, entity handlers are one-per-file, codemods share a factory, engine log-parsing shares metrics helpers. Findings are a focused set of genuine duplicates plus a few cross-util reimplementations, not a structural problem.

Summary

  • Confirmed exact/near duplicates: 3 (P1)
  • Cross-util reimplementations: 6 sites (P2)
  • Misplaced/outlier helper: 1 (P2)
  • Helper-scatter groups: 3 (P2/P3)
  • Candidates investigated and rejected as legitimate: 5
  • Status: ⚠️ minor, low-risk cleanups only

High-Confidence Findings (P1)

1. Two byte-identical YAML single-quote escapers

Both are literally strings.ReplaceAll(x, "'", "''"):

  • escapeSingleQuotedYAMLString in pkg/workflow/central_slash_command_workflow.go:500
  • escapeYAMLSingleQuotedScalar in pkg/workflow/observability_otlp.go:24

The same one-liner is inlined at many sites (engine_helpers.go:274, expression_nodes.go:172, cache.go, compiler_experiments.go, compiler_yaml.go, shell.go).

Fix: keep one canonical escaper in yaml_env_helpers.go, delete the duplicate, route inline uses through it. ⚠️ Do not fold in escapeSingleQuote (redact_secrets.go:22) — verified different (backslash-style '\' for another context); keep separate.
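A minimal sketch of what the single canonical escaper could look like. The function name escapeYAMLSingleQuoted is a placeholder chosen for illustration; the issue does not say which of the two existing names should survive.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeYAMLSingleQuoted doubles every single quote so the value can be
// embedded inside a YAML single-quoted scalar. The name is hypothetical;
// the body is the one-liner quoted in the finding.
func escapeYAMLSingleQuoted(s string) string {
	return strings.ReplaceAll(s, "'", "''")
}

func main() {
	// An inline call site would change from strings.ReplaceAll(v, "'", "''")
	// to a call to the shared helper.
	fmt.Printf("title: '%s'\n", escapeYAMLSingleQuoted("it's a test"))
}
```

Routing the inline sites through one function also gives a single place to test the escaping rule.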

2. RenderMCPConfig near-identical across 5 engines

Verified: claude/gemini/crush/opencode/antigravity each = log line + delegate to renderDefaultJSONMCPConfig(...), differing only in the path string.

  • claude_mcp.go:12, gemini_mcp.go:12, antigravity_mcp.go:12 → ${RUNNER_TEMP}/gh-aw/mcp-config/mcp-servers.json
  • crush_mcp.go:12, opencode_mcp.go:12 → /tmp/gh-aw/mcp-config/mcp-servers.json

(Codex/Copilot are legitimately different — TOML / Copilot fields — and excluded.)

Fix: a shared base method/helper taking just the path, collapsing the 5 bodies to one-line delegations.

3. call_workflow_validation.go vs dispatch_workflow_validation.go (~90% parallel)

Two ~200-line files with byte-identical error templates (only call-workflow/workflow_call vs dispatch-workflow/workflow_dispatch differ) and the same read→parse-YAML→check-on→check-trigger logic:

  • "not found" error: call_...:78 vs dispatch_...:63
  • "does not support trigger": call_...:113,145,163 vs dispatch_...:104,133

Fix: extract formatWorkflowNotFoundError, formatMissingTriggerError, and a shared readAndValidateWorkflowTrigger into validation_helpers.go.
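A rough sketch of how the two error formatters could be parameterized on the only parts that differ. The workflowRefKind type and the message wording are illustrative assumptions; the actual templates in the two files are not reproduced in this issue and should be copied from the source when extracting.

```go
package main

import "fmt"

// workflowRefKind (hypothetical) captures the two strings that differ
// between the call-workflow and dispatch-workflow validators.
type workflowRefKind struct {
	Feature string // "call-workflow" or "dispatch-workflow"
	Trigger string // "workflow_call" or "workflow_dispatch"
}

var (
	callWorkflowKind     = workflowRefKind{Feature: "call-workflow", Trigger: "workflow_call"}
	dispatchWorkflowKind = workflowRefKind{Feature: "dispatch-workflow", Trigger: "workflow_dispatch"}
)

// formatWorkflowNotFoundError builds the shared "not found" error.
// The wording is a placeholder, not the real template.
func formatWorkflowNotFoundError(kind workflowRefKind, name string) error {
	return fmt.Errorf("%s: workflow %q not found", kind.Feature, name)
}

// formatMissingTriggerError builds the shared "does not support trigger"
// error. The wording is a placeholder, not the real template.
func formatMissingTriggerError(kind workflowRefKind, name string) error {
	return fmt.Errorf("%s: workflow %q does not support the %s trigger",
		kind.Feature, name, kind.Trigger)
}

func main() {
	fmt.Println(formatWorkflowNotFoundError(callWorkflowKind, "build"))
	fmt.Println(formatMissingTriggerError(dispatchWorkflowKind, "deploy"))
}
```

A shared readAndValidateWorkflowTrigger could take the same kind value and run the read, parse, and check steps once for both callers.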

Cross-Util Reimplementations & Outlier (P2)

4. stringutil.Truncate reimplemented inline (exact match)

stringutil.go:20: Truncate(s,100) == s[:97]+"...". Reimplemented verbatim in pkg/workflow/workflow_errors.go:66-68 and :173-175 (if len(v) > 100 { v = v[:97]+"..." }). Fix: stringutil.Truncate(e.Value, 100).

5. sliceutil.Deduplicate/MergeUnique reimplemented with ad-hoc seen maps

pkg/sliceutil exports Deduplicate, MergeUnique, Filter, Map. Hand-rolled dedup loops in: pkg/cli/compile_schedule_calendar.go:43-52 & :133-142; pkg/cli/codemod_workflow_run_branches.go:168-188; pkg/workflow/awf_helpers.go:687-692; pkg/cli/logs_report_errors.go:21-27 (borderline — already a 3-line slices.Contains guard). Fix: use the util fns (compose Map/Filter for trim+drop-empty).

6. Outlier: isDescendant in a feature file but shared by 8+ codemods

Defined pkg/cli/codemod_github_app.go:186 (pure indentation util, no github-app logic), used from codemod_github_repos.go, codemod_mount_as_clis.go, codemod_user_rate_limit.go, codemod_github_app_client_id.go, and the three codemod_safe_output_* files. Fix: move to yaml_frontmatter_utils.go beside getIndentation/hasExitedBlock.

Scattered Helpers (P2 / P3)

7. JSONL scan skeleton duplicated → generic ParseJSONLFile[T]

parseEventsJSONL (gateway_logs_timeline.go:393) and parseEventsJSONLFile (copilot_events_jsonl.go:174) both do filepath.Clean → bufio.NewScanner → Buffer(maxScannerBufferSize) → per-line json.Unmarshal; only element type/return differ (skeleton recurs again at gateway_logs_timeline.go:322). A generic ParseJSONLFile[T](path) ([]T, error) absorbs the boilerplate.

8. pkg/cli formatting & severity helper scatter (P3)

Percent: formatPercent (audit_math_helpers.go:17) vs formatForecastPercent (forecast.go:1020). Severity→icon: renderSeverityIcon (audit_cross_run_render.go:336) vs getSeverityIcon (deps_security.go:211). Severity/rank→int: severityWeight (deps_security.go:227), severityRank (logs_episode.go:591), seedKindRank/seedConfidenceRank (logs_episode.go:351,364). Fix (optional): parameterized format_helpers.go/severity_helpers.go; output styles intentionally vary, so only with presets.

9. Codemod nested-block walking boilerplate

The 3-level YAML state-tracking walk (safe-outputs: → handler → field) repeats across codemod_safe_output_add_reviewer_allowlists.go, codemod_safe_output_merge_pr_constraints.go, codemod_safe_output_require_title_prefix.go; the insertion variant across codemod_cli_proxy_mode.go/codemod_difc_proxy.go. Fix: renameFieldInNestedBlock(...) + insertFieldIntoNestedBlock(...) near codemod_factory.go (~60–80 lines/file). Field-removal/top-level codemods already share the factory well — only nested-handler cases are missing.

Verified-and-Rejected (not defects)

  • escapeSingleQuote — different escaping scheme; keep separate.
  • computeGeminiToolsCore vs computeAntigravityToolsCore: identical because both CLIs genuinely share tool naming; extracting adds indirection only.
  • Per-provider *_installer.go differences — Copilot has expression handling Antigravity lacks.
  • Engine ParseLogMetrics — already share metrics.go; remaining differences reflect real log formats.
  • sortedMapKeys/inline sort.Strings — idiomatic Go; not worth a helper.

Implementation Checklist

  • P1.1 consolidate YAML single-quote escapers
  • P1.2 shared RenderMCPConfig base for the 5 JSON-MCP engines
  • P1.3 extract call/dispatch workflow validation helpers
  • P2.4 adopt stringutil.Truncate in workflow_errors.go
  • P2.5 adopt sliceutil.Deduplicate/MergeUnique at the 4 dedup sites
  • P2.6 relocate isDescendant to shared utils
  • P2.7 generic ParseJSONLFile[T]
  • P2.9 nested-block codemod helpers
  • go build ./... + package tests after each change

Metadata

702 files analyzed · method: thematic fan-out finders + adversarial per-duplicate verification (both bodies read) + synthesis · confirmed duplicates 3 · cross-util sites 6 · outliers 1 · helper-scatter groups 3 · rejected 5 · 2026-05-31. Supersedes closed [refactor] #35822.

Generated by 🔧 Semantic Function Refactoring · opus48 2.5M ·

  • expires on Jun 2, 2026, 12:13 AM UTC
