Refactor ParseWorkflowFile orchestration into focused helper phases #36012

Merged
pelikhan merged 8 commits into main from copilot/lint-monster-refactor-functions
May 31, 2026

Conversation

Contributor

Copilot AI commented May 30, 2026

The custom large-function lint pass flagged orchestration methods in compiler_*.go; this change targets ParseWorkflowFile by decomposing its control flow without changing compiler behavior. The goal is to reduce function complexity while keeping existing validation, merge precedence, and error formatting semantics intact.

  • Orchestration decomposition

    • Split ParseWorkflowFile into explicit phases:
      • parse-result type gating (shared / redirect-only)
      • setup (engine + imports, tools/markdown processing, initial WorkflowData)
      • validation (model alias, engine/runtime checks, tool-policy checks)
      • population (YAML extraction, import merges, step/service merges, on-section processing)
    • Introduced workflowBuildContext to carry phase state instead of passing long parameter lists.
  • Behavior-preserving extraction of dense merge logic

    • Isolated imported observability merge flow into dedicated helpers:
      • raw observability map extraction
      • endpoint dedupe/precedence merge
      • write-back of the merged OTLP endpoints to frontmatter
    • Isolated env merge/source-attribution logic into focused helpers, including the no-imports path.
  • Error handling parity

    • Preserved existing formatted-error behavior by extracting engine/tools error wrappers that keep:
      • frontmatter engine: line targeting
      • contextual source rendering
      • no double-formatting of already-formatted compiler errors.

```go
ctx := newWorkflowBuildContext(parseResult)
if err := c.setupWorkflowBuildContext(ctx); err != nil { return nil, err }
if err := c.validateWorkflowBuildContext(ctx); err != nil { return nil, err }
if err := c.populateWorkflowBuildContext(ctx); err != nil { return nil, err }
return ctx.workflowData, nil
```

Copilot AI and others added 2 commits May 31, 2026 00:10
Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Refactor compiler functions to reduce complexity" to "Refactor ParseWorkflowFile orchestration into focused helper phases" May 31, 2026
Copilot AI requested a review from gh-aw-bot May 31, 2026 00:12
@github-actions github-actions Bot mentioned this pull request May 31, 2026
@pelikhan pelikhan marked this pull request as ready for review May 31, 2026 01:32
Copilot AI review requested due to automatic review settings May 31, 2026 01:32
@github-actions
Contributor

github-actions Bot commented May 31, 2026

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions
Contributor

github-actions Bot commented May 31, 2026

🧪 Test Quality Sentinel completed test quality analysis.

No test files were added or modified in this PR. The only changed file is pkg/workflow/compiler_orchestrator_workflow.go (a production file). Test Quality Sentinel skipped.

@github-actions
Contributor

github-actions Bot commented May 31, 2026

Design Decision Gate 🏗️ completed the design decision gate check.

@github-actions
Contributor

github-actions Bot commented May 31, 2026

PR Code Quality Reviewer completed the code quality review.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

🏗️ Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (276 new lines in pkg/workflow/) but does not have a linked Architecture Decision Record (ADR).

📄 Draft ADR committed: docs/adr/36012-decompose-parseworkflowfile-into-build-context-phases.md — review and complete it before merging.

🔒 This PR cannot merge until an ADR is linked in the PR body.

📋 What to do next
  1. Review the draft ADR committed to your branch — it was generated from the PR diff (the four-phase workflowBuildContext decomposition).
  2. Complete the missing sections — confirm the behavior-preserving claim is backed by existing compiler tests, refine the decision rationale, and adjust the alternatives if you weighed others.
  3. Commit the finalized ADR to docs/adr/ on your branch.
  4. Reference the ADR in this PR body by adding a line such as:

    ADR: ADR-36012: Decompose ParseWorkflowFile Orchestration into Build-Context Phases

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

❓ Why ADRs Matter

ADRs create a searchable, permanent record of why the codebase looks the way it does. This decomposition establishes a workflowBuildContext orchestration pattern (cf. ADR-35812) that future refactors in this package are likely to follow — worth recording explicitly. Future contributors (and your future self) will thank you.

📋 Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

  • Context — What is the problem? What forces are at play?
  • Decision — What did you decide? Why?
  • Alternatives Considered — What else could have been done?
  • Consequences — What are the trade-offs (positive and negative)?

All ADRs are stored in docs/adr/ as Markdown files numbered by PR number (e.g., 36012-...md for PR #36012).

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · opus48 643.1K ·

Contributor

@github-actions github-actions Bot left a comment

Skills-Based Review 🧠

Applied /zoom-out and /improve-codebase-architecture — commenting with suggestions; no blocking issues.

📋 Key Themes & Highlights

Key Themes

  • Opaque return tuple (mergeRawOTLPEndpoints): the two int return positions carry implicit meaning that's invisible at the call site — named returns or consolidating the log counts into applyMergedRawObservability would improve readability.
  • Silent error discard (mergeImportedObservability): the lenient json.Unmarshal pattern is preserved from the original, but in a named helper it's easier to miss. A log line would keep future debuggers informed.
  • Fragile ordering contract (validateWorkflowEngineSettings): the comment asking callers to keep the checks slice in legacy order is an implicit contract with no enforcement; a brief test or expanded comment would lock it down.
  • Untested pure functions: buildMergedEnvSources, setMainWorkflowEnvSources, and mergeRawOTLPEndpoints are now independently unit-testable but have no coverage. The refactor makes these easy to test — worth taking the opportunity.
  • Weak field name (workflowBuildContext.result): renaming to frontmatter would align with the project's existing vocabulary and avoid shadowing common result local names.

Positive Highlights

  • ✅ Clean four-phase decomposition (gate → setup → validate → populate) makes the compilation pipeline easy to reason about at a glance.
  • ✅ validateWorkflowEngineSettings using a []func(*WorkflowData) error slice is an elegant way to unify the seven engine checks without repetition.
  • ✅ Error formatting helpers (formatEngineSetupError, formatToolsProcessingError) correctly preserve the double-formatting guard — good attention to existing semantics.
  • ✅ mergeWorkflowEnv inverted the condition to handle the no-imports fast path first — more readable control flow than the original if/else if.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.7M

envSources := make(map[string]string, len(topEnv))
for key := range topEnv {
envSources[key] = "(main workflow)"
func mergeRawOTLPEndpoints(mainObs map[string]any, importedObs map[string]any) ([]any, int, int) {

[/improve-codebase-architecture] mergeRawOTLPEndpoints returns an opaque ([]any, int, int) tuple — the two int positions are meaningful (mainCount, importAdded) but invisible at the call site.

💡 Suggestion: named returns or a small struct

The two integers are only used for a log line in applyMergedRawObservability. Named return values would clarify the contract without adding a struct:

```go
func mergeRawOTLPEndpoints(mainObs, importedObs map[string]any) (merged []any, mainCount, importAdded int) {
    ...
    return
}
```

Alternatively, since mainCount and importAdded are used exclusively for logging, they could be dropped from the return and computed inside applyMergedRawObservability directly. That would reduce the coupling between the two functions.
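
The "small struct" option mentioned in the suggestion title is not shown above, so here is one way it could look. This is a sketch under assumptions: `otlpMergeResult` and `mergeEndpoints` are hypothetical names, the inputs are simplified to endpoint slices rather than raw observability maps, and the dedupe rule (first occurrence of a `url` wins, main workflow first) is assumed from the PR description's "endpoint dedupe/precedence merge".

```go
package main

import "fmt"

// otlpMergeResult is a hypothetical result type that names each value
// the merge produces, so call sites read res.MainCount instead of
// relying on tuple position.
type otlpMergeResult struct {
	Endpoints   []map[string]any
	MainCount   int // endpoints kept from the main workflow
	ImportAdded int // endpoints added from imports after dedupe
}

// mergeEndpoints keeps main endpoints first, then appends imported
// endpoints whose "url" has not been seen yet.
func mergeEndpoints(mainEps, importedEps []map[string]any) otlpMergeResult {
	var res otlpMergeResult
	seen := make(map[string]bool)
	for _, ep := range mainEps {
		if url, _ := ep["url"].(string); url != "" && !seen[url] {
			seen[url] = true
			res.Endpoints = append(res.Endpoints, ep)
		}
	}
	res.MainCount = len(res.Endpoints)
	for _, ep := range importedEps {
		if url, _ := ep["url"].(string); url != "" && !seen[url] {
			seen[url] = true
			res.Endpoints = append(res.Endpoints, ep)
			res.ImportAdded++
		}
	}
	return res
}

func main() {
	res := mergeEndpoints(
		[]map[string]any{{"url": "https://a.example"}},
		[]map[string]any{{"url": "https://a.example"}, {"url": "https://b.example"}},
	)
	// prints: main=1 importAdded=1 total=2
	fmt.Printf("main=%d importAdded=%d total=%d\n", res.MainCount, res.ImportAdded, len(res.Endpoints))
}
```

Compared with named returns, a struct costs one extra type but lets new fields be added later without touching every call site.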

return
}
var importedObs map[string]any
if err := json.Unmarshal([]byte(mergedObservability), &importedObs); err != nil {

[/zoom-out] mergeImportedObservability silently swallows the JSON unmarshal error on line 244. This was the original behaviour, but now that the logic lives in a named helper it's less obvious — a future caller won't know the function is a no-op when the input is malformed.

💡 Suggestion: log the error

```go
if err := json.Unmarshal([]byte(mergedObservability), &importedObs); err != nil {
    orchestratorWorkflowLog.Printf("skipping imported observability merge: invalid JSON: %v", err)
    return
}
```

This preserves the existing lenient behaviour while making unexpected failures visible in debug output.

if err := c.validateEngineMCPToolTimeout(workflowData); err != nil {
return nil, fmt.Errorf("%s: %w", cleanPath, err)
func (c *Compiler) validateWorkflowEngineSettings(cleanPath string, workflowData *WorkflowData) error {
// Keep this order aligned with legacy ParseWorkflowFile behavior.

[/improve-codebase-architecture] The comment // Keep this order aligned with legacy ParseWorkflowFile behavior. is a fragile implicit contract. If execution order matters (e.g., a later validator assumes an earlier one already returned an error), there's nothing to enforce it.

💡 Suggestion: make the contract testable

Add an integration test that exercises the specific ordering that matters (e.g., validateRunInstallScripts fires before validateEngineVersion). Even a simple sequence-of-errors test would prevent silent reordering.

Alternatively, if order genuinely does not matter and the comment is just defensive, replace it with a note explaining why order is irrelevant, which is equally useful.
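
To make the ordering concern concrete, here is a minimal sketch of a slice-of-checks runner where the first failing check determines the reported error. The types and check functions (`workflowData`, `checkEngine`, `checkVersion`, `runChecks`) are stand-ins invented for illustration, not the real validators.

```go
package main

import (
	"errors"
	"fmt"
)

// workflowData is a stand-in for the real WorkflowData type.
type workflowData struct {
	engine  string
	version string
}

// runChecks runs validators in slice order and stops at the first error,
// so the slice order decides which error the user sees.
func runChecks(data *workflowData, checks []func(*workflowData) error) error {
	for _, check := range checks {
		if err := check(data); err != nil {
			return err
		}
	}
	return nil
}

func checkEngine(d *workflowData) error {
	if d.engine == "" {
		return errors.New("engine is required")
	}
	return nil
}

func checkVersion(d *workflowData) error {
	if d.version == "" {
		return errors.New("version is required")
	}
	return nil
}

func main() {
	// Both checks would fail here; which error surfaces depends only on order.
	data := &workflowData{}
	err := runChecks(data, []func(*workflowData) error{checkEngine, checkVersion})
	fmt.Println(err) // engine is required
}
```

A sequence-of-errors test along these lines feeds in input that fails several checks at once and asserts which message comes back, which is enough to catch a silent reorder.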


// workflowBuildContext captures the shared state across parse setup, validation,
// and workflow population phases.
type workflowBuildContext struct {

[/zoom-out] The result field on workflowBuildContext carries *parser.FrontmatterResult — the name result doesn't communicate what it holds, and it shadows the common local variable name result used throughout the old code.

💡 Suggestion: rename to `frontmatter`

```go
type workflowBuildContext struct {
    cleanPath    string
    content      []byte
    frontmatter  *parser.FrontmatterResult  // was: result
    markdownDir  string
    ...
}
```

This makes ctx.frontmatter.Frontmatter read naturally and avoids the ambiguity with generic result locals in other functions that receive ctx.


orchestratorWorkflowLog.Printf("Workflow file parsing completed successfully: %s", markdownPath)
return workflowData, nil
func buildMergedEnvSources(mergedEnv map[string]any, topEnv map[string]any, importedSources map[string]string) map[string]string {

[/improve-codebase-architecture] buildMergedEnvSources and setMainWorkflowEnvSources are now independently testable pure functions but have no unit tests. The env source attribution logic (main workflow vs import path) is subtle and worth covering directly.

💡 Suggested tests

```go
func TestBuildMergedEnvSources(t *testing.T) {
    merged := map[string]any{"A": "1", "B": "2"}
    topEnv := map[string]any{"A": "1"} // A is from main
    importedSources := map[string]string{"B": "imports/shared.md"}

    got := buildMergedEnvSources(merged, topEnv, importedSources)
    assert.Equal(t, "(main workflow)", got["A"])
    assert.Equal(t, "imports/shared.md", got["B"])
}
```

Similarly for mergeRawOTLPEndpoints (dedup logic, precedence, mainCount vs importAdded counters).
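
The suggested test implies a particular attribution rule: keys present in the main workflow's env are labelled "(main workflow)", and the rest take their import path. A minimal sketch of a function satisfying that rule follows. `envSources` is a hypothetical stand-in and may differ from the real buildMergedEnvSources; in particular, letting the main workflow win when a key appears in both places is an assumption.

```go
package main

import "fmt"

const mainWorkflowSource = "(main workflow)"

// envSources attributes each merged env key to where it came from.
// Assumption: a key defined in the main workflow is attributed to the
// main workflow even if an import also defines it.
func envSources(merged, topEnv map[string]any, imported map[string]string) map[string]string {
	sources := make(map[string]string, len(merged))
	for key := range merged {
		if _, ok := topEnv[key]; ok {
			sources[key] = mainWorkflowSource
			continue
		}
		if src, ok := imported[key]; ok {
			sources[key] = src
		}
	}
	return sources
}

func main() {
	merged := map[string]any{"A": "1", "B": "2"}
	topEnv := map[string]any{"A": "1"}
	imported := map[string]string{"A": "imports/shared.md", "B": "imports/shared.md"}

	got := envSources(merged, topEnv, imported)
	fmt.Println(got["A"], got["B"]) // (main workflow) imports/shared.md
}
```

The overlapping key `A` in the usage example is the case most worth covering in a unit test, since it is where precedence actually matters.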

Contributor

@github-actions github-actions Bot left a comment

The refactor is behavior-preserving and the decomposition is clean — phase sequencing is correct, error-formatting semantics are intact, merge precedence is unchanged, and the validation order matches the original.

### Findings summary

Two non-blocking observations on newly introduced code:

  1. mergeRawOTLPEndpoints opaque return signature (line 267) — the ([]any, int, int) tuple gives no indication of which int is mainCount vs importAdded. Named returns or a small struct would make this self-documenting.

  2. workflowBuildContext undocumented phase ordering (line 15) — engineSetup, toolsResult, and workflowData are nil until setupWorkflowBuildContext completes. The struct doc should state this explicitly so test helpers and future orchestrators don't accidentally call validate/populate on an uninitialized context.

No correctness regressions found. All grumpy-coder findings (nil map write, silent JSON discard, observability overwrite, double-wrap) were present in the original code and are not introduced by this change.

🔎 Code quality review by PR Code Quality Reviewer · sonnet46 2.5M

envSources := make(map[string]string, len(topEnv))
for key := range topEnv {
envSources[key] = "(main workflow)"
func mergeRawOTLPEndpoints(mainObs map[string]any, importedObs map[string]any) ([]any, int, int) {

Opaque return signature makes call sites hard to reason about: the three returned values ([]any, int, int) have no indication of semantics at the definition or call site.

💡 Suggested fix

Use named returns to make the contract self-documenting:

```go
func mergeRawOTLPEndpoints(mainObs, importedObs map[string]any) (endpoints []any, mainCount, importAdded int) {
    seen := make(map[string]bool)
    for _, ep := range extractRawOTLPEndpointMaps(mainObs) {
        if url, _ := ep["url"].(string); url != "" && !seen[url] {
            seen[url] = true
            endpoints = append(endpoints, ep)
        }
    }
    mainCount = len(endpoints)
    for _, ep := range extractRawOTLPEndpointMaps(importedObs) {
        if url, _ := ep["url"].(string); url != "" && !seen[url] {
            seen[url] = true
            endpoints = append(endpoints, ep)
            importAdded++
        }
    }
    return
}
```

With unnamed returns, the caller mergedEndpoints, mainCount, importAdded := mergeRawOTLPEndpoints(...) works, but a future reader of the function signature alone cannot tell which int is which.


// workflowBuildContext captures the shared state across parse setup, validation,
// and workflow population phases.
type workflowBuildContext struct {

workflowBuildContext has undocumented phase-ordering invariants: engineSetup, toolsResult, and workflowData are nil until setupWorkflowBuildContext completes. Calling validateWorkflowBuildContext or populateWorkflowBuildContext on a freshly constructed context panics.

💡 Suggested fix

Document the phase contract directly on the struct, so a future caller (e.g. a test helper, a second orchestrator) doesn't accidentally skip setup:

```go
// workflowBuildContext captures shared state across the three build phases:
//   1. setupWorkflowBuildContext  — must run first; populates engineSetup, toolsResult, workflowData
//   2. validateWorkflowBuildContext — requires phase 1 to have succeeded
//   3. populateWorkflowBuildContext — requires phases 1 and 2 to have succeeded
//
// Fields engineSetup, toolsResult, and workflowData are nil until phase 1 completes;
// accessing them before that point will panic.
type workflowBuildContext struct {
```

Alternatively, a ready bool sentinel or a small validation helper (ctx.assertSetupDone()) can catch misuse at development time.

@pelikhan
Collaborator

@copilot review all comments and reviews from copilot and github-actions actor. Respond and resolve.

Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 31, 2026

@copilot review all comments and reviews from copilot and github-actions actor. Respond and resolve.

Addressed the actionable github-actions review items in 338cf28: documented the workflowBuildContext phase contract, renamed ctx.result to ctx.frontmatter, made mergeRawOTLPEndpoints self-documenting, added focused helper/ordering tests, and finalized the ADR file. I also checked the Design Decision Gate run; it completed successfully. The only remaining non-code follow-up is adding the ADR link in the PR body.

Copilot AI requested a review from pelikhan May 31, 2026 02:01
@github-actions
Contributor

@copilot review all comments and address unresolved review feedback.
Please add the ADR link to the PR body, then rerun checks.

Generated by 👨‍🍳 PR Sous Chef · gpt54 9M ·

@pelikhan
Collaborator

Copilot AI and others added 2 commits May 31, 2026 02:48
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 31, 2026

@copilot lint go, merge main and recompile

see https://github.com/github/gh-aw/actions/runs/26700559611/job/78692677070?pr=36012

Addressed in 4bff6f7. I fixed the remaining Go fmt/lint issues, reran make golint, make build, and make recompile, and verified the branch is already up to date with origin/main, so no additional merge commit was needed.

@github-actions
Contributor

@copilot review all comments and re-run checks after any final tweak.
If everything is already green, please summarize the remaining blockers.

Generated by 👨‍🍳 PR Sous Chef · gpt54 3.8M ·

@pelikhan pelikhan merged commit b893332 into main May 31, 2026
26 checks passed
@pelikhan pelikhan deleted the copilot/lint-monster-refactor-functions branch May 31, 2026 05:18
Development

Successfully merging this pull request may close these issues.

[lint-monster] refactor: reduce function complexity in compiler_*.go files (205 findings)

4 participants