Skip to content

Add /ship skill — complete shipping workflow#46

Merged
TechNickAI merged 2 commits intomainfrom
feature/ship-skill
Apr 1, 2026
Merged

Add /ship skill — complete shipping workflow#46
TechNickAI merged 2 commits intomainfrom
feature/ship-skill

Conversation

@TechNickAI
Copy link
Copy Markdown
Owner

Summary

  • New /ship skill that orchestrates the full shipping pipeline in one command
  • Branch check → merge main → tests → multi-review (with fix-first) → commit → push → PR
  • Non-interactive by design — only stops for genuine blockers

Changes

  • plugins/core/skills/ship/SKILL.md — new skill (180 lines)
    • Detects test runner, base branch, commit style automatically
    • Optional version bump + changelog (when files exist)
    • Flags: --skip-tests, --skip-review, --quick
    • Replaces simpler /commit-push-pr from external plugin

Test plan

  • Run /ship on a feature branch with changes
  • Verify test detection works (npm test, pytest, etc.)
  • Verify multi-review is invoked and fix-first applies
  • Verify PR is created with comprehensive body
  • Test --quick flag skips tests and review

🤖 Generated with Claude Code

Single command that orchestrates: branch check → merge main → run tests →
multi-review with fix-first → commit → push → create PR. Non-interactive
by design — only stops for genuine blockers (merge conflicts, test failures,
ASK items from review).

Replaces the simpler /commit-push-pr with a full pipeline. Optional version
bump + changelog when those files exist in the project.

Inspired by gstack's ship workflow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude bot commented Apr 1, 2026

Code Review

Overall this is a solid, well-structured skill. The non-interactive philosophy is clearly articulated and the Gates section is excellent. A few things worth addressing:


Issues

1. Missing category field in frontmatter

Per the skills CLAUDE.md, category is an expected frontmatter field. Other skills define it (e.g., systematic-debugging has category: debugging). This skill should have something like category: meta or category: workflow.

2. Over-prescription (violates skill-creator conventions)

The skill-creator SKILL.md explicitly flags this as a common mistake:

"Over-prescription: Writing detailed step-by-step procedures for things the LLM can figure out. Describe the goal, not the algorithm."

The current skill is structured as 8 numbered sequential steps — exactly the pattern to avoid. The token budget guidance also says skills should be under 500 words; this is substantially over. Consider restructuring to describe goals and constraints per step rather than procedures.

3. Inaccurate multi-review terminology

Step 3 says multi-review will "AUTO-FIX obvious issues" and "Present ASK items" — but multi-review's actual output categories are Fixed / Wontfix / Deferred. The ASK/AUTO-FIX vocabulary doesn't match the actual command behavior, which could cause the LLM to wait for signals that never come.

4. Broad trigger may cause false positives

"create a PR" is very broad. A user saying "create a PR for this small config change" probably doesn't want the full ship workflow (merge main, run tests, multi-review). Consider narrowing to something like "ship and create PR" or removing this trigger and letting the description handle semantic matching.

5. No handling for existing PR

gh pr create fails if a PR already exists for the branch. The workflow should check with gh pr view first and handle the "already exists" case — either skip creation or use gh pr edit to update the body.

6. Hardcoded model in Co-Authored-By

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This hardcodes Opus 4.6 regardless of the actual running model. Either make this generic (Claude <noreply@anthropic.com>) or instruct the LLM to use the current model name from context.


Minor

  • git remote show origin for base branch detection is slow (network call). git symbolic-ref refs/remotes/origin/HEAD or git rev-parse --abbrev-ref origin/HEAD are faster local alternatives.
  • The version bump heuristic (detect change magnitude from diff) is inherently fuzzy. Worth noting this is a best-effort guess, not authoritative — users should review the bump before commit.

What's working well

  • The Gates section is excellent — clear, unambiguous stop conditions
  • Pre-flight branch check is the right call (refusing to create branches is good scope discipline)
  • Pre-existing vs. new test failure distinction is thoughtful
  • The PR body template is comprehensive and well-structured
  • --skip-tests / --skip-review / --quick flags provide good escape hatches

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 23226fc11b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +5 to +6
version: 1.0.0
triggers:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Add required skill category metadata

The skill frontmatter is missing a category field, but this repository’s validator requires every skill to include it (scripts/validate-frontmatter.pyvalidate_skill checks name, description, triggers, and category). Leaving it out causes frontmatter validation/pre-commit to fail and makes this skill non-compliant with the project schema.

Useful? React with 👍 / 👎.

Comment on lines +42 to +45
```
git fetch origin <base>
git merge origin/<base>
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Stash or commit dirty changes before merging base

This step merges origin/<base> directly even though the workflow explicitly allows unstaged/staged changes to exist until later, which can cause Git to abort with local-overwrite errors and halt /ship before tests/review/commit. I verified CLI semantics via git merge -h, which shows --autostash is optional (not default), so dirty working trees are not automatically protected during merge.

Useful? React with 👍 / 👎.

- Add category: workflow to frontmatter
- Remove overly broad "create a PR" trigger
- Fix multi-review terminology to use Fixed/Wontfix/Deferred
- Add skip instruction for --skip-tests/--quick in Step 2
- Add existing PR check before gh pr create, add --base flag
- Use generic Co-Authored-By: Claude <noreply@anthropic.com>
- Use fast local git rev-parse for base branch detection
- Stash uncommitted changes before merge if needed

Co-Authored-By: Claude <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude bot commented Apr 1, 2026

Code Review

This is a well-conceived skill with a clear philosophy. A few observations, including some corrections to the prior review:


Corrections to prior review

The previous review flags several issues that are not actually present in the diff:

  • Category field is presentcategory: workflow is in the frontmatter (line 7)
  • No "create a PR" trigger — the triggers list doesn't include this phrase; the prior concern doesn't apply
  • Existing PR check is already handled — Step 7 explicitly does gh pr view --json url before creating
  • Co-Authored-By is already generic — the skill says Claude <noreply@anthropic.com>, not a hardcoded model name

Real issues

1. Format divergence from the skills corpus

Every other skill in this repo uses XML-style semantic tags (<objective>, <when-to-use>, <start-with-evidence>) rather than ## markdown headers. This isn't enforced anywhere, but diverging from the established pattern makes this skill harder to maintain alongside the rest. Consider refactoring to match the XML structure used in systematic-debugging and others.

2. Over-prescription (valid point from prior review)

The skill-creator SKILL.md is explicit: "describe the goal, not the algorithm." This skill is structured as 8 numbered sequential steps — exactly the pattern the project flags as a common mistake. The LLM doesn't need to be told "Step 4: Version bump → Step 5: Commit → Step 6: Push" in order; it needs to know the goal, constraints, and gates.

3. Token budget

The 500-word guideline (token-efficiency in skill-creator) exists for a reason — skills load on every relevant invocation. This skill is roughly 700 words. The step-by-step structure inflates the word count; converting to goal-oriented prose would naturally shrink this.

4. Pre-existing test failure detection is underspecified

Pre-existing failures (also fail on base branch): Note them, continue with a warning.

How should the LLM actually verify a failure is pre-existing? Running tests on the base branch doubles the test time and adds complexity. This is a genuinely hard edge case — worth either specifying the mechanism (e.g., git stash && git checkout <base> && <test-cmd>) or simplifying the rule to "any test failure is a STOP unless the user explicitly says the failure is pre-existing."

5. Stash/unstash path is underspecified

Stash them before merging if git would refuse the merge, then unstash after.

What if git stash pop gets conflicts after the merge? This can leave the working tree in a messy state. The skill should either handle this failure case explicitly or omit the stash suggestion and instead tell the user to commit/stash manually before shipping.


Minor

  • git remote show origin is a network call; git rev-parse --abbrev-ref origin/HEAD (mentioned in the skill itself for later use) is the faster local alternative and should be recommended for the initial detection too
  • The version bump heuristic (detect from diff) is inherently fuzzy — a docs-heavy PR could trigger a minor bump. Worth noting it's a suggestion requiring user confirmation, not an authoritative decision

What's working well

  • The Gates section is clear and unambiguous — exactly what a non-interactive skill needs
  • The pre-flight branch check (refuse to create branches) is correct scope discipline
  • The PR body template with "Review findings" section is a nice touch that adds transparency
  • --skip-tests / --skip-review / --quick escape hatches are well thought out
  • The non-interactive philosophy is clearly and correctly articulated

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

first. Do not create branches for them — that's a decision about naming and scope.

2. **Detect base branch**: Check `git remote show origin` or common names (main, master,
develop). This is the merge target.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contradictory base branch detection methods cause ambiguity

Medium Severity

The Pre-flight step 2 instructs the agent to detect the base branch using git remote show origin (a network call), while Step 1 instructs it to use git rev-parse --abbrev-ref origin/HEAD (a local operation). These two commands can return different results — e.g., if origin/HEAD isn't set or points somewhere different than what git remote show origin reports. Since this skill targets an AI agent that follows instructions literally, the contradictory guidance could lead to detecting different base branches at each step, potentially merging from or targeting the wrong branch for the PR.

Additional Locations (1)
Fix in Cursor Fix in Web

## Pre-flight

1. **Branch check**: If on main/master, STOP. Tell the user to create a feature branch
first. Do not create branches for them — that's a decision about naming and scope.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Branch guard only checks main/master, not detected base branch

Medium Severity

The Pre-flight branch check (step 1) only stops the workflow if on main/master, but step 2 can detect develop (or other branches) as the base branch. If a user is on develop and develop is the detected base, the guard doesn't fire, and the skill proceeds through the entire pipeline — merging develop into itself, running tests, review, committing, pushing — only to fail confusingly at PR creation (creating a PR from develop to develop). The branch check needs to guard against being on any detected base branch, not just main/master.

Additional Locations (1)
Fix in Cursor Fix in Web

@TechNickAI TechNickAI merged commit 6a56a3f into main Apr 1, 2026
5 checks passed
@TechNickAI TechNickAI deleted the feature/ship-skill branch April 1, 2026 14:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant