codemod improvements #2156
@claude review

// `sourceText` and `lines` are computed once from the pre-insertion text.
// Insertions below mutate sf, but we process in descending line order, so
// each insertText only shifts positions above the next insertion point —
// prior byte offsets stay valid.
const sourceText = sf.getFullText().replaceAll('\r\n', '\n');
const lines = sourceText.split('\n');

for (const diag of merged) {
    const lineIndex = diag.line - 1;
    if (lineIndex < 0 || lineIndex >= lines.length) continue;

    if (lineIndex > 0 && lines[lineIndex - 1]!.includes(CODEMOD_ERROR_PREFIX)) continue;

    const indent = lines[lineIndex]!.match(/^(\s*)/)?.[1] ?? '';
    const safeMessage = diag.message.replaceAll('*/', '* /');
    const comment = `${indent}/* ${CODEMOD_ERROR_PREFIX} ${safeMessage} */`;

    const lineStart = lines.slice(0, lineIndex).reduce((sum, l) => sum + l.length + 1, 0);
    sf.insertText(lineStart, comment + '\n');

🔴 insertDiagnosticComments() computes insertion offsets from a CRLF-normalized copy of the source (replaceAll('\r\n', '\n') then summing line.length + 1), but sf.insertText() operates on the un-normalized text — so on CRLF files the offset is short by one character per preceding line and the /* @mcp-codemod-error */ comment lands in the middle of a prior line, syntactically corrupting the user's file. Relatedly, there is no check that the target line start is a safe boundary, so when an actionRequired node starts on an interior line of a multi-line template literal (e.g. a Schema.parse() reference inside an interpolation), the comment text is injected into the runtime string value. Fix by computing offsets from the un-normalized text (e.g. sf.compilerNode.getPositionOfLineAndCharacter(lineIndex, 0)) and skipping insertion when the position falls inside a template literal/JSX text.
Extended reasoning...
Issue 1 — CRLF offset drift corrupts files
insertDiagnosticComments() builds its line table from a normalized copy:
const sourceText = sf.getFullText().replaceAll('\r\n', '\n');
const lines = sourceText.split('\n');
...
const lineStart = lines.slice(0, lineIndex).reduce((sum, l) => sum + l.length + 1, 0);
sf.insertText(lineStart, comment + '\n');

The offset is computed as if every line ends with a single \n (length + 1), but sf.insertText() is a purely positional insertion into the un-normalized buffer, where each CRLF line contributes length + 2. ts-morph's getFullText() preserves \r\n (verified empirically with files loaded via addSourceFilesAtPaths, the same path run() uses), so for a diagnostic on line N of a CRLF file the computed offset is short by exactly N−1 characters.
Step-by-step proof (CRLF): take a 4-line CRLF file where line 3 is const c = 3; and the diagnostic is on line 4. The normalized line table yields lineStart = 39, but the real start of line 4 in the CRLF buffer is offset 42. sf.insertText(39, ...) therefore lands 3 characters before the line start, inside line 3, producing output like const c = 3/* @mcp-codemod-error test */ followed by an orphaned ; on the next line. In a realistic file with hundreds of lines the drift grows to hundreds of characters, so the comment can split identifiers, statements, or string literals. Since insertDiagnosticComments runs immediately before project.saveSync(), the corrupted text is written back to the user's source files. actionRequired diagnostics fire on files that receive no other rewrites (unknown import paths, unmigratable .tool() calls, XSchema.parse() usage), so the file is still fully CRLF when the insertion happens. The trigger is exactly the common Windows / core.autocrlf repo this codemod targets. The new commentInsertion.test.ts only joins inputs with '\n', so this is never exercised.
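The drift can be reproduced without ts-morph. The following is a minimal sketch, not the runner's code: the helper names and the four-line sample file are assumptions chosen so that the offsets match the 39 and 42 quoted above, and a plain string splice stands in for sf.insertText().

```typescript
// Hypothetical helper mirroring the pre-fix computation: line lengths are
// taken from a CRLF-normalized copy, so each line break counts as 1 char.
function normalizedLineStart(text: string, lineIndex: number): number {
    const lines = text.replace(/\r\n/g, '\n').split('\n');
    return lines.slice(0, lineIndex).reduce((sum, l) => sum + l.length + 1, 0);
}

// Real offset of a line start in the un-normalized buffer, found by
// walking the actual '\n' characters.
function actualLineStart(text: string, lineIndex: number): number {
    let offset = 0;
    for (let i = 0; i < lineIndex; i++) {
        const next = text.indexOf('\n', offset);
        if (next === -1) return text.length; // fewer lines than requested
        offset = next + 1;
    }
    return offset;
}

// Assumed sample: four 12-character statements joined with CRLF.
const crlfFile = ['const a = 1;', 'const b = 2;', 'const c = 3;', 'const d = 4;'].join('\r\n');

const computed = normalizedLineStart(crlfFile, 3); // 39: three lines of 12 + 1
const actual = actualLineStart(crlfFile, 3);       // 42: three lines of 12 + 2

// Splicing at the short offset splits line 3 just before its semicolon.
const spliced = crlfFile.slice(0, computed) + '/* note */\n' + crlfFile.slice(computed);
console.log(JSON.stringify(spliced));
```

The three-character gap between the two offsets is one missing \r per preceding line, which is why the drift scales with the line number.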
Issue 2 — no safe-boundary check at the insertion point
Even on pure-LF files, the insertion is a raw text splice at the start of the diagnostic node's start line with no awareness of what surrounds that position. Several actionRequired diagnostics attach to arbitrary expression nodes — specSchemaAccess emits them on XSchema.parse(v) / un-captured safeParse(v) references and on result.error.<subProp> accesses, found by walking every identifier in the file. If such an expression sits on an interior line of a multi-line template literal (e.g. a multi-line `Invalid request: ${CallToolRequestSchema.parse(data)}` error template), the node's start line begins inside the literal text, and the inserted /* @mcp-codemod-error ... */ line becomes part of the runtime string value. Verified empirically: the template head ends up containing the comment text, and because several diagnostic messages themselves contain backticks, the insertion can even terminate the template and produce parse errors. Either way a feature meant to be purely annotative silently changes program behavior (or breaks the build).
Why nothing catches this: the */ sanitization and the duplicate-comment check don't address position correctness, the descending-line processing only protects against self-invalidation between insertions, and all tests use LF-only inputs with statement-level diagnostics.
Suggested fix: compute the insertion offset from the real source text rather than a normalized copy — ts-morph already provides this via sf.compilerNode.getPositionOfLineAndCharacter(lineIndex, 0), or split on /\r?\n/ while accounting for the actual delimiter length per line. Then add a guard before inserting: check the token/node kind at the insertion position (or use the diagnostic node's ancestors) and skip the insertion — or fall back to the start line of the enclosing statement — when the line start falls inside a template literal, string literal, or JSX text region.
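One possible shape for that guard, as a dependency-free sketch rather than the codemod's implementation: UnsafeSpan, lineStarts and safeInsertionOffset are hypothetical names, and the sketch assumes the caller has already collected the spans of template literals, string literals and JSX text from the AST. It also treats only '\n' as a line terminator, which is a simplification.

```typescript
// Half-open [start, end) range in which a comment must not be spliced,
// e.g. a template literal, string literal, or JSX text node.
interface UnsafeSpan {
    start: number;
    end: number;
}

// Offsets of every line start in the raw (un-normalized) text.
// Any '\r' before a '\n' is counted as part of the preceding line.
function lineStarts(text: string): number[] {
    const starts = [0];
    for (let i = 0; i < text.length; i++) {
        if (text[i] === '\n') starts.push(i + 1);
    }
    return starts;
}

// Returns the offset to insert at, or undefined when the line start is out
// of range or falls strictly inside one of the unsafe spans.
function safeInsertionOffset(text: string, lineIndex: number, unsafe: UnsafeSpan[]): number | undefined {
    const starts = lineStarts(text);
    if (lineIndex < 0 || lineIndex >= starts.length) return undefined;
    const offset = starts[lineIndex]!;
    const inside = unsafe.some(span => offset > span.start && offset < span.end);
    return inside ? undefined : offset;
}

// Assumed sample: a template literal whose interpolation starts on line 2.
const src = 'const msg = `Invalid:\n${Schema.parse(data)}`;\nconst ok = 1;\n';
const template: UnsafeSpan = { start: src.indexOf('`'), end: src.lastIndexOf('`') + 1 };

const insideTemplate = safeInsertionOffset(src, 1, [template]); // skipped
const afterTemplate = safeInsertionOffset(src, 2, [template]);  // start of "const ok"
console.log(insideTemplate, afterTemplate);
```

The comparison is strict on both ends so that a literal beginning exactly at a line start still accepts a comment on the line above it. A caller that prefers the fallback mentioned above could retry with the start line of the enclosing statement instead of skipping.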
The CRLF half of this is fixed: insertDiagnosticComments() now computes offsets from the un-normalized text (splitting on '\n' so each line's \r is counted in length + 1) and inserts using the file's own line ending, and the new CRLF test in commentInsertion.test.ts covers it.
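The approach described here can be sketched on plain strings. This is an illustration under stated assumptions, not the code in the commit: PendingComment, detectEol and insertComments are hypothetical names, string splicing stands in for sf.insertText(), and a file with mixed line endings is simply treated as CRLF.

```typescript
// Hypothetical shape for a comment awaiting insertion (1-based line number).
interface PendingComment {
    line: number;
    text: string;
}

// Assumption: any "\r\n" in the file means the file uses CRLF throughout.
function detectEol(text: string): '\r\n' | '\n' {
    return text.includes('\r\n') ? '\r\n' : '\n';
}

function insertComments(text: string, comments: PendingComment[]): string {
    const eol = detectEol(text);
    // Raw split: on CRLF files each element keeps its trailing '\r', so
    // `length + 1` covers both characters of the line break.
    const lines = text.split('\n');
    // Descending line order keeps the offsets of earlier lines valid.
    const ordered = [...comments].sort((a, b) => b.line - a.line);
    let result = text;
    for (const c of ordered) {
        const lineIndex = c.line - 1;
        if (lineIndex < 0 || lineIndex >= lines.length) continue;
        const lineStart = lines.slice(0, lineIndex).reduce((sum, l) => sum + l.length + 1, 0);
        // Match only spaces and tabs so a bare '\r' is never copied as indent.
        const indent = lines[lineIndex]!.match(/^[ \t]*/)![0];
        result = result.slice(0, lineStart) + indent + c.text + eol + result.slice(lineStart);
    }
    return result;
}

const crlfInput = 'a;\r\n  b;\r\nc;';
const crlfOutput = insertComments(crlfInput, [
    { line: 2, text: '/* x */' },
    { line: 3, text: '/* y */' },
]);
console.log(JSON.stringify(crlfOutput));
```

Because the line table and the buffer now agree on every line break, the same reduce expression used before the fix yields correct offsets for both LF and CRLF files.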
The second part is still open: there is no safe-boundary check at the insertion point. When an actionRequired node starts on an interior line of a multi-line template literal (e.g. a Schema.parse() reference inside an interpolation, which specSchemaAccess flags), the computed line start falls inside the literal text, so the /* @mcp-codemod-error ... */ line is spliced into the runtime string value — and because several diagnostic messages contain backticks, it can even terminate the template and produce a parse error. A guard before sf.insertText() — e.g. check the token/node at the insertion offset (or the diagnostic node's ancestors) and skip the insertion, or fall back to the start line of the enclosing statement, when the position falls inside a template literal / string literal / JSX text — would close this.
Thanks for addressing the previous round — the CRLF offset handling, the hasMcpImports gate in handlerRegistration, the grep quoting, and the same-line merge test all look fixed in the latest commit, and I found no new bugs in this revision. Since the PR adds a new user-file-mutating behavior (writing @mcp-codemod-error comments into source) plus import-map changes, a human maintainer should still take a pass; one small residual from my earlier comment is that comment insertion still has no guard against landing inside a multi-line template literal, and the changeset bot is flagging a missing changeset.
Extended reasoning...
Overview
This PR ("codemod improvements") touches 17 files in packages/codemod. The main feature is a new actionRequired diagnostic kind that causes the runner to physically insert /* @mcp-codemod-error ... */ comments into user source files at the diagnostic location, plus a CLI hint pointing users at a grep command. Supporting changes include a shared isKeyPositionIdentifier helper extracted into astUtils, an import-map simplification for client/stdio.js, expanded mock-method coverage (unmock, requireActual, etc.), removal of the incorrect registerResource schema-wrap path, namespace-import guards in symbolRenames, shorthand-property expansion in contextTypes, and a new commentInsertion.test.ts suite.
Status relative to prior review
My previous review (2026-05-27 05:57 UTC) raised four issues. The latest commit (3e50511, pushed after that review) addresses them: the runner now computes offsets from the un-normalized text and preserves the file's line ending (with a CRLF test added), handlerRegistrationTransform now gates on hasMcpImports (with a test), the grep hint quotes the path, and the same-line merge test now drives run() end-to-end and asserts the merged comment. The only residual from that round is the lack of a safe-boundary check when the insertion line falls inside a multi-line template literal or JSX text — a narrow edge case, noted but not blocking on its own.
Security risks
None of significance. The codemod is a developer-run CLI tool operating on the user's own source tree; it does not touch the SDK runtime, transports, or auth. The grep hint is printed, not executed, and is now quoted. The main risk class is unwanted mutation of user files, which the new hasMcpImports gate and dry-run behavior mitigate.
Level of scrutiny
Moderate. This is tooling code rather than protocol/transport code, so the blast radius is limited to the migration experience, but the comment-insertion feature is a deliberate design choice (writing annotations into user code) that a maintainer should consciously sign off on, and the PR bundles several semi-independent transform fixes. The PR description is empty and there is no changeset, which a maintainer may want filled in before merge.
Other factors
Test coverage is reasonable: the new comment-insertion suite covers insertion, dry-run, idempotency, indentation, line-shift placement, same-line merging, and CRLF, and the transform test updates match the behavior changes. The bug hunting system found no bugs in this revision. Given the feature scope and the missing changeset/description, I am deferring rather than shadow-approving.