fix(parser): preserve quote metadata for mixed quoted/unquoted words #2045

Closed
chaliy wants to merge 1 commit into main from 2026-06-12-propose-fix-for-quote-metadata-loss-bug

chaliy (Contributor) commented Jun 12, 2026

Superseded by #2054 — rebased cleanly on main.

Copilot AI review requested due to automatic review settings June 12, 2026 01:28
cloudflare-workers-and-pages Bot commented Jun 12, 2026

Deploying with Cloudflare Workers

The latest updates on your project.

Status: ✅ Deployment successful!
Name: bashkit
Latest commit: 8f91288
Updated (UTC): Jun 12 2026, 01:34 AM

Copilot AI left a comment

Pull request overview

This PR aims to preserve quote metadata when a double-quoted segment is immediately followed by adjacent unquoted content, preventing unintended expansions (notably heredoc-body expansion and quoted metacharacters) while still enabling glob expansion on truly unquoted glob characters.

Changes:

  • Updated read_double_quoted_string to avoid downgrading concatenated double-quoted segments to Token::Word, returning Token::QuotedWord or Token::QuotedGlobWord instead.
  • Removed the now-unused has_quoted_expansion tracking in the double-quote lexer path.
  • Added integration regression tests for partially quoted heredoc delimiters and mixed quoted/unquoted words containing quoted metacharacters.
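
The classification rule described in the changes above can be sketched as follows. This is a minimal illustration, not the code in lexer.rs: the `Token` enum is a simplified stand-in, and `classify_concatenated` is a hypothetical helper that takes the quoted segment and the unquoted continuation as separate strings. The only behavior taken from the PR is that a word starting with a double-quoted segment is never downgraded to `Token::Word`, and that glob detection looks only at the unquoted continuation.

```rust
// Hypothetical, simplified stand-in for the lexer's token type; the real
// definition in crates/bashkit/src/parser/lexer.rs may differ.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Token {
    Word(String),           // fully unquoted word (never returned below)
    QuotedWord(String),     // quoted content, no unquoted glob characters
    QuotedGlobWord(String), // quoted content plus unquoted glob characters
}

/// Classify a word made of a double-quoted segment followed by an adjacent
/// unquoted continuation. Only the continuation is scanned for glob
/// metacharacters, so a quoted `*` never triggers glob handling by itself.
fn classify_concatenated(quoted: &str, continuation: &str) -> Token {
    let mut content = String::from(quoted);
    content.push_str(continuation);

    let has_glob = continuation
        .chars()
        .any(|c| matches!(c, '*' | '?' | '['));

    if has_glob {
        Token::QuotedGlobWord(content)
    } else {
        // Previously this path could fall through to Token::Word and lose
        // the quote metadata.
        Token::QuotedWord(content)
    }
}

fn main() {
    // "*"x: the only glob character is quoted, so the word stays literal.
    println!("{:?}", classify_concatenated("*", "x"));
    // "*"*.txt: the unquoted `*` makes this a glob word.
    println!("{:?}", classify_concatenated("*", "*.txt"));
}
```

Note that in the second case the combined content still carries the quoted `*` unescaped, which is the gap the follow-up commit referenced later in this thread addresses.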

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File: Description
crates/bashkit/src/parser/lexer.rs: Adjusts token classification for concatenated double-quoted segments to preserve quote metadata.
crates/bashkit/tests/integration/blackbox_security_tests.rs: Adds regression tests for partially quoted heredoc delimiters and mixed quoted/unquoted word quote semantics.

Comment on lines +1336 to +1339
     if has_glob {
         return Some(Token::QuotedGlobWord(content));
     }
    -if has_quoted_expansion {
    -    return Some(Token::QuotedWord(content));
    -}
    -return Some(Token::Word(content));
    +return Some(Token::QuotedWord(content));
Comment on lines +1337 to +1347
    #[tokio::test]
    async fn quoted_glob_metacharacter_with_unquoted_suffix_stays_literal() {
        let mut bash = tight_bash();
        let result = bash.exec(r#"echo "*"x"#).await.unwrap();

        assert_eq!(
            result.stdout, "*x\n",
            "glob metacharacter from quoted segment was expanded"
        );
    }

Comment on lines 1331 to 1335
    let before_len = content.len();
    self.read_continuation_into(&mut content);
    let has_glob = content[before_len..]
        .chars()
        .any(|c| matches!(c, '*' | '?' | '['));
@chaliy chaliy closed this Jun 12, 2026
chaliy added a commit that referenced this pull request Jun 12, 2026
Closes #2045

Words mixing quoted and unquoted segments (e.g. "*"*.txt) lost quote metadata during lexing: quoted-segment metacharacters were passed raw to glob expansion, causing them to expand. Escapes glob metacharacters in quoted ranges before returning QuotedGlobWord, and tightens regression tests to create matching files so a broken lexer would fail deterministically.
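
A rough sketch of the escaping step this commit message describes, under stated assumptions: the commit does not say how the metacharacters are escaped, so the backslash convention, the decision to also escape an existing backslash, and the helper name `escape_quoted_globs` are all hypothetical and chosen only for illustration.

```rust
/// Hypothetical helper: backslash-escape glob metacharacters that came from
/// the quoted segment, and append the unquoted continuation unchanged so its
/// glob characters stay active.
fn escape_quoted_globs(quoted: &str, continuation: &str) -> String {
    let mut out = String::with_capacity(quoted.len() * 2 + continuation.len());
    for c in quoted.chars() {
        // Assumption: `*`, `?`, `[` mirror the characters the lexer checks;
        // `\` is escaped too so the result stays unambiguous.
        if matches!(c, '*' | '?' | '[' | '\\') {
            out.push('\\');
        }
        out.push(c);
    }
    out.push_str(continuation);
    out
}

fn main() {
    // "*"*.txt: the quoted star is escaped, the unquoted star still globs.
    let pattern = escape_quoted_globs("*", "*.txt");
    println!("{pattern}"); // prints \**.txt
}
```

With a pattern shaped like this, a glob matcher that honors backslash escapes would match only file names that begin with a literal `*` and end in `.txt`, which is why a regression test that creates matching files can fail deterministically if the escape is missing.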