Fix over-quoting of backslashes in table cells #820

Open
golikovichev wants to merge 1 commit into pytest-dev:master from golikovichev:fix/769-datatable-backslash-quoting
@golikovichev

Fixes #769

Problem

Backslashes in datatable and Examples table cells were doubled. A cell written as a single backslash reached the step as two:

Then expect backslash in datatable
  | \ |

from pytest_bdd import then

@then("expect backslash in datatable")
def _(datatable):
    assert datatable[0][0] == "\\"   # failed: the value was "\\\\" (two backslashes)

Cause

gherkin_parser.Cell.from_dict ran every cell value through a helper:

def _to_raw_string(normal_string: str) -> str:
    return normal_string.replace("\\", "\\\\")

The gherkin library already resolves cell escaping per the Gherkin spec before pytest-bdd sees the value (\\ -> \, \| -> |, \n -> newline, a lone \ stays \). The extra pass then doubled every backslash that survived that resolution. This looks like the leftover compatibility workaround mentioned in the issue thread.

Fix

Deliver the cell value exactly as gherkin resolves it and drop the now-unused helper (it had a single call site).

Backwards compatibility

This is a behaviour change, not a pure internal fix. Cells now carry the Gherkin-resolved value with no extra doubling. Anyone who relied on the old doubled output (for example asserting \\ for a single authored backslash, or writing C:\\\\Users to survive the old double pass) will need to drop that workaround. Values now match what the feature author wrote and what other Gherkin implementations produce.
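As a rough illustration of what changes on the step side, the sketch below compares the value a step would have received before and after this PR. `old_delivery` and `new_delivery` are hypothetical names used only for this comparison, and the input is assumed to be the value gherkin hands over for the cell.

```python
def old_delivery(resolved: str) -> str:
    # Behaviour before this PR: every backslash in the resolved value was doubled.
    return resolved.replace("\\", "\\\\")


def new_delivery(resolved: str) -> str:
    # Behaviour after this PR: the resolved value is passed through untouched.
    return resolved


# Assumed gherkin output for a cell authored as C:\Users\John.
resolved = r"C:\Users\John"

# A step assertion written against the old output...
assert old_delivery(resolved) == r"C:\\Users\\John"
# ...has to be updated to the value as authored.
assert new_delivery(resolved) == r"C:\Users\John"
```

Step code that stripped or compensated for the doubled backslashes can drop that handling once it runs against the new behaviour.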

Tests

  • New tests/datatable/test_datatable.py::test_datatable_preserves_backslashes covering a lone backslash, an escaped \\, an escaped pipe \|, and a Windows path C:\Users\John. Expected values were checked against the raw gherkin parser output.
  • Updated the two test_outline_with_escaped_pipes expectations that encoded the old doubled output (bork \\ now decodes to a single trailing backslash; bork \\\| now decodes to bork \|).
  • CHANGES.rst: entry under Unreleased / Fixed.

Full suite passes locally except for pre-existing Windows-only path-separator failures in test_generate / test_migrate / test_feature_base_dir that also fail on an unmodified checkout and are unrelated to this change.

The gherkin library already resolves cell escaping per the Gherkin
spec, so the extra backslash-doubling pass in Cell.from_dict
doubled every backslash: a cell holding a single backslash
reached the step as two.

Drop the redundant _to_raw_string pass so cell values are delivered
exactly as gherkin resolves them. Update the outline escaped-pipe
expectations that encoded the old doubled output and add a datatable
regression test.

Fixes pytest-dev#769

Development

Successfully merging this pull request may close these issues.

Too much quoting of backslashes in datatables
