
Commit 0d5d951

gh-150878: Speed up json.dumps(ensure_ascii=False) for long strings
escape_size() sizes the ensure_ascii=False encoder output one character at a time; a character needs escaping only when c == '"' || c == '\\' || c < 0x20, and non-ASCII is kept verbatim.

For the one-byte representation, detect the no-escape case eight bytes at a time and return the verbatim size directly; a length guard keeps short strings on the original per-character loop. Strings with characters above U+00FF keep the current path.

Output is byte-identical, verified against test_json and a 199-case dumps differential in both ensure_ascii modes. dumps of long 1-byte strings runs up to 5.8x faster (4.2x for Latin-1 text); short keys and non-Latin-1 strings are unaffected.
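To illustrate why the fast path can return the verbatim size directly, here is a minimal standalone sketch of per-character sizing for one-byte input. It is not the code in Modules/_json.c: the names `json_escaped_width` and `json_sized_per_char` are hypothetical, and the widths assume standard JSON escaping (two characters for `\"`, `\\`, `\b`, `\f`, `\n`, `\r`, `\t`, six for other control characters as `\u00XX`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of output characters one Latin-1 byte
   occupies when non-ASCII is kept verbatim (assumed JSON escaping rules). */
size_t
json_escaped_width(unsigned char c)
{
    switch (c) {
    case '"': case '\\': case '\b': case '\f':
    case '\n': case '\r': case '\t':
        return 2;                 /* short escape such as \" or \n */
    default:
        return c < 0x20 ? 6 : 1;  /* \u00XX, or the byte kept as-is */
    }
}

/* Per-character sizing; the initial 2 accounts for the surrounding quotes. */
size_t
json_sized_per_char(const unsigned char *s, size_t n)
{
    size_t total = 2;
    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        total += json_escaped_width(s[i]);
    }
    return total;
}
```

When no byte needs escaping, every width is 1 and the loop can only produce `n + 2`, which is the value the fast path returns without visiting each character.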
1 parent 7a468a1 commit 0d5d951

2 files changed

Lines changed: 34 additions & 0 deletions


Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+Speed up :func:`json.dumps` with ``ensure_ascii=False`` for strings made up of
+long runs of characters that need no escaping, by scanning eight bytes at a
+time. Short strings, strings that need escaping, and strings with characters
+above U+00FF are unaffected. Patch by Bernát Gábor.

Modules/_json.c

Lines changed: 30 additions & 0 deletions
@@ -281,6 +281,36 @@ escape_size(const void *input, int kind, Py_ssize_t input_chars)
     Py_ssize_t i;
     Py_ssize_t output_size;
 
+    /* SWAR no-escape fast path (1-byte): needs-escape is c == '"' || c == '\\'
+       || c < 0x20; non-ASCII (Latin-1 >= 0x80) is kept verbatim here. A length
+       guard keeps short strings on the original per-character loop. */
+    if (kind == PyUnicode_1BYTE_KIND && input_chars >= 16
+        && input_chars < PY_SSIZE_T_MAX - 2) {
+        const Py_UCS1 *p = (const Py_UCS1 *)input;
+        const uint64_t ones = 0x0101010101010101ULL;
+        const uint64_t high = 0x8080808080808080ULL;
+        const uint64_t bq = 0x22ULL * ones, bs = 0x5cULL * ones, bc = 0xE0ULL * ones;
+        Py_ssize_t j = 0;
+        int needs_escape = 0;
+        for (; j + 8 <= input_chars; j += 8) {
+            uint64_t w;
+            memcpy(&w, p + j, 8);
+            uint64_t mq = w ^ bq; mq = (mq - ones) & ~mq & high;
+            uint64_t ms = w ^ bs; ms = (ms - ones) & ~ms & high;
+            uint64_t vc = w & bc; uint64_t mlo = (vc - ones) & ~vc & high;
+            if (mq | ms | mlo) { needs_escape = 1; break; }
+        }
+        if (!needs_escape) {
+            for (; j < input_chars; j++) {
+                Py_UCS1 c = p[j];
+                if (c == '"' || c == '\\' || c < 0x20) { needs_escape = 1; break; }
+            }
+        }
+        if (!needs_escape) {
+            return input_chars + 2;
+        }
+    }
+
     /* Compute the output size */
     for (i = 0, output_size = 2; i < input_chars; i++) {
         Py_UCS4 c = PyUnicode_READ(kind, input, i);
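
The word-level test in the hunk above can be checked in isolation. The following is a sketch rather than part of the patch: it factors the three masks into a hypothetical `word_needs_escape()` built on the usual zero-byte test `(v - ones) & ~v & high`, and compares it against the scalar predicate for every byte value in every lane. The helper names and the `'a'` filler byte are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SWAR_ONES 0x0101010101010101ULL
#define SWAR_HIGH 0x8080808080808080ULL

/* Nonzero iff at least one of the eight bytes of v is zero. */
uint64_t
has_zero_byte(uint64_t v)
{
    return (v - SWAR_ONES) & ~v & SWAR_HIGH;
}

/* Nonzero iff any byte of w is '"', '\\', or below 0x20. */
uint64_t
word_needs_escape(uint64_t w)
{
    return has_zero_byte(w ^ (0x22ULL * SWAR_ONES))   /* byte == '"'  */
         | has_zero_byte(w ^ (0x5cULL * SWAR_ONES))   /* byte == '\\' */
         | has_zero_byte(w & (0xE0ULL * SWAR_ONES));  /* top three bits clear */
}

/* Scalar reference, same condition as the per-character tail loop. */
int
byte_needs_escape(unsigned char c)
{
    return c == '"' || c == '\\' || c < 0x20;
}

/* Place every byte value in every lane of an otherwise clean word and
   count disagreements between the word test and the scalar test. */
int
count_mismatches(void)
{
    int mismatches = 0;
    assert(!byte_needs_escape('a'));  /* the filler must be a clean byte */
    for (int lane = 0; lane < 8; lane++) {
        for (int value = 0; value < 256; value++) {
            unsigned char buf[8];
            uint64_t w;
            memset(buf, 'a', sizeof buf);
            buf[lane] = (unsigned char)value;
            memcpy(&w, buf, sizeof w);
            if ((word_needs_escape(w) != 0)
                != byte_needs_escape((unsigned char)value)) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

The zero-byte test is exact as a yes/no answer for the whole word, although the individual flag bits above a real zero byte may be spurious. That is enough here because the fast path uses the result only to decide between returning early and falling through to the per-character loop.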
