Commit c6f0769

fix(webapp): bound logs search memory and fix pagination at scale (#4012)
## Summary

The logs search page (behind a feature flag) ran ClickHouse out of memory when browsing back over long time ranges. This keeps it within bounded memory and fixes a pagination bug that could skip or duplicate rows at a page boundary.

## Fix

Memory: the list query reads in sort-key order, which opens one read stream per part in the window, and on object storage those per-part read buffers dominate peak memory, so it scaled with the number of parts scanned. Two changes bound it:

- The logs ClickHouse client caps the per-part read buffers via new env-tunable settings. The object-storage-only setting is opt-in, so it is never sent to a ClickHouse version that lacks it.
- Recent-first window narrowing: rows come back newest first, so the presenter probes the most recent window and only widens toward the full requested range when a page is short. A busy environment fills a page from a few recent parts instead of scanning the whole range; a quiet one still returns every row in a couple of cheap reads.

Correctness: the keyset cursor ordered on (triggered_timestamp, trace_id), which is not unique because the spans of a trace share both, so rows at a tie could be skipped or duplicated across pages. The cursor and ORDER BY now include span_id, and the cursor is versioned so stale cursors reset to the first page.

Guards: the effective page size is capped, and the existing per-query memory limit lets a pathological wide browse fail with an error instead of taking the node down.

## ClickHouse 26.2

The memory fix relies on lazy materialization deferring the wide attributes column to the output rows, which only holds on 26.x. Cloud already runs 26.2, so this moves the dev stack, testcontainers, and CI to match. The ClickHouse test suite passes on 26.2.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
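The correctness fix described in the commit message can be illustrated with a small in-memory sketch. This is not the webapp's presenter code: the `LogRow` and `Cursor` shapes, the `CURSOR_VERSION` value, the JSON encoding, and the `fetchPage` helper are all assumptions made for illustration. It shows only the idea that a keyset cursor over the full `(triggered_timestamp, trace_id, span_id)` tuple gives a total order, and that a cursor with an unexpected version falls back to the first page.

```typescript
// Sketch only: every name and shape here is hypothetical, not the webapp's code.
type LogRow = { triggeredTimestamp: number; traceId: string; spanId: string };
type Cursor = LogRow & { v: number };

const CURSOR_VERSION = 2; // assumed value; bumped whenever the cursor layout changes

// Total order over (triggered_timestamp, trace_id, span_id).
// Without span_id, spans of one trace would compare equal and could be skipped or repeated.
function compareKeys(a: LogRow, b: LogRow): number {
  if (a.triggeredTimestamp !== b.triggeredTimestamp) {
    return a.triggeredTimestamp < b.triggeredTimestamp ? -1 : 1;
  }
  if (a.traceId !== b.traceId) return a.traceId < b.traceId ? -1 : 1;
  if (a.spanId !== b.spanId) return a.spanId < b.spanId ? -1 : 1;
  return 0;
}

function encodeCursor(row: LogRow): string {
  const cursor: Cursor = {
    v: CURSOR_VERSION,
    triggeredTimestamp: row.triggeredTimestamp,
    traceId: row.traceId,
    spanId: row.spanId,
  };
  return JSON.stringify(cursor);
}

// Returns null for missing, malformed, or stale cursors, which means "start from page one".
function decodeCursor(raw: string | undefined): Cursor | null {
  if (!raw) return null;
  try {
    const c = JSON.parse(raw) as Partial<Cursor>;
    const ok =
      c.v === CURSOR_VERSION &&
      typeof c.triggeredTimestamp === "number" &&
      typeof c.traceId === "string" &&
      typeof c.spanId === "string";
    return ok ? (c as Cursor) : null;
  } catch {
    return null;
  }
}

// In-memory stand-in for the ClickHouse query: newest first, strictly after the cursor.
function fetchPage(all: LogRow[], rawCursor: string | undefined, pageSize: number) {
  const cursor = decodeCursor(rawCursor);
  const sorted = [...all].sort((a, b) => compareKeys(b, a));
  const remaining = cursor ? sorted.filter((r) => compareKeys(r, cursor) < 0) : sorted;
  const rows = remaining.slice(0, pageSize);
  const nextCursor =
    remaining.length > pageSize ? encodeCursor(rows[rows.length - 1]) : undefined;
  return { rows, nextCursor };
}
```

In a real query the `compareKeys(r, cursor) < 0` filter would correspond to a tuple comparison in the WHERE clause matching the ORDER BY columns; the sketch keeps it in memory so the tie-breaking behaviour is easy to check.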
1 parent 65c545d commit c6f0769

9 files changed

Lines changed: 220 additions & 97 deletions

.github/workflows/unit-tests-internal.yml

Lines changed: 1 addition & 1 deletion
@@ -95,7 +95,7 @@ jobs:
 }
 echo "Pre-pulling Docker images with authenticated session..."
 pull postgres:14
-pull clickhouse/clickhouse-server:25.4-alpine
+pull clickhouse/clickhouse-server:26.2.19.43-alpine@sha256:c6ad6a7eb2fb5999df3adfb8b69a0c7222c68fa9b8f6b04a088564ebbc959251
 pull redis:7.2
 pull testcontainers/ryuk:0.14.0
 pull electricsql/electric:1.2.4

.github/workflows/unit-tests-packages.yml

Lines changed: 1 addition & 1 deletion
@@ -95,7 +95,7 @@ jobs:
 }
 echo "Pre-pulling Docker images with authenticated session..."
 pull postgres:14
-pull clickhouse/clickhouse-server:25.4-alpine
+pull clickhouse/clickhouse-server:26.2.19.43-alpine@sha256:c6ad6a7eb2fb5999df3adfb8b69a0c7222c68fa9b8f6b04a088564ebbc959251
 pull redis:7.2
 pull testcontainers/ryuk:0.14.0
 pull electricsql/electric:1.2.4

.github/workflows/unit-tests-webapp.yml

Lines changed: 1 addition & 1 deletion
@@ -95,7 +95,7 @@ jobs:
 }
 echo "Pre-pulling Docker images with authenticated session..."
 pull postgres:14
-pull clickhouse/clickhouse-server:25.4-alpine
+pull clickhouse/clickhouse-server:26.2.19.43-alpine@sha256:c6ad6a7eb2fb5999df3adfb8b69a0c7222c68fa9b8f6b04a088564ebbc959251
 pull redis:7.2
 pull testcontainers/ryuk:0.14.0
 pull electricsql/electric:1.2.4

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+---
+area: webapp
+type: fix
+---
+
+Keep logs search within bounded ClickHouse memory when browsing long time ranges, and fix pagination that could skip or duplicate entries sharing a timestamp.

apps/webapp/app/env.server.ts

Lines changed: 28 additions & 0 deletions
@@ -1642,6 +1642,34 @@ const EnvironmentSchema = z
     CLICKHOUSE_LOGS_LIST_MAX_THREADS: z.coerce.number().int().default(2),
     CLICKHOUSE_LOGS_LIST_MAX_ROWS_TO_READ: z.coerce.number().int().default(10_000_000),
     CLICKHOUSE_LOGS_LIST_MAX_EXECUTION_TIME: z.coerce.number().int().default(120),
+    // Bound read-in-order memory on object-storage reads: each part opens a per-column read
+    // stream, and the default ~1 MiB+ S3 buffers dominate peak memory. These two byte sizes
+    // cap the per-stream buffers and exist on every supported ClickHouse, so they are always on.
+    CLICKHOUSE_LOGS_LIST_PREFETCH_BUFFER_SIZE: z.coerce.number().int().nonnegative().default(262_144),
+    CLICKHOUSE_LOGS_LIST_MAX_READ_BUFFER_SIZE: z.coerce.number().int().nonnegative().default(262_144),
+    // The decisive lever on Cloud SharedMergeTree, but it only exists on newer ClickHouse and
+    // is a no-op on local-disk MergeTree, so it is opt-in: unset means it is never sent (safe on
+    // any self-hosted version). Set to 0 on object-storage deployments to get the memory win.
+    CLICKHOUSE_LOGS_LIST_FILESYSTEM_CACHE_PREFER_BIGGER_BUFFER_SIZE: z.coerce
+      .number()
+      .int()
+      .nonnegative()
+      .optional(),
+
+    // Logs list pagination tuning (page sizing + recent-first probe windows).
+    LOGS_LIST_DEFAULT_PAGE_SIZE: z.coerce.number().int().positive().default(50),
+    LOGS_LIST_MAX_PAGE_SIZE: z.coerce.number().int().positive().default(100),
+    // Days back from the page ceiling to probe before widening to the full requested window,
+    // comma-separated. Empty disables narrowing (a single full-window query).
+    LOGS_LIST_RECENT_FIRST_PROBE_DAYS: z
+      .string()
+      .default("1,7")
+      .transform((s) =>
+        s
+          .split(",")
+          .map((v) => Number(v.trim()))
+          .filter((n) => Number.isFinite(n) && n > 0)
+      ),
 
     // Query feature flag
     QUERY_FEATURE_ENABLED: z.string().default("1"),
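
The code that consumes these values is not part of the diff shown here, so the following is only a rough sketch of how they might be used. The `LogsListEnv` type, the helper names, the synchronous `query` callback, and the mapping from env variable names to lower-case ClickHouse setting names are all assumptions. `parseProbeDays` mirrors the `.transform` in the schema above; the rest illustrates the opt-in setting (omitted when unset, but sent when set to 0), the page-size cap, and recent-first probing.

```typescript
// Sketch only: helper names and the env-to-setting mapping are assumptions.
type LogsListEnv = {
  CLICKHOUSE_LOGS_LIST_PREFETCH_BUFFER_SIZE: number;
  CLICKHOUSE_LOGS_LIST_MAX_READ_BUFFER_SIZE: number;
  CLICKHOUSE_LOGS_LIST_FILESYSTEM_CACHE_PREFER_BIGGER_BUFFER_SIZE?: number;
  LOGS_LIST_DEFAULT_PAGE_SIZE: number;
  LOGS_LIST_MAX_PAGE_SIZE: number;
};

// The two buffer caps are always sent; the opt-in setting only when it is defined.
// Note the explicit undefined check: 0 is a meaningful value and must not be dropped.
function buildReadSettings(env: LogsListEnv): Record<string, number> {
  const settings: Record<string, number> = {
    prefetch_buffer_size: env.CLICKHOUSE_LOGS_LIST_PREFETCH_BUFFER_SIZE,
    max_read_buffer_size: env.CLICKHOUSE_LOGS_LIST_MAX_READ_BUFFER_SIZE,
  };
  const optIn = env.CLICKHOUSE_LOGS_LIST_FILESYSTEM_CACHE_PREFER_BIGGER_BUFFER_SIZE;
  if (optIn !== undefined) settings.filesystem_cache_prefer_bigger_buffer_size = optIn;
  return settings;
}

// Cap the effective page size between 1 and the configured maximum.
function clampPageSize(requested: number | undefined, env: LogsListEnv): number {
  const size = Math.floor(requested ?? env.LOGS_LIST_DEFAULT_PAGE_SIZE);
  return Math.min(Math.max(1, size), env.LOGS_LIST_MAX_PAGE_SIZE);
}

// Same parsing as the schema transform: "1,7" -> [1, 7], "" -> [].
function parseProbeDays(raw: string): number[] {
  return raw
    .split(",")
    .map((v) => Number(v.trim()))
    .filter((n) => Number.isFinite(n) && n > 0);
}

type TimeWindow = { from: number; to: number }; // epoch milliseconds

const DAY_MS = 24 * 60 * 60 * 1000;

// Narrow windows ending at the page ceiling, followed by the full requested window.
function probeWindows(from: number, to: number, probeDays: number[]): TimeWindow[] {
  const windows: TimeWindow[] = [];
  for (const days of [...probeDays].sort((a, b) => a - b)) {
    const start = to - days * DAY_MS;
    if (start <= from) break; // the probe would already cover the full window
    windows.push({ from: start, to });
  }
  windows.push({ from, to });
  return windows;
}

// Query the narrowest window first and widen only while the page is short.
function fetchRecentFirst<T>(
  windows: TimeWindow[],
  pageSize: number,
  query: (w: TimeWindow, limit: number) => T[]
): T[] {
  let rows: T[] = [];
  for (const w of windows) {
    rows = query(w, pageSize);
    if (rows.length >= pageSize) break;
  }
  return rows;
}
```

Because every probe window shares the same ceiling, each wider window contains the narrower one, so re-running the query on a wider window returns a superset and no merging is needed in this sketch. An empty probe list degenerates to a single full-window query, matching the comment on `LOGS_LIST_RECENT_FIRST_PROBE_DAYS`.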
