feat(l2): @clickhouse/client byte-compat embedded façade (Layer 2)#49
Open
ShawnChen-Sirius wants to merge 3 commits into
Open
feat(l2): @clickhouse/client byte-compat embedded façade (Layer 2)#49ShawnChen-Sirius wants to merge 3 commits into
ShawnChen-Sirius wants to merge 3 commits into
Conversation
Layer 2 mirrors the @clickhouse/client surface (createClient / query / command / exec / insert / ping / stream, ResultSet, the typed error hierarchy) over an in-process chDB connection, so code written against clickhouse-js runs unchanged in embedded mode. chdb://memory clients share one reference-counted Session; query() is eager-buffered so errors surface at await; JSON-family output injects output_format_json_quote_64bit_integers=1 to keep 64-bit ints lossless, matching clickhouse-js's own default. Inserts and query parameters are encoded exactly as clickhouse-js does, decoded by the engine rather than re-implemented: - insert serializes row arrays to a FORMAT-tailed dataset (JSONEachRow for object rows, JSONCompactEachRow for positional arrays, inferred when omitted) instead of a SQL VALUES literal, so arrays / maps / tuples / Nested / JSON round-trip; an explicit null is ClickHouse NULL, only an undefined cell is rejected. - a clickhouse-js-faithful query-parameter formatter (src/layer2/params.ts): top-level strings unquoted and null as the TSV token, but inside Array/Tuple/Map strings are quoted, null is the NULL keyword and booleans are TRUE/FALSE; Date is a Unix timestamp; tuples use a TupleParam wrapper (exported here, and a clickhouse-js TupleParam is accepted structurally for migration). The formatted map is bound verbatim via a preformatted option on Layer 1's queryBindAsync, leaving Layer 1's own serializer untouched. Parameter handling is type-aware via the engine, not guarded client-side: an out-of-range Int64 is a typed error while the same value binds fine to a Float64. Includes the Layer 2 unit/behavior tests and a vitest config fix that externalizes the CJS entrypoint so the suite's session-cleanup safety net and the tests share a single module instance. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s's own suite Two harnesses turn the byte-compat claim into something checked, not asserted. Parity (test/v3/layer2/upstream/conformance.test.ts + parity.test.ts) runs the same queries through embedded chDB and a real clickhouse-server and compares the decoded output. The server image tracks chDB's kernel (ClickHouse 26.5); both clients set output_format_json_quote_64bit_integers=1 (clickhouse-js's own suite default) so 64-bit ints are lossless on both sides. Clients are created per test so the global session-cleanup afterEach cannot tear down a shared connection. Upstream-literal (scripts/upstream-suite) runs clickhouse-js's OWN integration suite against embedded chDB: it clones clickhouse-js at the installed client's version, redirects the suite's client factory to chdb://memory, and runs the specs on vitest serially. The previous harness ran jest, but clickhouse-js migrated to vitest — so it had been parsing zero specs and silently passing. skip-list.json drops whole files for capabilities embedded lacks (HTTP transport, sockets, compression, RBAC, server runtime, cluster, TLS); expectations.patch marks the per-case embedded-vs-server divergences it.fails (documented in the suite README) so they must keep failing without hiding a regression. Both jobs gate CI (continue-on-error removed). Against clickhouse-js 1.20.0 on embedded chDB: 175 passing, 38 expected-fail, 12 skipped, 0 unexpected. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
664de56 to
d0487c1
Compare
Two independent failures on the PR layer2 workflow: 1. parity (22/23 red) — the clickhouse/clickhouse-server:26.5 image's entrypoint writes a users.d snippet requiring a password for `default` when CLICKHOUSE_PASSWORD is unset, so every parity query returned code 194 REQUIRED_PASSWORD while embedded chDB returned the correct semantic code (e.g. 60 UNKNOWN_TABLE). Locally the worklog's `clickhousectl local install 26.5` doesn't add that snippet, hence the local 23/23 green. Set CLICKHOUSE_SKIP_USER_SETUP=1 on the service so the image keeps the upstream password-less default — the service is sealed inside the GH Actions network, no real auth needed. 2. upstream-literal (6 "Expect test to fail") — 4 DateTime/Date cases and 2 others (floating point types, JS Date objects) were marked `it.fails` in expectations.patch but actually pass in CI: GitHub runners run UTC, which eliminates the DateTime session-timezone divergence the worklog deferred as Stage B. The gate is working as designed (an unexpected pass is a hard failure), and these cases have simply graduated from "known divergence" to "byte-compat under UTC". Drop the 6 hunks from expectations.patch and pin TZ=UTC on the upstream step so the contract reproduces off CI. Remaining 32 markers stay; categories 4 and 6 in the README narrowed accordingly. The macos-15-intel/20.x check is unrelated (a native chDB abort-recovery crash on that one matrix cell after the abort-storm stress test); a re-run should clear it and the underlying robustness issue belongs to chdb-core, not this PR. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering>
f857f98 to
fe175e7
Compare
Contributor
Author
|
@chibugai, review this PR. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Adds Layer 2: an embedded-only façade that mirrors the
@clickhouse/clientsurface (createClient,query/command/exec/insert/ping/stream,ResultSet, the typed error hierarchy) on top of the v3 Layer 1 connection. The goal is byte-compatible behavior so code written against clickhouse-js runs unchanged against an in-process chDB.Shape
src/layer2/*.ts(compiled todist/layer2) — a thin forwarder to Layer 1.chdb://memoryclients share one process-wide reference-counted Session;chdb:///pathclients share one per absolute path. Opening a different on-disk path while one is live surfacesChdbConnectionError(single active connection per process).query()is eager-buffered via Layer 1queryAsyncso errors surface atawait query()(byte-compat);stream()replays the buffer through a newline transform.output_format_json_quote_64bit_integers=1) to match server/clickhouse-js and avoid silentJSON.parseprecision loss.instanceofholds against the same module instance the runtime throws from.Tests
test/v3/layer2/*(formats / url / sql_guard / result_set / client / config / errors / adversarial / compat-types).upstream/conformance.test.ts(embedded by default, orCHDB_UPSTREAM_BACKEND=server) asserting true ClickHouse semantics.parity.test.ts(CHDB_PARITY_URL) + a non-gating clickhouse-js literal-suite import-rewrite harness, with alayer2.ymlCI workflow.Full v3 suite: 257 passing, 16 skipped (opt-in server parity), 0 failing. Typecheck clean.
Rebase note
Rebased onto current
main(now carries the global session-cleanup safety net). That safety net exposed two harness issues, fixed in the last commit:importvs therequire('../../index.js')the compiled Layer 2 uses to reach Layer 1), so the cleanup hook drained a different instance than Layer 2's sessions — a timed-out heavy query then leaked its connection and cascaded "only one active data directory" failures. Fixed by externalizing the entrypoint so every importer shares one Node-cached instance.beforeAll-shared client that the now-effective cleanup tears down after the first case; switched to a per-test client (behavior-preserving — each case is self-contained).Honest gaps (follow-ups, not this PR)
upstream/conformance.test.ts.🤖 Generated with Claude Code