From 26f90b5823e94bc9f5c3b956ea1f324b025e949e Mon Sep 17 00:00:00 2001
From: OmarAlJarrah <o.mazari.om63@gmail.com>
Date: Wed, 17 Jun 2026 04:50:49 +0300
Subject: [PATCH] docs: specify the generated service/operation layer design
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add design specs under docs/codegen/ for the typed service layer the future
generator will emit on top of sdk-core. These are specifications only — no
generator, KotlinPoet, or generated code enters the repo; every snippet is
illustrative target output anchored to real sdk-core types.

- service-method-tiers.md: two-tier raw/cooked operations over the
  ResponseHandler / ParsedResponse seam, with error-mapping and
  deserialization composed at the service layer so the HTTP pipeline stays
  transport-pure and keeps returning a raw Response.
- typed-page-classes.md: list-operation page types whose nextPage() rebuilds
  a typed params object (params.toBuilder().<cursor>().build()) and drives the
  stateless Paginator, with RequestRebuilder kept as the bring-your-own path.
- operation-overloads.md: curation rules capping each operation at one
  canonical params-object method plus a small fixed convenience set, leaning
  on Kotlin default arguments and Tristate instead of a parameter cross-product
  that would tax explicit-API and apiCheck.
- sub-service-tree.md: lazily-instantiated sub-service accessor tree
  (client.foo().bar()) where the root reuses the raw-response impl to halve the
  generated type count, with a by-lazy-vs-memoized-supplier note for Java.
- README.md: index linking the four specs and how they compose.
---
 docs/codegen/README.md               |  45 ++++++
 docs/codegen/operation-overloads.md  | 152 +++++++++++++++++++++
 docs/codegen/service-method-tiers.md | 181 ++++++++++++++++++++++++
 docs/codegen/sub-service-tree.md     | 159 +++++++++++++++++++++
 docs/codegen/typed-page-classes.md   | 197 +++++++++++++++++++++++++++
 5 files changed, 734 insertions(+)
 create mode 100644 docs/codegen/README.md
 create mode 100644 docs/codegen/operation-overloads.md
 create mode 100644 docs/codegen/service-method-tiers.md
 create mode 100644 docs/codegen/sub-service-tree.md
 create mode 100644 docs/codegen/typed-page-classes.md
diff --git a/docs/codegen/README.md b/docs/codegen/README.md
new file mode 100644
index 00000000..f28c3b05
--- /dev/null
+++ b/docs/codegen/README.md
@@ -0,0 +1,45 @@
+# Codegen design specs
+
+This directory holds **design specifications** for the future dexpace code generator — the
+component that, given an API description, will emit the typed service/operation layer that sits
+on top of `sdk-core`. Nothing here is generator code. There is no KotlinPoet, no emitter, and no
+generated source in the repository; these documents define the *target shape* of the code the
+generator will eventually produce and explain how that shape binds to the runtime that already
+ships in `sdk-core`.
+
+Every Kotlin/Java snippet in these specs is **illustrative target output** — what the generator
+should emit — not code compiled in this repo. The snippets reference real `sdk-core` types
+(`HttpClient`, `ResponseHandler`, `ParsedResponse`, `Paginator`, `PaginationStrategy`, `Serde`,
+the `CallContext` chain, `Tristate`, `Request`/`Response`) so the design stays anchored to the
+runtime that exists today.
+
+For the broader survey that motivates building our own generator, see
+[`../refs-comparison.md`](../refs-comparison.md). For the runtime layering these specs build on,
+see [`../architecture.md`](../architecture.md), [`../http.md`](../http.md), and
+[`../pipelines.md`](../pipelines.md).
+
+## Specs
+
+| Spec | Topic |
+|---|---|
+| [service-method-tiers.md](service-method-tiers.md) | Two-tier raw/cooked service methods over the `ResponseHandler` / `ParsedResponse` seam. |
+| [typed-page-classes.md](typed-page-classes.md) | Page types whose `nextPage()` rebuilds a typed params object, not a URL string, tied to `Paginator` and the strategy set. |
+| [operation-overloads.md](operation-overloads.md) | Curation rules for the per-operation overload set — one canonical method plus a small, fixed convenience set instead of the full parameter cross-product. |
+| [sub-service-tree.md](sub-service-tree.md) | The lazily-instantiated sub-service accessor tree (`client.foo().bar()`) and how the root reuses the raw-response implementation. |
+
+## How the specs fit together
+
+The four specs describe one cohesive generated layer:
+
+- The **sub-service tree** is the entry surface: `client.<resource>()` accessors, lazily built.
+- Each leaf service exposes **operations**, each generated in **two tiers** (raw and cooked).
+- Each operation has a **curated overload set** rather than the full cross-product.
+- List operations additionally emit a **typed page class** that drives `Paginator` and rebuilds
+  the next request from typed params.
+
+They share two cross-cutting dependencies that are *not yet* in the tree and are tracked
+separately:
+
+- **OperationParams SPI** — the typed, builder-backed params object per operation, which both
+  the overload set and typed-page rebuild lean on.
+- The generator itself (KotlinPoet-based), per [`../refs-comparison.md`](../refs-comparison.md).
diff --git a/docs/codegen/operation-overloads.md b/docs/codegen/operation-overloads.md
new file mode 100644
index 00000000..097bb88a
--- /dev/null
+++ b/docs/codegen/operation-overloads.md
@@ -0,0 +1,152 @@
+# Curated operation overload set
+
+> Design spec. The Kotlin/Java in this document is **target generator output**, not code compiled
+> in this repository.
+
+## Problem
+
+A naive generator emits an overload for every shape in which a caller might want to call an
+operation: with and without a body, with and without optional path/query params, with and without
+a per-call request-options argument, params-object vs. positional-primitives. The reference SDK has
+on the order of a dozen `retrieve` overloads per operation. Multiply that by the four tiers (raw and
+cooked, each sync and async) from [service-method-tiers.md](service-method-tiers.md) and the count explodes.
+
+In this codebase that explosion is not free. The repo enforces:
+
+- **explicit-API strict mode** — every public overload needs an explicit visibility and return
+  type, and each one adds to the reviewed public surface;
+- **binary-compatibility-validator** (`apiCheck`) — every public overload is pinned in an `api/*.api`
+  snapshot, so each one is a permanent binary-compat obligation that `apiDump` must regenerate and
+  that can never be removed without a breaking change.
+
+A wide cross-product is therefore a *permanent* explicit-API + binary-compat tax, paid on every
+operation, in every tier. The reference SDK targets Java, where overloads are the only ergonomic
+lever; we target Kotlin first, where **default arguments** collapse most of the cross-product into a
+single method.
+
+## Proposed policy: one canonical method + a small curated set
+
+For each operation in each tier, the generator emits **one canonical method** plus a **fixed, small
+curated overload set** — never the full cross-product.
+
+### The canonical method
+
+The canonical method takes the operation's typed params object and a request-options argument with
+a default:
+
+```kotlin
+// GENERATED — illustrative target output, not compiled here.
+public fun retrieve(
+    params: ModelRetrieveParams,
+    options: RequestOptions = RequestOptions.none(),
+): Model
+```
+
+All optionality inside the request lives in the **params object's builder**, not in overloads. A
+param that may be set, explicitly null, or omitted uses `Tristate<T>`
+(`org.dexpace.sdk.core.serde.Tristate`) inside the params type, so "send `null`" and "omit" stay
+distinct without spawning two overloads.
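+
+A minimal sketch of how one builder method can carry all three states. `TristateSketch` and
+`UpdateParamsSketch` are hypothetical stand-ins written for this note; they are not the `sdk-core`
+`Tristate` or any generated params type, and the real shapes may differ.
+
+```kotlin
+// Illustrative sketch only. TristateSketch is a local stand-in, NOT sdk-core's Tristate.
+sealed interface TristateSketch<out T> {
+    data class Present<T>(val value: T) : TristateSketch<T>
+    object Null : TristateSketch<Nothing>
+    object Absent : TristateSketch<Nothing>
+}
+
+// Hypothetical params type with a single optional, nullable field.
+class UpdateParamsSketch private constructor(val description: TristateSketch<String>) {
+    class Builder {
+        // Never calling description(...) leaves the field Absent.
+        private var description: TristateSketch<String> = TristateSketch.Absent
+
+        // One builder method covers both "set a value" and "send an explicit null".
+        fun description(value: String?): Builder = apply {
+            description = if (value == null) TristateSketch.Null else TristateSketch.Present(value)
+        }
+
+        fun build(): UpdateParamsSketch = UpdateParamsSketch(description)
+    }
+
+    // Wire form: Absent drops the key, Null keeps the key with a null value.
+    fun toFields(): Map<String, String?> = when (val d = description) {
+        is TristateSketch.Present -> mapOf("description" to d.value)
+        TristateSketch.Null -> mapOf("description" to null)
+        TristateSketch.Absent -> emptyMap()
+    }
+}
+
+fun main() {
+    println(UpdateParamsSketch.Builder().build().toFields())                    // {}
+    println(UpdateParamsSketch.Builder().description(null).build().toFields()) // {description=null}
+    println(UpdateParamsSketch.Builder().description("x").build().toFields())  // {description=x}
+}
+```
+
+The three calls differ only in how the builder is used, so the operation itself still needs just
+the one canonical signature.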
+
+### The curated overload set (the only ones generated)
+
+| # | Overload | When generated | Rationale |
+|---|---|---|---|
+| 1 | **Canonical** `op(params, options = …)` | always | The one true signature; `options` default covers the no-options call in Kotlin. |
+| 2 | **Java no-options** `op(params)` | when a Java target is emitted | `@JvmOverloads` on the canonical method materializes this for Java callers, who have no default-argument support. Not a hand-written second overload. |
+| 3 | **Bare-identifier convenience** `op(id: String, options = …)` | only when the operation has exactly **one required param** and it is a scalar path/query identifier | Lets `retrieve("gpt-x")` work without a builder for the overwhelmingly common single-id case. Forwards to the canonical method with a params object built from the id. |
+| 4 | **Body-first convenience** `op(body: BodyType, options = …)` | only when the operation has **exactly one required param and it is the request body** (no required path/query params) | Lets `create(body)` work for create-style operations. Forwards to canonical. |
+
+That is the whole set: at most **four** generated entry points per operation per tier (three in
+practice, since #3 and #4 never both apply), and #2 is produced by `@JvmOverloads` rather than a
+separate declaration. Everything else is reachable by building the params object.
+
+### Curation rules (deterministic, so the generator is mechanical)
+
+1. **Always emit the canonical params-object method.** It is the floor; every operation has it.
+2. **`options` is always a defaulted trailing argument**, never its own overload. `@JvmOverloads`
+   gives Java the no-options form.
+3. **Emit the bare-identifier overload (#3) iff** the operation has exactly one required param,
+   that param is a scalar path/query identifier (not a body), and it has no other required inputs.
+   More than one required scalar → no positional overload; callers use the builder. (Avoids
+   argument-order ambiguity, the classic `retrieve(a, b)` trap.)
+4. **Emit the body-first overload (#4) iff** the operation's only required input is the request body.
+   Mutually exclusive with #3 by construction: each rule demands a single required input of a
+   different kind, so an operation with both a required id *and* a required body matches neither,
+   gets no convenience overload, and uses the builder as its entry point.
+5. **Never** emit overloads that vary optional params positionally. Optional params live in the
+   builder, full stop. This is the rule that kills the cross-product.
+6. **Apply the identical set in every tier.** Raw, cooked, sync, async all get the same curated set
+   — the raw tier returns `ParsedResponse<T>`, cooked returns `T`, async wraps in
+   `CompletableFuture`. Tiers never *add* overloads of their own.
+
+### Target output
+
+```kotlin
+// GENERATED — illustrative target output, not compiled here.
+public interface ModelService {
+    // 1 + 2: canonical, @JvmOverloads materializes the Java no-options form.
+    @JvmOverloads
+    public fun retrieve(
+        params: ModelRetrieveParams,
+        options: RequestOptions = RequestOptions.none(),
+    ): Model
+
+    // 3: single required scalar id → bare-identifier convenience, forwards to canonical.
+    @JvmOverloads
+    public fun retrieve(
+        id: String,
+        options: RequestOptions = RequestOptions.none(),
+    ): Model =
+        retrieve(ModelRetrieveParams.builder().id(id).build(), options)
+}
+```
+
+A counter-example the rules **reject** — an operation with two required scalars gets no positional
+overload:
+
+```kotlin
+// NOT GENERATED — rule 3 forbids positional overloads for multi-required-scalar ops.
+// public fun retrieve(org: String, id: String): Model   // ambiguous arg order; use the builder.
+```
+
+## Design decisions and trade-offs
+
+- **Default arguments over overloads.** Kotlin default arguments collapse the `options`-present /
+  `options`-absent axis (and any optional-param axis) into one declaration. This is the single
+  biggest lever against the cross-product and the main reason a Kotlin-first generator can stay far
+  leaner than a Java-first one. The cost is borne by Java callers, who get exactly the forms
+  `@JvmOverloads` materializes — which is why the curated set is defined in terms of *required*
+  inputs only.
+
+- **`Tristate<T>` instead of presence overloads.** Optional-with-null params do not fork into
+  "set to null" vs. "omit" overloads; they use `Tristate.Present` / `Tristate.Null` /
+  `Tristate.Absent` in the params type. One builder method, three semantics, zero extra overloads.
+
+- **Bounded, deterministic generation.** The rules above are mechanical: given an operation's
+  required-param profile, the generator emits a known, small set. No heuristic "emit if it seems
+  convenient." This keeps the generator simple and the public surface predictable — important
+  because every public method is an `apiCheck`-pinned commitment.
+
+- **Builder is the escape hatch, and it is always present.** Anything the curated overloads don't
+  cover is reachable through `Params.builder()`. There is no operation a caller cannot fully drive;
+  they may just need the builder. That trade — slightly more verbose tail cases for a dramatically
+  smaller pinned surface — is the whole point.
+
+- **Uniform across tiers.** Because the set is identical in every tier, the four tiers stay in
+  lockstep and the cross-tier matrix multiplies a *small constant* (≤4), not a dozen.
+
+## Ties into the runtime
+
+- `org.dexpace.sdk.core.serde.Tristate` — three-state optional params inside the params object,
+  removing presence-driven overloads.
+- explicit-API strict mode and `apiCheck` (binary-compatibility-validator) — the enforcement that
+  makes a wide surface costly and a curated surface cheap; see the root `CLAUDE.md`.
+- The OperationParams SPI (tracked separately) — the builder-backed params type the canonical method
+  takes and the convenience overloads forward into.
+
+## Acceptance mapping
+
+- *Overload policy documented* — the four-row curated set and the six curation rules above.
+- *Generated surface stays minimal* — at most four entry points per operation per tier (one of which
+  is `@JvmOverloads`-materialized), no positional optional-param cross-product, builder as the escape
+  hatch.
diff --git a/docs/codegen/service-method-tiers.md b/docs/codegen/service-method-tiers.md
new file mode 100644
index 00000000..4ad6fa99
--- /dev/null
+++ b/docs/codegen/service-method-tiers.md
@@ -0,0 +1,181 @@
+# Two-tier raw/cooked service methods
+
+> Design spec. The Kotlin/Java in this document is **target generator output**, not code compiled
+> in this repository.
+
+## Problem
+
+A caller of a typed operation wants two different things at different times:
+
+- Most of the time, the parsed body: `client.models().retrieve("gpt-x")` should hand back a typed
+  `Model`, with error responses already turned into exceptions and the body fully consumed and
+  closed.
+- Sometimes, the response metadata *without* paying for deserialization: the status code, an
+  `ETag`, a `Retry-After`, the raw headers — for conditional requests, cache validation, or
+  cheap existence checks. Forcing a full body parse to read a header is wasteful, and on a large
+  or hostile payload it is an unbounded-allocation hazard.
+
+A single method signature cannot serve both. Emitting only the cooked method strands the
+metadata-only caller; emitting only a raw method makes the common case verbose. The reference
+SDK (openai-java) answers this with a parallel `withRawResponse()` service tree
+(`ModelServiceImpl.kt`) — every operation exists in a cooked tier and a raw tier.
+
+## Constraints from `sdk-core`
+
+The seam these tiers dispatch against already ships. Two types in
+`org.dexpace.sdk.core.http.response` do the heavy lifting:
+
+- **`ResponseHandler<out T>`** — a `fun interface` whose `handle(response: Response): T` maps a raw
+  `Response` to a typed value. A handler that reads the body **owns consuming and closing it**.
+  Built-ins: `ResponseHandler.string()` and `ResponseHandler.empty()`. Adapter modules supply a
+  JSON handler backed by the `Serde` SPI.
+- **`ParsedResponse<out T>`** — pairs a raw `Response` with a `ResponseHandler<T>` and parses
+  **lazily and exactly once**. Its raw accessors (`status`, `headers`, `message`, `protocol`,
+  `request`) read straight from the underlying `Response` and never touch the body; `value()`
+  runs the handler on first call and memoizes the outcome (success *or* failure) behind a
+  `ReentrantLock`. It is `Closeable`, so the metadata-only path can release the body without ever
+  parsing.
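+
+As a rough illustration of the parse-once contract described above, the sketch below memoizes
+either outcome behind a `ReentrantLock`. `ParseOnceSketch` is a hypothetical stand-in written for
+this note, not the `sdk-core` `ParsedResponse`, and the real class may differ in detail.
+
+```kotlin
+import java.util.concurrent.locks.ReentrantLock
+import kotlin.concurrent.withLock
+
+// Illustrative sketch only. A local stand-in, NOT sdk-core's ParsedResponse.
+class ParseOnceSketch<T>(private val parse: () -> T) {
+    private val lock = ReentrantLock()
+
+    // Holds the first outcome, success or failure, once parse() has run.
+    private var outcome: Result<T>? = null
+
+    fun value(): T = lock.withLock {
+        // Run the parser at most once; later calls replay the stored outcome.
+        val cached = outcome ?: runCatching(parse).also { outcome = it }
+        cached.getOrThrow()
+    }
+}
+
+fun main() {
+    var calls = 0
+    val ok = ParseOnceSketch { calls++; "model" }
+    ok.value()
+    ok.value()
+    println(calls)      // 1: the parser ran once
+
+    var failures = 0
+    val bad = ParseOnceSketch<String> { failures++; error("boom") }
+    val first = runCatching { bad.value() }.exceptionOrNull()
+    val second = runCatching { bad.value() }.exceptionOrNull()
+    println(failures)         // 1: the failing parser also ran once
+    println(first === second) // true: the stored exception is re-thrown as-is
+}
+```
+
+Caching a `Result` rather than only the value is what lets a failed parse be re-thrown later
+without running the handler, and so without touching the body, a second time.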
+
+The pipeline stays transport-pure: `HttpClient.execute(request): Response` returns a raw
+`Response` and nothing about deserialization or error mapping lives in `http.pipeline`. **Error
+mapping and deserialization compose at the generated-service layer**, as `Response -> X` handlers
+— not as pipeline stages. This keeps the `http.pipeline` REDIRECT/RETRY/AUTH/LOGGING/SERDE pillar
+contract untouched.
+
+## Proposed generated shape
+
+Each operation is generated in two tiers behind one service interface. The **raw tier** returns a
+`ParsedResponse<T>`; the **cooked tier** returns `T` directly. The cooked tier is a thin
+delegation onto the raw tier — it calls `.value()` and lets `use {}` close the body.
+
+### Service interface (target output)
+
+```kotlin
+// GENERATED — illustrative target output, not compiled here.
+public interface ModelService {
+    // Cooked tier: parsed body, body consumed + closed, errors already thrown.
+    public fun retrieve(params: ModelRetrieveParams): Model
+
+    // Raw tier: lazy ParsedResponse — read status/headers without parsing,
+    // or call value() to parse exactly once. Caller owns close().
+    public fun withRawResponse(): WithRawResponse
+
+    public interface WithRawResponse {
+        public fun retrieve(params: ModelRetrieveParams): ParsedResponse<Model>
+    }
+}
+```
+
+### Service implementation (target output)
+
+```kotlin
+// GENERATED — illustrative target output, not compiled here.
+internal class ModelServiceImpl(
+    private val client: HttpClient,
+    private val serde: Serde,
+    private val errorMapper: ErrorMapper,
+) : ModelService {
+
+    private val rawTier = WithRawResponseImpl()
+
+    override fun withRawResponse(): ModelService.WithRawResponse = rawTier
+
+    override fun retrieve(params: ModelRetrieveParams): Model =
+        rawTier.retrieve(params).use { it.value() }
+
+    private inner class WithRawResponseImpl : ModelService.WithRawResponse {
+        override fun retrieve(params: ModelRetrieveParams): ParsedResponse<Model> {
+            val request: Request = params.toRequest(/* baseUrl, auth context, etc. */)
+            val response: Response = client.execute(request)
+            // Compose error-mapping + deserialization into one Response -> Model handler.
+            val handler: ResponseHandler<Model> =
+                ResponseHandler { raw ->
+                    errorMapper.throwOnError(raw)          // 4xx/5xx -> typed exception
+                    serde.deserializer.deserialize(        // body -> typed DTO
+                        raw.body!!.source().inputStream(),
+                        Model::class.java,
+                    )
+                }
+            return ParsedResponse.of(response, handler)
+        }
+    }
+}
+```
+
+The error-mapping/deserialization composition is the load-bearing piece. `ErrorMapper` reads the
+raw `status` and headers (and, for a 4xx/5xx, deserializes a typed error envelope) and throws;
+only on success does the handler deserialize the success type. Because the whole thing is a single
+`ResponseHandler`, it runs **once**, inside `ParsedResponse.value()`'s memoized, locked section —
+the body is touched exactly once no matter how many times `value()` is called.
+
+### Caller experience (target usage)
+
+```kotlin
+// Cooked — the 90% case.
+val model: Model = client.models().retrieve(params)
+
+// Raw — read metadata, never parse.
+client.models().withRawResponse().retrieve(params).use { raw ->
+    val etag = raw.headers["ETag"]
+    if (raw.status.code == 304) return@use  // body never deserialized
+}
+
+// Raw — read a header AND parse, single body read.
+client.models().withRawResponse().retrieve(params).use { raw ->
+    val requestId = raw.headers["x-request-id"]
+    val model = raw.value()                  // handler runs here, once
+}
+```
+
+## Design decisions and trade-offs
+
+- **One `ResponseHandler` per operation, composed at the service layer.** Error mapping and
+  deserialization are fused into a single `Response -> T` function rather than layered as two
+  passes over the body. The body is single-use; a two-pass design would force a re-read or a
+  per-response cache. `ParsedResponse` already memoizes a *thrown* outcome, so a mapped error is
+  re-thrown verbatim on every later `value()` call without re-touching the consumed body.
+
+- **Cooked delegates to raw, never the reverse.** The generator emits one real implementation (the
+  raw tier) and derives the cooked method as `rawTier.op(params).use { it.value() }`. This halves
+  the generated logic per operation and guarantees the two tiers can never drift. It also dovetails
+  with the sub-service tree spec ([sub-service-tree.md](sub-service-tree.md)), where the root reuses
+  the nested raw impl.
+
+- **Closeable + `use {}` + KDoc, no `@MustBeClosed` lint.** The raw tier hands the caller an open
+  `ParsedResponse` (a `Closeable`). We rely on Kotlin `use {}`, the Java try-with-resources idiom,
+  and explicit KDoc; we deliberately do **not** introduce an Error Prone `@MustBeClosed`
+  annotation or a new lint dependency. The cooked tier closes for the caller; only the raw tier
+  transfers ownership, and that is exactly where the KDoc warns. Note the asymmetry: a raw caller
+  who reads only metadata and forgets to close leaks the body, whereas the cooked path cannot leak.
+
+- **Transport purity preserved.** Nothing here adds a pipeline stage. `http.pipeline` keeps
+  returning a raw `Response`; the SERDE pillar stays about wire framing, and typed-body
+  deserialization is a service-layer concern. This is the boundary the issue calls out and it is
+  the boundary `ResponseHandler`/`ParsedResponse` were built for.
+
+- **Async mirror.** The async service tier returns `CompletableFuture<ParsedResponse<T>>` (cooked:
+  `CompletableFuture<T>`) by dispatching through `AsyncHttpClient.executeAsync` and mapping the
+  completed `Response` through the same handler with `thenApply`. The handler is reused verbatim;
+  only the dispatch differs. `ParsedResponse`'s own laziness still applies — the future completes
+  with an *unparsed* `ParsedResponse`, and parsing happens when the caller calls `value()`.
+
+## Ties into the runtime
+
+- `org.dexpace.sdk.core.http.response.ResponseHandler` — the `Response -> T` seam the handler is.
+- `org.dexpace.sdk.core.http.response.ParsedResponse` — lazy, memoized, `Closeable` raw/parsed
+  pairing; `ParsedResponse.of(...)` and the `Response.parsedWith(...)` extension are the factories.
+- `org.dexpace.sdk.core.client.HttpClient` / `AsyncHttpClient` — transport SPIs the tiers dispatch
+  against; both stay deserialization-free.
+- `org.dexpace.sdk.core.serde.Serde` (`serializer` / `deserializer`) — supplies the JSON handler
+  the error/deserialize composition uses; `sdk-core` ships no embedded serializer.
+- `org.dexpace.sdk.core.http.response.Status` (`code`, `isSuccess`) and `Headers` — what the raw
+  tier reads without parsing.
+
+## Acceptance mapping
+
+- *Raw + cooked tiers generated* — the `ModelService` / `WithRawResponse` shape above.
+- *Error-mapping composed at the service layer* — the single `ResponseHandler` fusing `ErrorMapper`
+  + `serde.deserializer`, never a pipeline stage.
+- *Test* — the generator's golden-file tests assert the emitted shape; a runtime fixture exercises
+  cooked-parses-once, raw-reads-headers-without-parsing, and raw-then-`value()`-single-read using a
+  stub `HttpClient`, mirroring the existing `ParsedResponse` tests.
diff --git a/docs/codegen/sub-service-tree.md b/docs/codegen/sub-service-tree.md
new file mode 100644
index 00000000..e08e267d
--- /dev/null
+++ b/docs/codegen/sub-service-tree.md
@@ -0,0 +1,159 @@
+# Lazy sub-service accessor tree
+
+> Design spec. The Kotlin/Java in this document is **target generator output**, not code compiled
+> in this repository.
+
+## Problem
+
+A real API surface is a tree: `client.models()`, `client.files()`, `client.fineTuning().jobs()`,
+and so on, often several levels deep. Two naive generation strategies both go wrong:
+
+- **Eager instantiation** — the root client constructs every service (and every nested service) in
+  its constructor. For a deep tree, that allocates the entire object graph up front even though a
+  given program touches a handful of services. It also forces every service's dependencies to be
+  resolvable at client-construction time.
+- **Duplicated raw/cooked impls** — naively pairing the two-tier design
+  ([service-method-tiers.md](service-method-tiers.md)) with a service tree doubles the generated
+  type count: a cooked `ModelServiceImpl` *and* a separate raw `ModelServiceRawImpl`, repeated for
+  every node. The tree's type count is the dominant term in generated-code size.
+
+openai-java's client impl (`OpenAIClientImpl.kt`) solves both: sub-services are `by lazy`, and the
+root client reuses the nested raw-response implementation instead of emitting a parallel cooked
+tree.
+
+## Proposed generated shape
+
+The root client and every interior node expose **lazily-instantiated** sub-service accessors. Each
+node is constructed at most once, on first access, and memoized. The root reuses the raw-response
+implementation to avoid a parallel cooked tree.
+
+### Root client (target output)
+
+```kotlin
+// GENERATED — illustrative target output, not compiled here.
+public class DexpaceClientImpl internal constructor(
+    private val client: HttpClient,
+    private val serde: Serde,
+    private val callContext: CallContext,   // base context promoted per call
+    private val errorMapper: ErrorMapper,
+) : DexpaceClient {
+
+    // Each sub-service built at most once, on first access, then memoized.
+    private val models: ModelService by lazy { ModelServiceImpl(client, serde, callContext, errorMapper) }
+    private val fineTuning: FineTuningService by lazy { FineTuningServiceImpl(client, serde, callContext, errorMapper) }
+
+    override fun models(): ModelService = models
+    override fun fineTuning(): FineTuningService = fineTuning
+}
+```
+
+### Interior node — nested accessors, same laziness (target output)
+
+```kotlin
+// GENERATED — illustrative target output, not compiled here.
+internal class FineTuningServiceImpl(
+    private val client: HttpClient,
+    private val serde: Serde,
+    private val callContext: CallContext,
+    private val errorMapper: ErrorMapper,
+) : FineTuningService {
+
+    private val jobs: FineTuningJobService by lazy {
+        FineTuningJobServiceImpl(client, serde, callContext, errorMapper)
+    }
+
+    override fun jobs(): FineTuningJobService = jobs
+}
+// Caller: client.fineTuning().jobs().retrieve(params)
+//   — fineTuning() built on first call, jobs() built on first call under it, both memoized.
+```
+
+### Reusing the raw impl instead of a parallel cooked tree (target output)
+
+The cooked tier is derived from the raw tier (per [service-method-tiers.md](service-method-tiers.md)):
+the generator emits **one** implementation per service node — the raw one — and the cooked methods
+delegate into it. The tree therefore contains one impl type per node, not two.
+
+```kotlin
+// GENERATED — illustrative target output, not compiled here.
+internal class ModelServiceImpl(
+    private val client: HttpClient,
+    private val serde: Serde,
+    private val callContext: CallContext,
+    private val errorMapper: ErrorMapper,
+) : ModelService {
+
+    // The single, real implementation: the raw tier. Built once, memoized.
+    private val rawTier: ModelService.WithRawResponse by lazy { WithRawResponseImpl() }
+
+    override fun withRawResponse(): ModelService.WithRawResponse = rawTier
+
+    // Cooked methods reuse the raw impl — no parallel cooked tree, no duplicated dispatch.
+    override fun retrieve(params: ModelRetrieveParams): Model =
+        rawTier.retrieve(params).use { it.value() }
+
+    private inner class WithRawResponseImpl : ModelService.WithRawResponse {
+        override fun retrieve(params: ModelRetrieveParams): ParsedResponse<Model> { /* dispatch */ }
+    }
+}
+```
+
+So across the whole tree the generated impl count is **one per service node** (each carrying an
+inner raw tier), not two — halving the dominant term, exactly as the issue asks.
+
+## Design decisions and trade-offs
+
+- **`by lazy` for sub-service accessors.** Kotlin's `by lazy { … }` defaults to
+  `LazyThreadSafetyMode.SYNCHRONIZED`: the first reader constructs the node, concurrent readers
+  block until it is ready, and every later read returns the memoized instance. That matches the
+  desired semantics (build-once, share-safely) with no hand-written double-checked locking.
+  `HttpClient` itself is thread-safe per its SPI contract, so a shared, lazily-built service graph
+  is safe to use concurrently.
+
+- **`by lazy` vs. a memoized supplier — Java-target note.** `by lazy` compiles to a synthetic
+  `Lazy` field and is a Kotlin-runtime construct. If a future **Java** generation target is added,
+  the equivalent is a memoized `Supplier` (e.g. a double-checked-locked or `Suppliers.memoize`-style
+  holder) exposed behind the same `fooService()` accessor — same observable contract (build-once,
+  thread-safe, memoized), different mechanism. The accessor method shape (`fun models():
+  ModelService`) is identical either way, so the public surface does not depend on which mechanism
+  backs it. Note one behavioral detail to preserve: with `by lazy` SYNCHRONIZED a *thrown*
+  initializer failure is **not** memoized — a failed init re-runs on the next access — so a Java
+  memoized supplier must match whichever retry semantics we standardize on.
+
+- **Accessors are methods, not properties.** Emitting `fun models(): ModelService` (rather than a
+  `val models`) keeps the Java call site `client.models()` natural and leaves room for the accessor
+  to take per-call arguments later without a source break. The backing `by lazy` field stays
+  `private`.
+
+- **Lazy, not eager, even for shallow trees.** Uniform laziness keeps the generator mechanical (no
+  "eager if shallow" heuristic) and means client construction cost is O(1) regardless of tree depth
+  or breadth. A program that touches three services out of forty allocates three service objects.
+
+- **Shared dependencies threaded down, not re-resolved.** `HttpClient`, `Serde`, the base
+  `CallContext`, and the `ErrorMapper` are constructed once at the root and passed by reference into
+  each lazily-built node. Nodes hold references; they do not re-resolve or re-wrap these. This keeps
+  the per-node constructor trivial and the whole graph backed by one transport and one serde.
+
+- **Context chain.** Each node carries the base `CallContext`
+  (`org.dexpace.sdk.core.http.context`), which is promoted per call into `DispatchContext` →
+  `RequestContext` → `ExchangeContext`. The service tree holds only the immutable base context; the
+  promotion chain runs at call time inside the operation, so lazy node construction never bakes in
+  per-call state.
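+
+The retry detail noted for `by lazy` above can be observed with a small standalone sketch.
+`NodeHolderSketch` is a hypothetical class written for this note, not generated output; it only
+exercises the standard-library `lazy` delegate in its default mode.
+
+```kotlin
+// Illustrative sketch only: shows stdlib `by lazy` behaviour, not generated output.
+class NodeHolderSketch(private val failFirst: Boolean) {
+    var initAttempts = 0
+        private set
+
+    // Default mode is LazyThreadSafetyMode.SYNCHRONIZED.
+    val node: String by lazy {
+        initAttempts++
+        // Simulate a dependency that is not resolvable on the first attempt.
+        if (failFirst && initAttempts == 1) error("transient init failure")
+        "node#$initAttempts"
+    }
+}
+
+fun main() {
+    val holder = NodeHolderSketch(failFirst = true)
+    val first = runCatching { holder.node }
+    println(first.isFailure)     // true: the initializer threw
+    println(holder.node)         // node#2: the failed attempt was not memoized
+    println(holder.node)         // node#2: the successful value is memoized
+    println(holder.initAttempts) // 2
+}
+```
+
+A Java memoized supplier that cached the exception instead would diverge from this behaviour,
+which is the mismatch the Java-target note asks to settle.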
+
+## Ties into the runtime
+
+- `org.dexpace.sdk.core.client.HttpClient` / `AsyncHttpClient` — the single shared transport threaded
+  through every node; thread-safe per contract, so the shared lazy graph is safe.
+- `org.dexpace.sdk.core.serde.Serde` — single shared serde reference passed down the tree.
+- `org.dexpace.sdk.core.http.context.CallContext` (→ `DispatchContext` → `RequestContext` →
+  `ExchangeContext`) — the base context each node holds; per-call promotion happens in the operation,
+  not at node construction.
+- `org.dexpace.sdk.core.http.response.ParsedResponse` — what the reused raw tier returns; the cooked
+  methods call `.use { it.value() }` on it, and that single-impl reuse halves the type count.
+
+## Acceptance mapping
+
+- *Lazy sub-service accessors* — `by lazy` backing fields behind `fun foo(): FooService` accessors,
+  at the root and every interior node.
+- *Raw impl reused* — one impl type per node (the raw tier); cooked methods delegate into it via
+  `rawTier.op(params).use { it.value() }`, so no parallel cooked tree is generated.
diff --git a/docs/codegen/typed-page-classes.md b/docs/codegen/typed-page-classes.md
new file mode 100644
index 00000000..8060e9c1
--- /dev/null
+++ b/docs/codegen/typed-page-classes.md
@@ -0,0 +1,197 @@
+# Typed page classes that rebuild typed params, not URL strings
+
+> Design spec. The Kotlin/Java in this document is **target generator output**, not code compiled
+> in this repository.
+
+## Problem
+
+For a list operation, cursor-based next-page navigation has two ways to build the next request:
+
+1. **Splice the URL** — take the previous request's URL and set/replace a query parameter
+   (`?cursor=…`). This is what the runtime's `RequestRebuilder` does today (it is `internal`,
+   `URLEncoder`-based, and the backbone of the bring-your-own strategy path).
+2. **Rebuild a typed params object** — take the operation's params, set the cursor field, and ask
+   the params object to produce the next request: `params.toBuilder().after(cursor).build()`.
+
+The runtime's strategy-based path (1) is the right *generic* mechanism, but it is the wrong thing
+to **generate** per operation. URL surgery in generated code is opaque (the cursor lives in a
+stringly-typed query param), bypasses the operation's own param validation and encoding, and
+cannot carry typed paging state (a structured cursor, a composite page token) cleanly. openai-java's
+generated `*Page` types take approach (2): `nextPage()` calls the same operation again with a
+rebuilt params object.
+
+This spec specifies approach (2) for **generated** pages while keeping approach (1) — the existing
+`PaginationStrategy` + `RequestRebuilder` — as the supported bring-your-own path.
+
+## Constraints from `sdk-core`
+
+The pagination runtime in `org.dexpace.sdk.core.pagination` is the foundation:
+
+- **`Paginator<T>`** — strategy-driven, **stateless**, page-lazy (exactly one HTTP exchange per
+  page yielded), with a `maxPages` safety cap. Exposes `iterateAll(): Iterable<T>` and
+  `streamAll(): Stream<T>`. It executes an `initialRequest` against an `HttpClient`, hands each
+  `Response` to a `PaginationStrategy`, **closes the response after `parse`**, and uses the returned
+  `Page<T>` to decide whether and how to fetch the next page.
+- **`PaginationStrategy<T>`** — `fun parse(response: Response, initialRequest: Request): Page<T>`.
+- **`Page<T>`** — `items: List<T>`, `hasNext: Boolean`, `nextPageRequest(): Request?`. The
+  `Paginator` calls `nextPageRequest()` *once* per page and retains the resulting `Request` so it
+  never holds onto the closed `Response`.
+- **`CursorPaginationStrategy<T>`** — the reference cursor strategy: a single `extractor:
+  (Response) -> CursorResult<T>` reads items + next cursor in one pass (single-read discipline,
+  since the body is single-use), then calls `RequestRebuilder.withQueryParam(initialRequest,
+  cursorQueryParam, nextCursor)`. This is the **bring-your-own URL-splice path** we keep.
+
+The key contract: `Paginator` only knows how to drive a `PaginationStrategy` that yields a `Page`
+exposing `nextPageRequest()`. Generated paging code must therefore still produce a `Request` in
+the end, but it does so by *rebuilding typed params and asking them for a request*, never by
+splicing a URL.
+
+## Proposed generated shape
+
+For each list operation the generator emits a typed `*Page` class (here `ModelPage`) and a small
+strategy adapter that lets the operation run under the existing `Paginator`. The typed page's
+`nextPage()` rebuilds the params via `params.toBuilder()`, sets the cursor on a **typed param
+field**, and re-invokes the operation; no URL string is touched in generated code.
+
+### Typed page (target output)
+
+```kotlin
+// GENERATED — illustrative target output, not compiled here.
+public class ModelPage internal constructor(
+    private val service: ModelService,
+    private val params: ModelListParams,      // the typed params that produced THIS page
+    private val response: ModelListResponse,  // typed, already-deserialized envelope
+) {
+    public fun items(): List<Model> = response.data
+
+    /** True if the response carried a non-blank next cursor in a typed field. */
+    public fun hasNextPage(): Boolean = !response.nextCursor.isNullOrBlank()
+
+    /**
+     * Rebuilds the TYPED params with the next cursor and calls the operation again.
+     * No URL surgery — the cursor rides a typed param field, and ModelListParams.toRequest()
+     * owns the encoding.
+     */
+    public fun nextPage(): ModelPage {
+        val nextCursor = response.nextCursor?.takeUnless { it.isBlank() }
+            ?: throw NoSuchElementException("No next page.")
+        val nextParams = params.toBuilder()
+            .after(nextCursor)   // typed cursor param, not ?cursor=… string splice
+            .build()
+        return service.list(nextParams)
+    }
+}
+```
+
+### Adapter onto the runtime `Paginator` (target output)
+
+A generated `PaginationStrategy` bridges the typed page back to the stateless `Paginator`. It
+deserializes the envelope once (single-read), then builds the next `Request` by rebuilding typed
+params — calling `params.toRequest()`, **not** `RequestRebuilder`:
+
+```kotlin
+// GENERATED — illustrative target output, not compiled here.
+internal class ModelListPaginationStrategy(
+    private val serde: Serde,
+    private val params: ModelListParams,
+) : PaginationStrategy<Model> {
+
+    override fun parse(response: Response, initialRequest: Request): Page<Model> {
+        // Single read: items + next cursor out of one body pass.
+        val envelope: ModelListResponse =
+            serde.deserializer.deserialize(
+                response.body!!.source().inputStream(),
+                ModelListResponse::class.java,
+            )
+        val nextCursor = envelope.nextCursor?.takeUnless { it.isBlank() }
+        val hasNext = nextCursor != null
+
+        val nextRequest: Request? =
+            if (nextCursor != null) {
+                // Rebuild TYPED params, then ask them for a request.
+                params.toBuilder().after(nextCursor).build().toRequest()
+            } else {
+                null
+            }
+
+        return object : Page<Model> {
+            override val items: List<Model> = envelope.data
+            override val hasNext: Boolean = hasNext
+            override fun nextPageRequest(): Request? = nextRequest
+        }
+    }
+}
+```
+
+The generated service exposes both surfaces over the same machinery:
+
+```kotlin
+// GENERATED — illustrative target output, not compiled here.
+public fun ModelService.list(params: ModelListParams): ModelPage = TODO("one exchange, typed page")
+
+public fun ModelService.listPaginated(params: ModelListParams): Paginator<Model> =
+    Paginator(client, params.toRequest(), ModelListPaginationStrategy(serde, params))
+
+// Caller — auto-pagination, page-lazy, capped:
+for (model in client.models().listPaginated(params).iterateAll()) { /* … */ }
+
+// Caller — manual page walk on the typed page:
+var page = client.models().list(params)
+while (page.hasNextPage()) { page = page.nextPage() }
+```
+
+## Design decisions and trade-offs
+
+- **Typed param rebuild, never URL surgery in generated code.** `nextPage()` and the generated
+  strategy both go through `params.toBuilder().<cursor>(…).build()` → `params.toRequest()`. The
+  cursor is a typed param field, so it inherits the operation's own validation and encoding, and a
+  structured/opaque cursor survives round-trips without manual percent-encoding. The string-splice
+  `RequestRebuilder` stays the bring-your-own path for callers who write their own
+  `PaginationStrategy`, exactly as `CursorPaginationStrategy` uses it today.
+
+- **Single-read discipline carries over.** Response bodies are single-use and `Paginator` closes
+  each response right after `parse`. The generated strategy deserializes the envelope **once** and
+  pulls both the items and the next cursor from that one typed object — the same single-read
+  reasoning behind `CursorPaginationStrategy` + `CursorResult`, but typed instead of via an
+  extractor lambda.
+
+- **Compute the next request inside `parse`, hold no `Response`.** `Paginator` retains the
+  `Request` from `nextPageRequest()` and never the (closed) `Response`. The generated `Page` builds
+  `nextRequest` eagerly inside `parse` so it honors that contract — no reference to the response or
+  its body escapes.
+
+- **Two surfaces, one mechanism.** The typed `*Page.nextPage()` is for callers who want explicit,
+  one-page-at-a-time control; `listPaginated(...).iterateAll()` is for auto-pagination. Both rebuild
+  typed params; they differ only in who drives the loop (the caller vs. the `Paginator`). The
+  manual `nextPage()` does not get the `maxPages` cap (the caller controls the loop), whereas the
+  `Paginator` path does — callers who want the safety cap should prefer `listPaginated`.
+
+- **Async paging.** When the async tier is generated, the typed page's `nextPageAsync()` returns
+  `CompletableFuture<ModelPage>` and the operation re-invocation goes through the async service. The
+  runtime async paginator (tracked under the async-pagination work) drives it; the typed-param
+  rebuild is identical.
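+
+Because the manual `nextPage()` walk carries no `maxPages` cap, a caller who wants one has to
+count pages itself. The sketch below shows one way to do that, assuming a hypothetical `ToyPage`
+that mimics the `hasNextPage()` / `nextPage()` shape of a generated page; `collectCapped` is an
+illustrative helper, not a proposed API.
+
+```kotlin
+// Illustrative sketch only; ToyPage is a hypothetical stand-in for a generated *Page type.
+class ToyPage(private val index: Int, private val lastIndex: Int) {
+    fun items(): List<String> = listOf("item-$index")
+
+    fun hasNextPage(): Boolean = index < lastIndex
+
+    fun nextPage(): ToyPage {
+        if (!hasNextPage()) throw NoSuchElementException("No next page.")
+        return ToyPage(index + 1, lastIndex)
+    }
+}
+
+// Walks pages manually but stops once maxPages pages have been consumed.
+fun collectCapped(first: ToyPage, maxPages: Int): List<String> {
+    require(maxPages >= 1) { "maxPages must be at least 1" }
+    val out = mutableListOf<String>()
+    var page = first
+    var fetched = 1
+    out += page.items()
+    while (page.hasNextPage() && fetched < maxPages) {
+        page = page.nextPage()
+        fetched++
+        out += page.items()
+    }
+    return out
+}
+
+fun main() {
+    val capped = collectCapped(ToyPage(0, 9), maxPages = 3)
+    check(capped == listOf("item-0", "item-1", "item-2"))
+}
+```
+
+Callers who do not need page-at-a-time control get the same bound from the `Paginator` path
+without writing this loop.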
+
+## Ties into the runtime
+
+- `org.dexpace.sdk.core.pagination.Paginator` — stateless, page-lazy driver with `maxPages` cap;
+  `iterateAll()` / `streamAll()`.
+- `org.dexpace.sdk.core.pagination.PaginationStrategy` / `Page` — the `parse` → `Page` →
+  `nextPageRequest()` contract the generated strategy implements.
+- `org.dexpace.sdk.core.pagination.CursorPaginationStrategy` + `RequestRebuilder` — the **retained
+  bring-your-own** URL-splice path; generated code does not use `RequestRebuilder`.
+- `org.dexpace.sdk.core.serde.Serde` — single-pass deserialization of the typed list envelope.
+- `org.dexpace.sdk.core.http.request.Request` (`newBuilder()`) — produced by `params.toRequest()`,
+  the typed rebuild's output.
+
+## Dependency note
+
+This spec assumes the **OperationParams SPI** — the typed, builder-backed params object with
+`toBuilder()` and `toRequest()`. That SPI is tracked separately and is a hard prerequisite: without
+a typed params object there is nothing to rebuild, and the design degenerates back to URL surgery.
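+
+To make the dependency concrete, here is a rough sketch of the shape this spec leans on: an
+immutable params object with `toBuilder()` and `toRequest()`. The real OperationParams SPI is
+tracked separately and may look different; `ToyListParams`, `ToyRequest`, and the `/models` path
+are hypothetical names used only for illustration.
+
+```kotlin
+// Illustrative sketch only; the real OperationParams SPI is tracked separately.
+
+// Hypothetical stand-in for the runtime Request type.
+data class ToyRequest(val path: String, val query: Map<String, String>)
+
+class ToyListParams private constructor(val limit: Int?, val after: String?) {
+
+    // Copies the current values into a fresh builder so one field can be changed.
+    fun toBuilder(): Builder = Builder(limit, after)
+
+    // The params object owns how typed fields become a request.
+    fun toRequest(): ToyRequest {
+        val query = linkedMapOf<String, String>()
+        limit?.let { query["limit"] = it.toString() }
+        after?.let { query["after"] = it }
+        return ToyRequest("/models", query)
+    }
+
+    class Builder internal constructor(
+        private var limit: Int? = null,
+        private var after: String? = null,
+    ) {
+        // Validation lives on the typed setter, so a rebuilt cursor is checked too.
+        fun limit(value: Int): Builder = apply { require(value > 0); limit = value }
+        fun after(cursor: String): Builder = apply { require(cursor.isNotBlank()); after = cursor }
+        fun build(): ToyListParams = ToyListParams(limit, after)
+    }
+
+    companion object {
+        fun builder(): Builder = Builder()
+    }
+}
+
+fun main() {
+    val first = ToyListParams.builder().limit(20).build()
+    val next = first.toBuilder().after("cur_123").build()
+    check(first.toRequest().query == mapOf("limit" to "20"))
+    check(next.toRequest().query == mapOf("limit" to "20", "after" to "cur_123"))
+}
+```
+
+The point of the sketch is that the rebuilt params keep every other field (`limit` here) and only
+the cursor changes, which is what a generated `nextPage()` depends on.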
+
+## Acceptance mapping
+
+- *Typed-param next-page generation* — `nextPage()` and the generated strategy both rebuild
+  `params.toBuilder().<cursor>(…).build()`.
+- *No URL string surgery in generated paging* — generated code never calls `RequestRebuilder`; it
+  only calls `params.toRequest()`. `RequestRebuilder` remains exclusively the bring-your-own path.