Disambiguate buffer_size from async scheduler work admission #700

## Context

With the async scheduler, `buffer_size` is no longer the main control for parallelism. Scheduler work admission is now governed by task leases, and the public cap on those leases is `RunConfig.max_in_flight_tasks`.

That leaves `buffer_size` carrying a name that suggests runtime buffering/concurrency, while its remaining durable role is closer to row-group/checkpoint granularity.

## Current mental model

- `max_in_flight_tasks`: maximum async scheduler tasks that may hold task leases at once.
- scheduler/resource admission: should own the decision of when to admit more executable work, including future resource-aware signals such as memory pressure.
- `buffer_size`: currently determines row-group size, which affects how many rows are processed/checkpointed together and how often writes happen.
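
The split between the two knobs can be sketched as follows. This is a minimal illustration, not the project's scheduler: `row_groups`, `run`, and the semaphore standing in for task leases are all hypothetical names, and the only identifiers taken from this issue are `buffer_size` and `max_in_flight_tasks`.

```python
import asyncio
from typing import Iterator, Sequence


def row_groups(rows: Sequence[int], buffer_size: int) -> Iterator[Sequence[int]]:
    """Split rows into consecutive row groups of at most buffer_size rows."""
    for start in range(0, len(rows), buffer_size):
        yield rows[start:start + buffer_size]


async def run(
    rows: Sequence[int], buffer_size: int, max_in_flight_tasks: int
) -> tuple[int, int]:
    """Return (row groups written, peak number of concurrently running tasks)."""
    # Assumption: a semaphore is used here as a stand-in for task leases.
    leases = asyncio.Semaphore(max_in_flight_tasks)
    in_flight = 0
    peak = 0

    async def process(row: int) -> int:
        nonlocal in_flight, peak
        async with leases:  # a task only runs while it holds a lease
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0)  # placeholder for real work
            in_flight -= 1
            return row

    groups_written = 0
    for group in row_groups(rows, buffer_size):
        await asyncio.gather(*(process(r) for r in group))
        groups_written += 1  # checkpoint/write boundary for this row group
    return groups_written, peak
```

In this sketch `buffer_size` decides how many checkpoint boundaries occur, while the lease cap decides how many tasks run at once. A row group smaller than the lease cap would still limit concurrency indirectly, which is the overlap this issue wants to make explicit.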

## Proposal

Consider disambiguating `buffer_size` into a name like `checkpoint_batch_size` or `row_group_size`.

The goal is to make the configuration tell users what the knob actually controls:

- how many rows are grouped before checkpoint/finalization/write boundaries
- how much batch or full-column work each row group contains
- processor batch boundaries

The scheduler should own decisions about when to pick up additional work based on resource admission and pressure signals, rather than users needing to tune `buffer_size` as an indirect concurrency or memory-control knob.

## Why this matters

The old name creates a confusing overlap between three concepts:

1. rows grouped for checkpointing/writes
2. scheduler task work admitted in flight
3. runtime resource pressure such as memory/model request capacity

As the scheduler becomes more resource-aware, we should separate these concepts explicitly so users do not treat row-group size as the primary cap on async work.

## Open questions

- Should the public replacement name be `checkpoint_batch_size`, `row_group_size`, or something else?
- Should `buffer_size` remain as a deprecated alias for one release cycle?
- What migration warning/doc wording should explain the difference from `max_in_flight_tasks`?
- Are there places where `buffer_size` still genuinely means memory buffering rather than checkpoint/write granularity?
- What telemetry/capacity-plan fields should be renamed or grouped to reflect this split?
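
To make the alias and migration-warning questions concrete, here is one possible shape. It is a sketch under assumptions: `RunConfigSketch` is a hypothetical stand-in rather than the real `RunConfig`, `row_group_size` is only one of the candidate names, and the defaults and warning text are illustrative.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunConfigSketch:
    """Hypothetical stand-in for RunConfig; not the project's real class."""

    max_in_flight_tasks: int = 8          # task-lease cap (illustrative default)
    row_group_size: Optional[int] = None  # candidate replacement name
    buffer_size: Optional[int] = None     # deprecated alias for row_group_size

    def __post_init__(self) -> None:
        if self.buffer_size is not None:
            if (
                self.row_group_size is not None
                and self.row_group_size != self.buffer_size
            ):
                raise ValueError("Pass only one of row_group_size or buffer_size.")
            warnings.warn(
                "buffer_size is deprecated; use row_group_size. It sets rows per "
                "row group/checkpoint, not concurrency (see max_in_flight_tasks).",
                DeprecationWarning,
                stacklevel=3,  # point at the caller constructing the config
            )
            self.row_group_size = self.buffer_size
        if self.row_group_size is None:
            self.row_group_size = 1000  # illustrative default, not a real value
```

Constructing `RunConfigSketch(buffer_size=256)` would warn once and behave like `row_group_size=256`, while passing conflicting values for both names fails loudly instead of silently picking one.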

## Possible acceptance criteria

- Public config and docs distinguish row/checkpoint batching from scheduler task admission.
- Existing `buffer_size` users have a clear migration path.
- Capacity reporting presents row-group/checkpoint sizing separately from task lease capacity.
- Scheduler resource admission remains the place for future memory/resource-aware work pickup decisions.
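
The capacity-reporting criterion could look roughly like the grouping below. Every class and field name here is hypothetical and chosen only for illustration. The point is that row-group sizing and task-lease capacity sit in separate sections instead of sharing one "buffer" figure.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RowGroupSizing:
    """Checkpoint/write granularity (hypothetical report section)."""

    rows_per_row_group: int
    total_rows: int

    @property
    def row_group_count(self) -> int:
        # Ceiling division: a partial final row group still counts as one.
        return -(-self.total_rows // self.rows_per_row_group)


@dataclass(frozen=True)
class TaskLeaseCapacity:
    """Scheduler admission capacity (hypothetical report section)."""

    max_in_flight_tasks: int
    leases_held: int


@dataclass(frozen=True)
class CapacityReport:
    row_groups: RowGroupSizing
    task_leases: TaskLeaseCapacity


report = CapacityReport(
    row_groups=RowGroupSizing(rows_per_row_group=4, total_rows=10),
    task_leases=TaskLeaseCapacity(max_in_flight_tasks=2, leases_held=1),
)
report_dict = asdict(report)  # nested dict with one key per section
```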
