Disambiguate buffer_size from async scheduler work admission #700

@eric-tramel

Context

With the async scheduler, buffer_size no longer acts like the main control for parallelism. Scheduler work admission is now governed by task leases, and the public cap for that is RunConfig.max_in_flight_tasks.

That leaves buffer_size carrying a name that suggests runtime buffering/concurrency, while its remaining durable role is closer to row-group/checkpoint granularity.

Current mental model

  • max_in_flight_tasks: maximum async scheduler tasks that may hold task leases at once.
  • scheduler/resource admission: should own the decision of when to admit more executable work, eventually using resource-aware signals such as memory pressure.
  • buffer_size: currently determines row-group size, which affects how many rows are processed/checkpointed together and how often writes happen.
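The split in the list above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `run` function, its parameter names, and the use of `asyncio.Semaphore` as a stand-in for task leases are all hypothetical, and row groups are processed one after another here, which the real scheduler may not do. It shows that changing the row-group size changes the number of checkpoints, while peak concurrency stays bounded by the in-flight cap.

```python
import asyncio


async def run(num_rows: int, row_group_size: int, max_in_flight_tasks: int) -> dict:
    """Toy model: rows are grouped for checkpointing, tasks are capped by leases."""
    # Hypothetical stand-in for task leases: one semaphore slot per lease.
    leases = asyncio.Semaphore(max_in_flight_tasks)
    in_flight = 0
    peak_in_flight = 0
    checkpoints = 0

    async def process_row(row: int) -> int:
        nonlocal in_flight, peak_in_flight
        async with leases:  # a task only runs while it holds a lease
            in_flight += 1
            peak_in_flight = max(peak_in_flight, in_flight)
            await asyncio.sleep(0)  # yield so other tasks compete for leases
            in_flight -= 1
        return row

    for start in range(0, num_rows, row_group_size):
        group = range(start, min(start + row_group_size, num_rows))
        await asyncio.gather(*(process_row(r) for r in group))
        checkpoints += 1  # one checkpoint/write boundary per completed row group

    return {"peak_in_flight": peak_in_flight, "checkpoints": checkpoints}


small = asyncio.run(run(num_rows=100, row_group_size=10, max_in_flight_tasks=4))
large = asyncio.run(run(num_rows=100, row_group_size=50, max_in_flight_tasks=4))
print(small, large)
```

In this model the row-group size only moves the checkpoint count (10 versus 2), while the number of tasks holding a lease at once never exceeds `max_in_flight_tasks`.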

Proposal

Consider disambiguating buffer_size into a name like checkpoint_batch_size or row_group_size.

The goal is to make the configuration tell users what the knob actually controls:

  • how many rows are grouped before checkpoint/finalization/write boundaries
  • how much batch or full-column work each row group carries
  • processor batch boundaries
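As a rough illustration of what the renamed knob would control, the helper below computes row-group boundaries for a dataset. The function name `row_group_bounds` and its signature are hypothetical and not part of the existing API; the sketch assumes each boundary doubles as a checkpoint/write boundary, as described in the list above.

```python
def row_group_bounds(num_rows: int, row_group_size: int) -> list[tuple[int, int]]:
    """Return half-open (start, stop) row ranges, one per row group."""
    if row_group_size <= 0:
        raise ValueError("row_group_size must be positive")
    return [
        (start, min(start + row_group_size, num_rows))
        for start in range(0, num_rows, row_group_size)
    ]


bounds = row_group_bounds(num_rows=10, row_group_size=4)
print(bounds)       # one entry per checkpoint/write boundary
print(len(bounds))  # number of row groups, i.e. how often writes would happen
```

A smaller size means more, smaller groups and more frequent writes; nothing in this calculation says how many tasks may run at once.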

The scheduler should own decisions about when to pick up additional work based on resource admission and pressure signals, rather than users needing to tune buffer_size as an indirect concurrency or memory-control knob.

Why this matters

The old name creates a confusing overlap between three concepts:

  1. rows grouped for checkpointing/writes
  2. scheduler task work admitted in flight
  3. runtime resource pressure, such as memory or model-request capacity

As the scheduler becomes more resource-aware, we should separate these concepts explicitly so users do not treat row-group size as the primary cap on async work.

Open questions

  • Should the public replacement name be checkpoint_batch_size, row_group_size, or something else?
  • Should buffer_size remain as a deprecated alias for one release cycle?
  • What migration warning/doc wording should explain the difference from max_in_flight_tasks?
  • Are there places where buffer_size still genuinely means memory buffering rather than checkpoint/write granularity?
  • What telemetry/capacity-plan fields should be renamed or grouped to reflect this split?
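One possible shape for the deprecated-alias question, sketched with a stand-in class. `RunConfigSketch` is hypothetical and is not the real `RunConfig`; the name `row_group_size`, the default values, and the warning text are placeholders pending the answers above.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunConfigSketch:
    """Hypothetical stand-in for RunConfig, for illustration only."""

    row_group_size: Optional[int] = None
    buffer_size: Optional[int] = None  # deprecated alias for row_group_size
    max_in_flight_tasks: int = 8  # placeholder default, not the real one

    def __post_init__(self) -> None:
        if self.buffer_size is not None:
            if self.row_group_size not in (None, self.buffer_size):
                raise ValueError("Set row_group_size or buffer_size, not both.")
            warnings.warn(
                "buffer_size is deprecated; use row_group_size. It sets "
                "row-group/checkpoint granularity, not concurrency. Use "
                "max_in_flight_tasks to cap in-flight scheduler tasks.",
                DeprecationWarning,
                stacklevel=3,  # point at the caller that built the config
            )
            self.row_group_size = self.buffer_size
        if self.row_group_size is None:
            self.row_group_size = 1000  # placeholder default, not the real one


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cfg = RunConfigSketch(buffer_size=500)

print(cfg.row_group_size, [str(w.message) for w in caught])
```

Rejecting conflicting values outright, rather than silently preferring one name, keeps the migration unambiguous during the alias period.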

Possible acceptance criteria

  • Public config and docs distinguish row/checkpoint batching from scheduler task admission.
  • Existing buffer_size users have a clear migration path.
  • Capacity reporting presents row-group/checkpoint sizing separately from task lease capacity.
  • Scheduler resource admission remains the place for future memory/resource-aware work pickup decisions.
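The capacity-reporting criterion could look something like the following. The function `capacity_report` and every field name in it are hypothetical; the sketch only shows row-group sizing and task lease capacity reported as separate groups rather than under one "buffer" heading.

```python
def capacity_report(total_rows: int, row_group_size: int, max_in_flight_tasks: int) -> dict:
    """Build a report that keeps row-group sizing apart from task admission."""
    if row_group_size <= 0:
        raise ValueError("row_group_size must be positive")
    num_row_groups = -(-total_rows // row_group_size)  # ceiling division
    return {
        # Checkpoint/write granularity.
        "row_groups": {
            "row_group_size": row_group_size,
            "num_row_groups": num_row_groups,
        },
        # Scheduler task lease capacity.
        "task_admission": {
            "max_in_flight_tasks": max_in_flight_tasks,
        },
    }


report = capacity_report(total_rows=1000, row_group_size=300, max_in_flight_tasks=8)
print(report)
```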
