Commit b9672e2

Use fixed freebuff grace period
Remove the freebuff session grace environment variable and use the fixed server-side grace window from free-session config. Also refresh the default deps comment to describe the current config getter and test injection path.
1 parent f85cf87 commit b9672e2

4 files changed

Lines changed: 7 additions & 14 deletions

docs/freebuff-waiting-room.md

Lines changed: 3 additions & 4 deletions
@@ -18,9 +18,8 @@ The entire system is gated by the env flag `FREEBUFF_WAITING_ROOM_ENABLED`. When
 # Disable entirely (both the gate on chat/completions and the admission loop)
 FREEBUFF_WAITING_ROOM_ENABLED=false

-# Other knobs (only read when enabled)
+# Other knob (only read when enabled)
 FREEBUFF_SESSION_LENGTH_MS=3600000 # 1 hour
-FREEBUFF_SESSION_GRACE_MS=1800000 # 30 min — drain window after expiry
 ```

 Flipping the flag is safe at runtime: existing rows stay in the DB and will be admitted / expired correctly whenever the flag is flipped back on.
@@ -161,7 +160,7 @@ The final tick result carries a `queueDepthByModel` map and a single `skipped` r
 | `FIREWORKS_DEPLOYMENT_MAP` | `web/src/llm-api/fireworks-config.ts` | `glm-5.1` | Models with dedicated Fireworks deployments. Models not listed are treated as `healthy` (serverless fallback) — drop this default when they migrate to their own deployments. |
 | `HEALTH_CACHE_TTL_MS` | `fireworks-health.ts` | 25000 | Fleet probe cache TTL. Sits just under the Fireworks 30s exporter cadence and 6 req/min rate limit. |
 | `FREEBUFF_SESSION_LENGTH_MS` | env | 3_600_000 | Session lifetime |
-| `FREEBUFF_SESSION_GRACE_MS` | env | 1_800_000 | Drain window after expiry — gate still admits requests so an in-flight agent can finish, but the CLI is expected to block new prompts. Hard cutoff at `expires_at + grace`. |
+| `SESSION_GRACE_MS` | `web/src/server/free-session/config.ts` | 1_800_000 | Drain window after expiry — gate still admits requests so an in-flight agent can finish, but the CLI is expected to block new prompts. Hard cutoff at `expires_at + grace`. |

 ## HTTP API

@@ -275,7 +274,7 @@ When the waiting room is disabled, the gate returns `{ ok: true, reason: 'disabl

 ## Drain / Grace Window

-We don't want to kill an agent mid-run just because the user's session ticked over. After `expires_at`, the row enters a "draining" state for `FREEBUFF_SESSION_GRACE_MS` (default 30 min). During the drain window:
+We don't want to kill an agent mid-run just because the user's session ticked over. After `expires_at`, the row enters a "draining" state for `SESSION_GRACE_MS` (30 min). During the drain window:

 - `checkSessionAdmissible` returns `{ ok: true, reason: 'draining', gracePeriodRemainingMs }` — chat completions still go through.
 - `getSessionState` / `requestSession` return `{ status: 'ended', instanceId, ... }` on the wire. The CLI hides the input and shows the Enter-to-rejoin banner while still forwarding the instance id so in-flight agent work can keep streaming.
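
For reviewers, the drain window described in this doc hunk can be sketched roughly as follows. This is a minimal illustration, not the actual `checkSessionAdmissible` implementation: `classifySession`, the `GateResult` type, the `'active'` reason, and the exact boundary handling at `expires_at` are assumptions. Only `SESSION_GRACE_MS`, the `draining` result shape, and the `session_expired` reason come from the diff.

```typescript
// Fixed server-side grace window, as introduced in config.ts by this commit.
const SESSION_GRACE_MS = 30 * 60 * 1000

// Hypothetical result type; the real gate may carry more reasons and fields.
type GateResult =
  | { ok: true; reason: 'active' } // assumed label for the pre-expiry case
  | { ok: true; reason: 'draining'; gracePeriodRemainingMs: number }
  | { ok: false; reason: 'session_expired' }

// Hypothetical helper: classify a session by comparing "now" against
// expires_at and the hard cutoff at expires_at + grace.
function classifySession(expiresAtMs: number, nowMs: number): GateResult {
  if (nowMs < expiresAtMs) {
    return { ok: true, reason: 'active' }
  }
  const hardCutoffMs = expiresAtMs + SESSION_GRACE_MS
  if (nowMs < hardCutoffMs) {
    // Still admitted so an in-flight agent can finish its run.
    return {
      ok: true,
      reason: 'draining',
      gracePeriodRemainingMs: hardCutoffMs - nowMs,
    }
  }
  return { ok: false, reason: 'session_expired' }
}
```

Because the grace window is now a constant rather than an env value, the hard cutoff depends only on `expires_at`, which is the behavior change this commit is after.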

packages/internal/src/env-schema.ts

Lines changed: 0 additions & 6 deletions
@@ -64,11 +64,6 @@ export const serverEnvSchema = clientEnvSchema.extend({
     .int()
     .positive()
     .default(60 * 60 * 1000),
-  FREEBUFF_SESSION_GRACE_MS: z.coerce
-    .number()
-    .int()
-    .nonnegative()
-    .default(30 * 60 * 1000),
 })
 export const serverEnvVars = serverEnvSchema.keyof().options
 export type ServerEnvVar = (typeof serverEnvVars)[number]
@@ -127,5 +122,4 @@ export const serverProcessEnv: ServerInput = {
   // Freebuff waiting room
   FREEBUFF_WAITING_ROOM_ENABLED: process.env.FREEBUFF_WAITING_ROOM_ENABLED,
   FREEBUFF_SESSION_LENGTH_MS: process.env.FREEBUFF_SESSION_LENGTH_MS,
-  FREEBUFF_SESSION_GRACE_MS: process.env.FREEBUFF_SESSION_GRACE_MS,
 }

web/src/server/free-session/config.ts

Lines changed: 2 additions & 1 deletion
@@ -17,6 +17,7 @@ export const FREEBUFF_ADMISSION_LOCK_ID = 573924815
  * drip rate: staggering admissions keeps newly-admitted CLIs from all hitting
  * Fireworks simultaneously even when a large block of sessions expires at once. */
 export const ADMISSION_TICK_MS = 15_000
+export const SESSION_GRACE_MS = 30 * 60 * 1000

 export function isWaitingRoomEnabled(): boolean {
   return env.FREEBUFF_WAITING_ROOM_ENABLED
@@ -43,7 +44,7 @@ export function getSessionLengthMs(): number {
  * expected to stop accepting new user prompts. Hard cutoff at
  * `expires_at + grace`; past that the gate returns `session_expired`. */
 export function getSessionGraceMs(): number {
-  return env.FREEBUFF_SESSION_GRACE_MS
+  return SESSION_GRACE_MS
 }

 /**

web/src/server/free-session/public-api.ts

Lines changed: 2 additions & 3 deletions
@@ -185,9 +185,8 @@ const defaultDeps: SessionDeps = {
   getInstantAdmitCapacity,
   isWaitingRoomEnabled,
   get graceMs() {
-    // Read-through getter so test overrides via env still work; the value
-    // itself is materialized once per call. Cheaper than a thunk because
-    // callers don't have to invoke a function.
+    // Read-through getter keeps the default deps aligned with config while
+    // tests can still inject a plain graceMs value through SessionDeps.
     return getSessionGraceMs()
   },
   get sessionLengthMs() {
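
The injection path mentioned in the refreshed comment might look roughly like the sketch below. It assumes a heavily trimmed `SessionDeps` with only `graceMs`; the real interface has more members. `hardCutoffMs` and `testDeps` are hypothetical names used only for illustration.

```typescript
const SESSION_GRACE_MS = 30 * 60 * 1000

function getSessionGraceMs(): number {
  return SESSION_GRACE_MS
}

// Assumed, simplified subset of the real SessionDeps interface.
interface SessionDeps {
  graceMs: number
}

// Default deps read through to config on every property access.
const defaultDeps: SessionDeps = {
  get graceMs() {
    return getSessionGraceMs()
  },
}

// Hypothetical consumer: reads graceMs as a plain property, so it does not
// care whether the value comes from a getter or an injected number.
function hardCutoffMs(
  expiresAtMs: number,
  deps: SessionDeps = defaultDeps,
): number {
  return expiresAtMs + deps.graceMs
}

// In a test, a plain value replaces the getter; no env override is needed.
const testDeps: SessionDeps = { ...defaultDeps, graceMs: 5_000 }
```

With the env variable gone, tests that need a shorter grace window would pass something like `testDeps` explicitly rather than mutating process env.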
