refactor(usage): extract buffering into a standalone Accumulator #15

Merged
loks0n merged 9 commits into main from feat/coroutine-safe-multitenancy-2 on Jun 23, 2026

loks0n commented Jun 23, 2026

What

Extracts the in-memory metric buffering out of Usage into a dedicated Accumulator class that consumers create themselves against an adapter.

  • Accumulator owns collect(), flush(), and the buffer. It exposes raw signals — count() (buffered entries) and elapsedSeconds() (since last flush) — and leaves the flush policy to the caller.
  • Usage is now a pure adapter facade: no buffer, no collect/flush/shouldFlush, no threshold config.

Why

  • Separates "what's buffered" from "how to query/store" — Usage no longer carries mutable buffer state, which is the wrong place for it in coroutine/multi-tenant setups where each context wants its own buffer.
  • Drops the shouldFlush() + setFlushThreshold/setFlushInterval policy machinery in favour of primitives the caller composes:
    $accumulator = new Accumulator($adapter);
    $accumulator->collect('bandwidth', 5000, Usage::TYPE_EVENT);
    if ($accumulator->count() >= 5000 || $accumulator->elapsedSeconds() >= 10) {
        $accumulator->flush();
    }

Tests

  • New AccumulatorTest — DB-free, uses a recording fake adapter to cover event summing, gauge last-write-wins, per-type batch separation, partial-failure buffer retention, and the signals/validation.
  • Buffer tests removed from UsageBase (the real-DB addBatch round-trip stays covered by testAddBatchEvent/testAddBatchGauge).
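
The buffer semantics these tests cover (event summing, gauge last-write-wins, per-type batch separation) can be sketched roughly as below. This is a minimal illustration, not the PR's Accumulator: the class name `SketchBuffer`, its constants, and the `drain()` method are hypothetical, and tenants and tags are left out for brevity.

```php
<?php

// Hypothetical stand-in for the buffer semantics described above;
// this is NOT the Accumulator class from the PR.
final class SketchBuffer
{
    public const TYPE_EVENT = 'event';
    public const TYPE_GAUGE = 'gauge';

    /** @var array<string, array{metric: string, value: int, type: string}> */
    private array $entries = [];

    public function collect(string $metric, int $value, string $type): void
    {
        // A NUL separator keeps the (type, metric) pair unambiguous here.
        $key = $type . "\0" . $metric;

        if ($type === self::TYPE_EVENT && isset($this->entries[$key])) {
            // Events accumulate: repeated collects are summed.
            $this->entries[$key]['value'] += $value;
            return;
        }

        // Gauges overwrite (last write wins); first-seen events are stored as-is.
        $this->entries[$key] = ['metric' => $metric, 'value' => $value, 'type' => $type];
    }

    public function count(): int
    {
        return count($this->entries);
    }

    /** @return array<string, list<array{metric: string, value: int}>> rows grouped per type */
    public function drain(): array
    {
        $batches = [];
        foreach ($this->entries as $entry) {
            $batches[$entry['type']][] = ['metric' => $entry['metric'], 'value' => $entry['value']];
        }
        $this->entries = [];

        return $batches;
    }
}

$buffer = new SketchBuffer();
$buffer->collect('bandwidth', 5000, SketchBuffer::TYPE_EVENT);
$buffer->collect('bandwidth', 2500, SketchBuffer::TYPE_EVENT);
$buffer->collect('storage', 10, SketchBuffer::TYPE_GAUGE);
$buffer->collect('storage', 12, SketchBuffer::TYPE_GAUGE);

$countBeforeDrain = $buffer->count();
$batches = $buffer->drain();
```

In this sketch the two `bandwidth` events collapse into one summed row, the second `storage` gauge replaces the first, and `drain()` hands back one batch per type.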

Net -17 LoC overall.

Note: vendor/ isn't installed in the worktree, so I couldn't run composer test/check locally. CI should validate.

🤖 Generated with Claude Code

Move collect()/flush() and the in-memory buffer out of Usage into a
dedicated Accumulator that consumers instantiate against an adapter.

Replace the shouldFlush()/threshold policy with raw signals — count()
and elapsedSeconds() — so callers compose their own flush decision.
Usage is now a pure adapter facade.

Buffer behaviour is covered by a new DB-free AccumulatorTest using a
recording fake adapter; the buffer tests are removed from UsageBase.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

greptile-apps Bot commented Jun 23, 2026

Greptile Summary

This PR moves metric buffering out of Usage and makes tenant scope explicit. The main changes are:

  • Added a standalone Accumulator for collect(), flush(), buffer count, and elapsed-time signals.
  • Simplified Usage into an adapter facade with tenant arguments on read and mutation methods.
  • Updated ClickHouse and Database adapters to take tenant scope per call or per batch row.
  • Added a Tenant decorator for callers that want a tenant-bound view.
  • Updated README examples, tests, and benchmarks for the new API.

Confidence Score: 5/5

The refactor is narrowly scoped and covered by updated unit and adapter tests.

No code issues were identified in the reviewed changes, and the API split is consistently reflected across implementation, tests, benchmarks, and documentation.

T-Rex Logs

What T-Rex did

  • Compared base and head states to confirm that the head now includes an Accumulator and that Usage no longer owns collect and flush.
  • Validated that the new Accumulator reports countAfterCollect as 3 unique entries and that separate event and gauge batches are emitted.
  • Verified tenant-based partitioning of event rows into tenantA and tenantB, the last-write-wins gauge value of 12, and the exposure of elapsedSeconds.
  • Confirmed the system retains a failed event batch with countAfterFailedFlush equal to 1 and that a retry completes successfully with exitCode 0.
  • Compared the base and head code for policy methods and observed that mutable methods like collect and flush are removed, with read signatures now starting with tenant.
  • Verified that Usage::addBatch accepts mixed tenant rows and that Tenant::addBatch overwrites or stamps rows with tenant-bound data while forwarding getTotal, find, count, purge, sum, findDaily, sumDaily, and sumDailyBatch under the tenant bound context.

Ran code and verified through T-Rex

Reviews (9): Last reviewed commit: "fix(clickhouse): reject empty tenant on ..."

Comment thread src/Usage/Accumulator.php Outdated
Comment thread src/Usage/Accumulator.php Outdated
Comment thread src/Usage/Usage.php Outdated
Thread tenant through the API instead of carrying it as mutable adapter
state, so a single Usage/adapter is safe across tenants and coroutines.

- All read/query methods take `string $tenant` as the first parameter;
  addBatch carries `tenant` per metric row (a batch may span tenants).
- Remove setTenant from Usage and both adapters. ClickHouse injects the
  tenant as a synthetic query filter (shared-tables only) via
  scopeToTenant(); Database scopes the underlying db per call.
- Accumulator.collect() takes tenant first and keys the buffer by tenant.
- Add Tenant decorator: binds a tenant once and forwards to Usage with it
  pre-filled (stamping addBatch rows) for single-tenant callers.

Tests: inline adapter construction (drop makeAdapter helper), pass tenant
explicitly, add a dedicated TenantTest, update README.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
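
The decorator idea in this commit (bind a tenant once, forward to a tenant-first API with it pre-filled) might look roughly like the sketch below. All names here (`UsageLike`, `InMemoryUsage`, `TenantView`) are hypothetical and the signatures are simplified; the real Usage and Tenant classes may differ. It assumes PHP 8.0+ for constructor promotion.

```php
<?php

// Hypothetical, simplified tenant-first facade; real Usage signatures may differ.
interface UsageLike
{
    public function getTotal(string $tenant, string $metric): int;

    /** @param list<array<string, mixed>> $metrics */
    public function addBatch(array $metrics, string $type): bool;
}

// In-memory fake so the sketch runs without a database.
final class InMemoryUsage implements UsageLike
{
    /** @var list<array<string, mixed>> */
    public array $rows = [];

    public function getTotal(string $tenant, string $metric): int
    {
        $total = 0;
        foreach ($this->rows as $row) {
            if ($row['tenant'] === $tenant && $row['metric'] === $metric) {
                $total += $row['value'];
            }
        }

        return $total;
    }

    public function addBatch(array $metrics, string $type): bool
    {
        foreach ($metrics as $metric) {
            $this->rows[] = $metric;
        }

        return true;
    }
}

// Decorator: holds the tenant and pre-fills it on every forwarded call.
final class TenantView
{
    public function __construct(private UsageLike $usage, private string $tenant)
    {
    }

    public function getTotal(string $metric): int
    {
        return $this->usage->getTotal($this->tenant, $metric);
    }

    /** @param list<array<string, mixed>> $metrics */
    public function addBatch(array $metrics, string $type): bool
    {
        // Stamp every row with the bound tenant before forwarding.
        foreach ($metrics as &$metric) {
            $metric['tenant'] = $this->tenant;
        }
        unset($metric);

        return $this->usage->addBatch($metrics, $type);
    }
}

$usage = new InMemoryUsage();
$tenantA = new TenantView($usage, 'tenantA');
$tenantB = new TenantView($usage, 'tenantB');

$tenantA->addBatch([['metric' => 'requests', 'value' => 3]], 'event');
$tenantB->addBatch([['metric' => 'requests', 'value' => 5]], 'event');

$totalA = $tenantA->getTotal('requests');
$totalB = $usage->getTotal('tenantB', 'requests');
```

Because the shared `InMemoryUsage` holds no tenant state of its own, both views can use the same instance without interfering with each other.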
Comment thread src/Usage/Adapter/Database.php
- ClickHouse: extract sumDailyScoped() so routedSum() stops calling the
  now-tenant-first sumDaily() with stale args (caught by phpstan).
- ClickHouse purge: keep the tenant out of the daily-forwarding decision
  and apply it only as a scope on the resulting delete, so a purge by a
  daily-incompatible/raw-only filter no longer wipes the tenant's whole
  rollup.
- Database: tenant key is required per the addBatch shape, drop dead guard.
- Tests: stamp tenant on remaining multiline addBatch rows, tighten
  RecordingAdapter types, pint formatting.

Full suite green (236 tests) against mariadb + clickhouse; phpstan + pint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/Usage/Accumulator.php Outdated
Comment thread src/Usage/Adapter/Database.php
Make tenant scoping unavoidable instead of a step callers must remember to
compose. parseQueries() — the single chokepoint every read/delete WHERE is
built from — now takes the tenant as a required first argument and prepends
the tenant filter itself (shared-tables mode). There is no longer any way to
produce a WHERE clause that isn't tenant-scoped, and PHP/phpstan flag any
caller that omits it.

scopeToTenant() is removed; the tenant is threaded down to parseQueries
through the private read helpers. purge keeps reasoning about the caller's
own (tenant-free) filters for the daily-rollup forwarding decision.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
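
A rough sketch of the "single chokepoint" idea: if the only function that builds a WHERE clause requires the tenant as its first argument and prepends the filter itself, callers cannot produce an unscoped clause. `buildWhere()` and its equality-only filter format are hypothetical and far simpler than the adapter's actual parseQueries().

```php
<?php

/**
 * Hypothetical WHERE builder; not the adapter's real parseQueries().
 * The tenant filter is always emitted first, so no unscoped clause can exist.
 *
 * @param array<string, scalar> $filters column => value equality filters
 * @return array{0: string, 1: array<string, scalar>} SQL fragment and bound params
 */
function buildWhere(string $tenant, array $filters): array
{
    $clauses = ['tenant = :tenant'];
    $params = ['tenant' => $tenant];

    foreach ($filters as $column => $value) {
        $column = (string) $column;

        // Only plain identifiers are accepted as column names.
        if (preg_match('/^[a-zA-Z_][a-zA-Z0-9_]*$/', $column) !== 1) {
            throw new InvalidArgumentException("Invalid column: {$column}");
        }
        // Callers may not supply their own tenant filter.
        if ($column === 'tenant') {
            throw new InvalidArgumentException('The tenant filter is injected automatically.');
        }

        $clauses[] = "{$column} = :{$column}";
        $params[$column] = $value;
    }

    return ['WHERE ' . implode(' AND ', $clauses), $params];
}

[$where, $params] = buildWhere('tenantA', ['metric' => 'bandwidth']);
```

Omitting the tenant is then an argument-count error at the call site rather than a silently unscoped query.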
Comment thread src/Usage/Adapter/ClickHouse.php Outdated
Comment thread src/Usage/Accumulator.php Outdated
Comment thread src/Usage/Accumulator.php Outdated
Comment thread src/Usage/Adapter/Database.php
loks0n and others added 2 commits June 23, 2026 20:07
Flush buffered metrics through the Usage facade instead of holding the
adapter directly — the accumulator is a higher-level buffer over Usage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… tenant guards

- Accumulator: hash the full (tenant, metric, type, sorted-tags) identity so
  ambiguous splits like tenant "a"/metric "b:c" vs "a:b"/"c" can't collide,
  and tag insertion order no longer splits entries. Reject empty tenant.
- Accumulator: only restart the flush timer when the buffer fully drains, so a
  partial-failure retry stays overdue instead of waiting a fresh interval.
- ClickHouse: reject a shared-tables batch row without a tenant — it would be
  invisible to every tenant-scoped read.
- Database: document that the adapter owns the injected db's tenant (no
  set/restore dance; hand it a db dedicated to usage).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
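
The key-collision problem and the fix described in this commit can be illustrated with a small sketch. `bufferKey()` is a hypothetical helper, and the use of SHA-256 over a JSON-encoded tuple is an assumption; the PR's actual hashing may differ.

```php
<?php

/**
 * Hypothetical buffer key: hash the whole identity tuple as structured data,
 * with tags sorted by key so insertion order does not matter.
 *
 * @param array<string, scalar> $tags
 */
function bufferKey(string $tenant, string $metric, string $type, array $tags): string
{
    ksort($tags);

    return hash('sha256', json_encode([$tenant, $metric, $type, $tags], JSON_THROW_ON_ERROR));
}

// A raw ':'-join cannot tell these two identities apart: both become "a:b:c".
$naiveA = implode(':', ['a', 'b:c']);
$naiveB = implode(':', ['a:b', 'c']);

// The structured key keeps them distinct.
$keyA = bufferKey('a', 'b:c', 'event', []);
$keyB = bufferKey('a:b', 'c', 'event', []);

// Tag insertion order does not change the key.
$keyOrder1 = bufferKey('t', 'm', 'event', ['region' => 'eu', 'plan' => 'pro']);
$keyOrder2 = bufferKey('t', 'm', 'event', ['plan' => 'pro', 'region' => 'eu']);
```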
Comment thread src/Usage/Accumulator.php Outdated
empty() treats the string "0" as empty, so a tenant or metric literally
named "0" was wrongly rejected. Compare against '' instead.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
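
The difference between the two checks is easy to reproduce in isolation. The helper names below are hypothetical and exist only to contrast `empty()` with a strict comparison against `''`.

```php
<?php

// Loose check: empty() treats "0" as empty, so a tenant named "0" is rejected.
function isMissingLoose(string $value): bool
{
    return empty($value);
}

// Strict check: only the empty string counts as missing.
function isMissingStrict(string $value): bool
{
    return $value === '';
}

$looseZero = isMissingLoose('0');    // true: "0" is wrongly treated as missing
$strictZero = isMissingStrict('0');  // false: "0" is a valid name
$strictEmpty = isMissingStrict('');  // true
```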
Comment thread src/Usage/Tenant.php
Comment thread src/Usage/Tenant.php
Comment on lines +37 to +45
    public function addBatch(array $metrics, string $type, int $batchSize = 1000): bool
    {
        foreach ($metrics as &$metric) {
            $metric['tenant'] = $this->tenant;
        }
        unset($metric);

        return $this->usage->addBatch($metrics, $type, $batchSize);
    }

P2 Preserve tenant mismatches

This wrapper overwrites any existing tenant on each metric row. If a caller accidentally passes a mixed-tenant batch into a bound tenant view, the rows are all written under the bound tenant with no signal, which can misattribute usage. Rejecting rows that already carry a conflicting tenant would keep this decorator from hiding a bad producer.
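
One way the suggested rejection could look, as a sketch only: `stampTenant()` is a hypothetical free function, not part of the PR, and whether the decorator should throw or merely overwrite is a design decision left to the maintainers.

```php
<?php

/**
 * Hypothetical helper: stamp rows with the bound tenant, but refuse rows that
 * already carry a different tenant instead of silently overwriting them.
 *
 * @param list<array<string, mixed>> $metrics
 * @return list<array<string, mixed>>
 */
function stampTenant(array $metrics, string $tenant): array
{
    foreach ($metrics as $index => $metric) {
        if (isset($metric['tenant']) && $metric['tenant'] !== $tenant) {
            throw new InvalidArgumentException(
                "Row {$index} carries tenant '{$metric['tenant']}' but the view is bound to '{$tenant}'."
            );
        }
        $metrics[$index]['tenant'] = $tenant;
    }

    return $metrics;
}

// Rows without a tenant, or with the matching tenant, are stamped as before.
$stamped = stampTenant(
    [['metric' => 'requests', 'value' => 1], ['metric' => 'requests', 'value' => 2, 'tenant' => 'tenantA']],
    'tenantA'
);

// A row from another tenant is rejected rather than misattributed.
$rejected = false;
try {
    stampTenant([['metric' => 'requests', 'value' => 1, 'tenant' => 'tenantB']], 'tenantA');
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```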

new Tenant($usage, '') would silently scope every read/write to the empty
tenant in shared-tables mode. Reject '' at construction, matching the
accumulator/write-side rule.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/Usage/Usage.php
Comment on lines +94 to +96
     public function getTimeSeries(string $tenant, array $metrics, string $interval, string $startDate, string $endDate, array $queries = [], bool $zeroFill = true, ?string $type = null): array
     {
    -    return $this->adapter->getTimeSeries($metrics, $interval, $startDate, $endDate, $queries, $zeroFill, $type);
    +    return $this->adapter->getTimeSeries($tenant, $metrics, $interval, $startDate, $endDate, $queries, $zeroFill, $type);

P2 Validate tenant scope

The direct Usage methods now take a tenant, but this path forwards an empty string unchanged. In shared-table ClickHouse mode that becomes a tenant = '' filter, so a caller with a missing tenant can silently read an empty scope instead of failing like Accumulator and Tenant do. The same guard should be applied before forwarding tenant-scoped reads and mutations.

Comment thread src/Usage/Accumulator.php
Comment on lines +72 to +74
// tuples never collide on the key — a raw `:`-join would let
// e.g. tenant "a"/metric "b:c" and tenant "a:b"/metric "c" share one
// entry. Tags are sorted first so key order doesn't matter.

P2 Canonicalize nested tags

ksort() only normalizes the top-level tag keys before hashing. If a tag value is itself an array, the same logical tags with nested keys in a different insertion order still produce different hashes, so event values split into multiple buffered rows and gauges can keep more than one pending value for the same metric identity. Recursively sorting array tag values before json_encode() would keep the buffer identity stable.
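
A minimal sketch of the recursive sort the comment suggests. `canonicalizeTags()` is a hypothetical helper; the PR does not show whether or how nested tags were eventually handled.

```php
<?php

/**
 * Hypothetical helper: sort tag keys at every nesting level so that logically
 * equal tag sets encode to the same JSON regardless of insertion order.
 *
 * @param array<int|string, mixed> $tags
 * @return array<int|string, mixed>
 */
function canonicalizeTags(array $tags): array
{
    foreach ($tags as $key => $value) {
        if (is_array($value)) {
            $tags[$key] = canonicalizeTags($value);
        }
    }
    ksort($tags);

    return $tags;
}

$first = ['geo' => ['region' => 'eu', 'zone' => 'a']];
$second = ['geo' => ['zone' => 'a', 'region' => 'eu']];

// Top-level ksort() alone leaves the nested order untouched, so the encodings differ.
$shallowFirst = $first;
$shallowSecond = $second;
ksort($shallowFirst);
ksort($shallowSecond);
$shallowEqual = json_encode($shallowFirst) === json_encode($shallowSecond);

// Recursive canonicalization makes the encodings identical.
$deepEqual = json_encode(canonicalizeTags($first)) === json_encode(canonicalizeTags($second));
```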

An empty tenant compiled to `tenant = ''` and silently read an empty scope.
parseQueries() is the single chokepoint every read funnels through, so guard
there — matching the write-side and the Accumulator/Tenant rules.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@loks0n loks0n merged commit 39dc9f1 into main Jun 23, 2026
4 checks passed