Skip to content

Add LiquidOracle and ContiguousOracle#48

Open
rbx wants to merge 1 commit intomasterfrom
oracle
Open

Add LiquidOracle and ContiguousOracle#48
rbx wants to merge 1 commit intomasterfrom
oracle

Conversation

@rbx
Copy link
Copy Markdown
Member

@rbx rbx commented Apr 29, 2026

No description provided.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Walkthrough

Adds optional oracle-based evaluation: two offline solvers (LiquidOracle, ContiguousOracle) implemented and wired into the environment, metrics tracker, CLI, and train-iteration plumbing to record and report per-episode and aggregated oracle costs and related metrics.

Changes

Oracle benchmarks

Layer / File(s) Summary
Data Shape / Types
src/oracle.py
Adds CarriedJob alias and the LiquidOracle / ContiguousOracle classes with episode-scoped internal state and public counters (unscheduled_count, spillover_count).
Core Implementation
src/oracle.py
Implements reset(carried_jobs?), record(price, jobs), and solve() for both oracles; Liquid aggregates core-hours and fills cheapest hours greedily; Contiguous implements least-slack first greedy placement with sliding-window capacity checks and prefix-sum cost evaluation.
Environment Wiring
src/environment.py
Adds enable_oracle ctor flag; instantiates self.oracle/self.contiguous_oracle when enabled; resets oracles with carried jobs during reset(); calls record(...) after generate_jobs() in step(); calls solve() at episode end and stores oracle results into metrics fields.
Metrics Integration
src/metrics_tracker.py
Adds timeline counters and per-episode fields for oracle costs and contiguous oracle counts; computes savings_vs_oracle and includes oracle fields in record_episode_completion output; increments cumulative counters.
CLI / Orchestration
train.py, train_iter.py
Adds --oracle CLI flag; threads args.oracle into ComputeClusterEnv(enable_oracle=...) and into subprocess invocations; evaluation printing conditionally includes oracle fields and aggregated oracle benchmark section.
Documentation
src/oracles.md
New spec describing oracle hierarchy, detailed Liquid and Contiguous algorithms, reporting fields, and standardized evaluation metrics (gaps, Agent Capture Rate).

Sequence Diagram(s)

sequenceDiagram
    actor Agent as TrainingLoop
    participant Env as ComputeClusterEnv
    participant Liquid as LiquidOracle
    participant Contig as ContiguousOracle
    participant Metrics as MetricsTracker

    Agent->>Env: reset(enable_oracle=True)
    Env->>Liquid: reset(carried_jobs)
    Env->>Contig: reset(carried_jobs)

    loop each timestep
        Agent->>Env: step(action)
        Env->>Liquid: record(price, new_jobs)
        Env->>Contig: record(price, new_jobs)
        Env-->>Agent: observation, reward
    end

    rect rgba(100,150,200,0.5)
        Note over Env,Metrics: episode completion
        Env->>Liquid: solve()
        Liquid-->>Env: oracle_cost
        Env->>Contig: solve()
        Contig-->>Env: oracle_contiguous_cost, unscheduled, spillover
        Env->>Metrics: record_episode_completion(..., oracle fields)
        Metrics-->>Agent: episode_data (oracle metrics, gaps)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning No pull request description was provided by the author, making it impossible to assess whether it relates to the changeset. Add a pull request description explaining the purpose, implementation approach, and key features of the oracle benchmarking system being introduced.
Docstring Coverage ⚠️ Warning Docstring coverage is 47.62% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main changes: adding two new oracle classes (LiquidOracle and ContiguousOracle) that are central to the changeset across multiple files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/metrics_tracker.py (1)

185-205: ⚠️ Potential issue | 🟠 Major

savings_vs_oracle is backwards for a metric with that name.

Line 185 computes agent_cost - oracle_cost, but the other savings metrics use baseline_cost - agent_cost. Publishing that value as savings_vs_oracle on Line 204 will invert the meaning for any downstream reporting or plotting that treats positive values as “better.” Either flip the subtraction or rename it to something like gap_to_oracle / above_oracle_cost. As per coding guidelines, "Metrics tracker module (src/metrics_tracker.py) must track and report performance metrics".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/metrics_tracker.py` around lines 185 - 205, The variable
savings_vs_oracle is computed backwards (currently self.episode_total_cost -
self.episode_oracle_cost) causing sign inversion versus the other savings
metrics; change the calculation in metrics_tracker.py so savings_vs_oracle =
self.episode_oracle_cost - self.episode_total_cost (so positive means the agent
saved cost versus the oracle), and ensure the value placed into
episode_data['savings_vs_oracle'] uses that corrected symbol (savings_vs_oracle)
— alternatively, if you prefer semantics, rename the symbol to gap_to_oracle and
update its usage consistently (but keep the subtraction direction oracle_cost -
episode_total_cost).
src/environment.py (1)

365-381: ⚠️ Potential issue | 🟠 Major

Record only the workload that actually survives admission.

Line 365 feeds every generated job into the oracles before add_new_jobs() decides whether some arrivals are rejected (backlog_dropped on Lines 379-381). In overflow episodes that makes the oracle benchmark price work the agent/baseline never had a chance to run, which inflates oracle cost and distorts the capture-rate comparison. Please move oracle recording until after admission and pass the accepted workload, or explicitly account for queue-full rejections in the oracle input.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/environment.py` around lines 365 - 381, The oracles (self.oracle.record
and self.contiguous_oracle.record) are being fed all generated arrivals before
add_new_jobs decides which jobs are admitted, causing oracle stats to include
jobs that were rejected (backlog_dropped); move the oracle.record calls to after
add_new_jobs and pass only the accepted workload (new_jobs) or, alternatively,
reduce the recorded counts by backlog_dropped so the oracle sees the actual
admitted durations/nodes/cores; update the code around add_new_jobs and the
subsequent measurements so oracle recording uses new_jobs (or adjusted arrays)
and not the original generated lists.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/oracles.md`:
- Around line 83-84: Update the documentation so the ContiguousOracle status is
consistent: change the heading or description that currently labels "Oracle 2 —
Job-Contiguous Greedy (Proposed)" to match the implementation-status table's
"shipped" state (or alternatively update the status table to "proposed" if that
is the intended state); ensure both the section title for "Oracle 2 —
Job-Contiguous Greedy" and the mentions of ContiguousOracle in the later
paragraph(s) (around the ContiguousOracle references) use the same status word
so readers see a single, consistent state.

In `@train.py`:
- Around line 395-414: The new oracle-summary prints use f-strings with no
placeholders and Unicode minus signs, which breaks Ruff; change the two header
prints (the ones that print "\n  Liquid Oracle..." and "\n  Contiguous
Oracle...") to plain string literals (remove the leading f) and replace all
Unicode minus characters (the '−' in the parenthetical labels like "baseline_off
− oracle_liq", "baseline_off − oracle_jcg", "agent − oracle_jcg", "oracle_jcg −
oracle_liq") with the ASCII hyphen-minus '-' so the messages around
total_oracle_liquid_cost, total_oracle_contiguous_cost and continuity_cost
conform to lint rules.

---

Outside diff comments:
In `@src/environment.py`:
- Around line 365-381: The oracles (self.oracle.record and
self.contiguous_oracle.record) are being fed all generated arrivals before
add_new_jobs decides which jobs are admitted, causing oracle stats to include
jobs that were rejected (backlog_dropped); move the oracle.record calls to after
add_new_jobs and pass only the accepted workload (new_jobs) or, alternatively,
reduce the recorded counts by backlog_dropped so the oracle sees the actual
admitted durations/nodes/cores; update the code around add_new_jobs and the
subsequent measurements so oracle recording uses new_jobs (or adjusted arrays)
and not the original generated lists.

In `@src/metrics_tracker.py`:
- Around line 185-205: The variable savings_vs_oracle is computed backwards
(currently self.episode_total_cost - self.episode_oracle_cost) causing sign
inversion versus the other savings metrics; change the calculation in
metrics_tracker.py so savings_vs_oracle = self.episode_oracle_cost -
self.episode_total_cost (so positive means the agent saved cost versus the
oracle), and ensure the value placed into episode_data['savings_vs_oracle'] uses
that corrected symbol (savings_vs_oracle) — alternatively, if you prefer
semantics, rename the symbol to gap_to_oracle and update its usage consistently
(but keep the subtraction direction oracle_cost - episode_total_cost).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4126e53e-193a-43dc-8f8a-328f0cf649e2

📥 Commits

Reviewing files that changed from the base of the PR and between 9821c32 and b090845.

📒 Files selected for processing (6)
  • src/environment.py
  • src/metrics_tracker.py
  • src/oracle.py
  • src/oracles.md
  • train.py
  • train_iter.py

Comment thread src/oracle.py
Comment thread src/oracles.md Outdated
Comment thread train.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
src/metrics_tracker.py (1)

185-185: ⚡ Quick win

Disambiguate savings_vs_oracle now that two oracle baselines exist.

Line 185 computes savings vs the liquid oracle only, but the payload also includes contiguous-oracle cost. Consider emitting explicit fields for both to avoid accidental misuse.

Suggested patch
-        savings_vs_oracle: float = self.episode_total_cost - self.episode_oracle_cost
+        savings_vs_oracle_liquid: float = self.episode_total_cost - self.episode_oracle_cost
+        savings_vs_oracle_contiguous: float = (
+            self.episode_total_cost - self.episode_oracle_contiguous_cost
+        )
@@
-            'savings_vs_oracle': savings_vs_oracle,
+            'savings_vs_oracle_liquid': savings_vs_oracle_liquid,
+            'savings_vs_oracle_contiguous': savings_vs_oracle_contiguous,

Also applies to: 204-204

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/metrics_tracker.py` at line 185, The current single field
savings_vs_oracle is ambiguous because there are two oracle baselines; change
the metrics emission to compute and emit two explicit fields — e.g.,
savings_vs_liquid_oracle = episode_total_cost - episode_oracle_cost and
savings_vs_contiguous_oracle = episode_total_cost -
episode_contiguous_oracle_cost — instead of overwriting/using savings_vs_oracle;
update all places that currently set or read savings_vs_oracle (search for the
variable name and the emission logic in MetricsTracker/emit_metrics) to use the
two new fields so both baselines are unambiguous.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/oracle.py`:
- Around line 49-60: The record method currently zips new_jobs_durations,
new_jobs_nodes, and new_jobs_cores which silently truncates if their lengths
differ; add an explicit guard at the start of record (and likewise in
ContiguousOracle.record) that checks len(new_jobs_durations) ==
len(new_jobs_nodes) == len(new_jobs_cores) and raise a ValueError (including the
three lengths in the message) if not equal so mismatched input is detected
rather than silently undercounted.

In `@src/oracles.md`:
- Around line 142-148: Add explicit language identifiers to the two fenced code
blocks that currently lack them: change the block containing the cost names
("baseline_cost ... oracle_jcg_cost") to use ```text and likewise change the
block containing the formula line "Agent Capture Rate = (baseline_off − agent) /
(baseline_off − oracle_jcg)" to use ```text so markdownlint MD040 is satisfied;
locate these blocks by searching for the cost list and the "Agent Capture Rate"
formula in src/oracles.md and prepend "text" to their opening ``` fences.
- Around line 143-148: The documented metric keys in src/oracles.md are out of
sync with the actual emitted metric names from metrics_tracker (they use
baseline_cost_off, oracle_cost, oracle_contiguous_cost); update the listed keys
so they exactly match the emitted names: change baseline_off_cost to
baseline_cost_off, oracle_liq_cost to oracle_cost, and oracle_jcg_cost to
oracle_contiguous_cost (leave baseline_cost and agent_cost as-is if they already
match).

In `@train_iter.py`:
- Line 450: The help text for the CLI flag defined by
parser.add_argument("--oracle", ...) is misleading: update the help string in
train_iter.py to state that this flag enables both the liquid and contiguous
oracles (not only liquid). Locate the parser.add_argument call for "--oracle"
and replace the help value with a concise description such as "Enable both
liquid and contiguous oracles in the environment" (or similar wording) so it
accurately reflects the actual behavior.

---

Nitpick comments:
In `@src/metrics_tracker.py`:
- Line 185: The current single field savings_vs_oracle is ambiguous because
there are two oracle baselines; change the metrics emission to compute and emit
two explicit fields — e.g., savings_vs_liquid_oracle = episode_total_cost -
episode_oracle_cost and savings_vs_contiguous_oracle = episode_total_cost -
episode_contiguous_oracle_cost — instead of overwriting/using savings_vs_oracle;
update all places that currently set or read savings_vs_oracle (search for the
variable name and the emission logic in MetricsTracker/emit_metrics) to use the
two new fields so both baselines are unambiguous.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b655e37d-3f14-4007-b9a7-ee02510a7f31

📥 Commits

Reviewing files that changed from the base of the PR and between b090845 and 1883801.

📒 Files selected for processing (6)
  • src/environment.py
  • src/metrics_tracker.py
  • src/oracle.py
  • src/oracles.md
  • train.py
  • train_iter.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • train.py

Comment thread src/oracle.py
Comment thread src/oracles.md
Comment thread src/oracles.md
Comment thread train_iter.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/oracles.md`:
- Around line 154-163: The gap formulas use legacy metric names; update them to
the documented/output keys so they match the emitted metrics: replace
`baseline_off` → `baseline_cost_off`, `oracle_jcg` → `oracle_cost` (or
`oracle_contiguous_cost` where the contiguous oracle is intended), `oracle_liq`
→ `oracle_contiguous_cost` (if job-continuity cost) and `agent` → `agent_cost`;
also update the Agent Capture Rate formula to Agent Capture Rate =
(baseline_cost_off − agent_cost) / (baseline_cost_off − oracle_cost) (or use
oracle_contiguous_cost in the denominator when measuring against the contiguous
oracle) and ensure the table rows and any accompanying text use the same keys
consistently (`baseline_cost_off`, `oracle_cost`, `oracle_contiguous_cost`,
`agent_cost`).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 831c8940-5980-4845-a6c9-0e79dabc3644

📥 Commits

Reviewing files that changed from the base of the PR and between 1883801 and 84e0e27.

📒 Files selected for processing (6)
  • src/environment.py
  • src/metrics_tracker.py
  • src/oracle.py
  • src/oracles.md
  • train.py
  • train_iter.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • train.py
  • src/metrics_tracker.py
  • train_iter.py

Comment thread src/oracles.md
Comment on lines +154 to +163
| `baseline_off − agent` | Savings the agent actually achieved |
| `baseline_off − oracle_jcg` | Maximum workload-shifting savings available in principle |
| `agent − oracle_jcg` | How far the agent is from the realistic optimum |
| `oracle_jcg − oracle_liq` | Cost of honoring job-continuity constraints |

The key metric for evaluating agent quality becomes:

```
Agent Capture Rate = (baseline_off − agent) / (baseline_off − oracle_jcg)
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the same metric keys in the gap formulas as in the emitted metrics list.

The formulas still use legacy aliases (baseline_off, oracle_jcg, oracle_liq, agent) while the documented/output keys are baseline_cost_off, oracle_contiguous_cost, oracle_cost, agent_cost. Keep one naming scheme to avoid confusion.

Proposed doc fix
-| `baseline_off − agent` | Savings the agent actually achieved |
-| `baseline_off − oracle_jcg` | Maximum workload-shifting savings available in principle |
-| `agent − oracle_jcg` | How far the agent is from the realistic optimum |
-| `oracle_jcg − oracle_liq` | Cost of honoring job-continuity constraints |
+| `baseline_cost_off − agent_cost` | Savings the agent actually achieved |
+| `baseline_cost_off − oracle_contiguous_cost` | Maximum workload-shifting savings available in principle |
+| `agent_cost − oracle_contiguous_cost` | How far the agent is from the realistic optimum |
+| `oracle_contiguous_cost − oracle_cost` | Cost of honoring job-continuity constraints |
@@
-Agent Capture Rate = (baseline_off − agent) / (baseline_off − oracle_jcg)
+Agent Capture Rate = (baseline_cost_off − agent_cost) / (baseline_cost_off − oracle_contiguous_cost)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| `baseline_off − agent` | Savings the agent actually achieved |
| `baseline_off − oracle_jcg` | Maximum workload-shifting savings available in principle |
| `agent − oracle_jcg` | How far the agent is from the realistic optimum |
| `oracle_jcg − oracle_liq` | Cost of honoring job-continuity constraints |
The key metric for evaluating agent quality becomes:
```
Agent Capture Rate = (baseline_off − agent) / (baseline_off − oracle_jcg)
```
| `baseline_cost_off − agent_cost` | Savings the agent actually achieved |
| `baseline_cost_off − oracle_contiguous_cost` | Maximum workload-shifting savings available in principle |
| `agent_cost − oracle_contiguous_cost` | How far the agent is from the realistic optimum |
| `oracle_contiguous_cost − oracle_cost` | Cost of honoring job-continuity constraints |
The key metric for evaluating agent quality becomes:
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 161-161: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/oracles.md` around lines 154 - 163, The gap formulas use legacy metric
names; update them to the documented/output keys so they match the emitted
metrics: replace `baseline_off` → `baseline_cost_off`, `oracle_jcg` →
`oracle_cost` (or `oracle_contiguous_cost` where the contiguous oracle is
intended), `oracle_liq` → `oracle_contiguous_cost` (if job-continuity cost) and
`agent` → `agent_cost`; also update the Agent Capture Rate formula to Agent
Capture Rate = (baseline_cost_off − agent_cost) / (baseline_cost_off −
oracle_cost) (or use oracle_contiguous_cost in the denominator when measuring
against the contiguous oracle) and ensure the table rows and any accompanying
text use the same keys consistently (`baseline_cost_off`, `oracle_cost`,
`oracle_contiguous_cost`, `agent_cost`).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant