Skip to content

feat: add pretty run report#416

Open
kaeun97 wants to merge 11 commits into
egraphs-good:mainfrom
kaeun97:kaeun97/pretty-report
Open

feat: add pretty run report#416
kaeun97 wants to merge 11 commits into
egraphs-good:mainfrom
kaeun97:kaeun97/pretty-report

Conversation

@kaeun97
Copy link
Copy Markdown

@kaeun97 kaeun97 commented May 5, 2026

Resolves #398.

Here is an example code:

from __future__ import annotations
from egglog import *

egraph = EGraph()

class Num(Expr):
    def __init__(self, n: i64Like) -> None: ...
    def __add__(self, other: Num) -> Num: ...
    def __mul__(self, other: Num) -> Num: ...

x, y = vars_("x y", Num)
egraph.register(rewrite(x + y).to(y + x))
egraph.register(Num(1) + Num(2))
report = egraph.run(10)
print(report)

Output before:

RunReport { iterations: [IterationReport { rule_set_report: RuleSetReport { changed: true, rule_reports: {"(rewrite (__main___Num___add__ _x _y) (__main___Num___add__ _y _x))": [RuleReport { plan: None, search_and_apply_time: 2.625µs, num_matches: 1 }]}, search_and_apply_time: 5.375µs, merge_time: 583ns }, rebuild_time: 1.125µs }, IterationReport { rule_set_report: RuleSetReport { changed: false, rule_reports: {"(rewrite (__main___Num___add__ _x _y) (__main___Num___add__ _y _x))": [RuleReport { plan: None, search_and_apply_time: 1.125µs, num_matches: 1 }]}, search_and_apply_time: 2.75µs, merge_time: 1.041µs }, rebuild_time: 0ns }], updated: true, search_and_apply_time_per_rule: {"(rewrite (__main___Num___add__ _x _y) (__main___Num___add__ _y _x))": 3.75µs}, num_matches_per_rule: {"(rewrite (__main___Num___add__ _x _y) (__main___Num___add__ _y _x))": 2}, search_and_apply_time_per_ruleset: {"": 8.125µs}, merge_time_per_ruleset: {"": 1.624µs}, rebuild_time_per_ruleset: {"": 1.125µs} }

Output after:

PrettyRunReport(iterations=[PrettyIterationReport(rule_set_report=PrettyRuleSetReport(changed=True, rule_reports={'rewrite(x + y).to(y + x)': [PrettyRuleReport(plan=None, search_and_apply_time=datetime.timedelta(0), num_matches=1)]}, search_and_apply_time=datetime.timedelta(0), merge_time=datetime.timedelta(0)), rebuild_time=datetime.timedelta(0)), PrettyIterationReport(rule_set_report=PrettyRuleSetReport(changed=False, rule_reports={'rewrite(x + y).to(y + x)': [PrettyRuleReport(plan=None, search_and_apply_time=datetime.timedelta(0), num_matches=1)]}, search_and_apply_time=datetime.timedelta(0), merge_time=datetime.timedelta(0)), rebuild_time=datetime.timedelta(0))], updated=True, search_and_apply_time_per_rule={'rewrite(x + y).to(y + x)': datetime.timedelta(0)}, num_matches_per_rule={'rewrite(x + y).to(y + x)': 2}, search_and_apply_time_per_ruleset={'': datetime.timedelta(0)}, merge_time_per_ruleset={'': datetime.timedelta(0)}, rebuild_time_per_ruleset={'': datetime.timedelta(0)})

@kaeun97 kaeun97 marked this pull request as ready for review May 5, 2026 23:09
@kaeun97 kaeun97 mentioned this pull request May 5, 2026
Copy link
Copy Markdown
Member

@saulshanabrook saulshanabrook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this! Added a few comments. Could you also add this to the changelog file with a link to this PR?

Comment thread python/egglog/egraph.py Outdated
Comment thread python/egglog/egraph_state.py
Comment thread python/egglog/run_report.py Outdated
Comment thread python/egglog/run_report.py Outdated
Comment thread python/egglog/run_report.py Outdated
@kaeun97 kaeun97 requested a review from saulshanabrook May 7, 2026 01:11
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 7, 2026

Merging this PR will improve performance by 67.74%

⚡ 1 improved benchmark
✅ 5 untouched benchmarks
⏩ 8 skipped benchmarks1

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation test_jit[lda] 11.6 s 6.9 s +67.74%

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.


Comparing kaeun97:kaeun97/pretty-report (01802ec) with main (8812ec9)

Open in CodSpeed

Footnotes

  1. 8 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copy link
Copy Markdown
Member

@saulshanabrook saulshanabrook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes, I left a few small comments. There are also some mypy and formatting issues I think.

There is a bigger question about performance, if the codspeed is correct it looks like this slows things down by a ton!

Image

Taking almost 40% of the time in a bigger benchmark just to translate bindings.

It makes me wonder about a different approach, where we set each rewrite and rule with a manual name like 1, 2, 3, ... and then we don't have to do the name searching and mangling and can just parse the name as an int then look it up? And if it's a birewrite just take off the <= or >=?

It would make the egglog file a bit more verbose, but makes parsing the reports more straightforward and more performant which seems like a good tradeoff?

I was also going back and forth on whether the RunReport should store a RewriteOrRule or the decl? If we just store the RewriteOrRule it's easier to pretty print, can just use the builtin one, and it's easier for users to grab that off and compare it or use it... But most of the other exposed objects just store the decls, so I will leave it up to you!

EDIT: It looks like the docs failures also highlight some other exceptions from this. I imagine also if we name the rules here that might also help since it seems like it's hitting on looking up the string?

Comment thread python/egglog/egraph_state.py Outdated
Comment thread python/egglog/run_report.py Outdated
Comment thread python/egglog/egraph_state.py Outdated
Comment thread python/egglog/run_report.py
Comment thread python/tests/test_run_report.py
Comment thread python/egglog/run_report.py Outdated
@kaeun97
Copy link
Copy Markdown
Author

kaeun97 commented May 7, 2026

@saulshanabrook Thanks for the thorough review! I do agree that the performance looks concerning. The numeric name approach you mentioned would work for bindings with a "name" field - so not for, RewriteDecl nor BiRewriteDecl. That would require the rust side change. Happy to prioritize that before continuing on with this PR. Also, we can do lazy loading (translate when the user needs it) to have minimal impact to performance.

@saulshanabrook
Copy link
Copy Markdown
Member

The numeric name approach you mentioned would work for bindings with a "name" field - so not for, RewriteDecl nor BiRewriteDecl. That would require the rust side change. Happy to prioritize that before continuing on with this PR.

Ah yeah I kept forgetting about this! I just talked to some other folks on the egglog team and they said that sounds like a great feature to add, just something we hadn't gotten around to yet. It should also I think be relatively straightforward so a good first PR to egglog core if you don't mind doing that...

Then once that is merged hopefully should just be able to update the pin here and can use that feature. I believe the version of egglog we depend on here is pretty recent, so hopefully won't be other changes we have to adapt to.

@kaeun97 kaeun97 requested a review from saulshanabrook May 20, 2026 01:23
@kaeun97
Copy link
Copy Markdown
Author

kaeun97 commented May 20, 2026

Hey @saulshanabrook , thank you again for your feedback. The benchmark seems much better now. Let me know how it looks!

Copy link
Copy Markdown
Member

@saulshanabrook saulshanabrook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for your continued updates on this!

I have some additional cleanup feedback, to try and keep the data structures a bit more minimal and specific, raise any errors earlier, and make sure bi-rewrite preserves both times.

Comment on lines +306 to +307
name = str(self.rule_name_counter)
self.rule_name_counter += 1
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we now support name for rewrite/birewrite, could we expose this to the user level as well? And then this logic here would be similar to the RuleDecl handling, where it checks for an explicit name and if it doesn't have one generates one. This would entail adding the name to pretty.py, declarations.py and egraph.py I believe.

This isn't strictly necessary for this PR though so if you don't feel like doing this here that's fine.

type_ref_to_egg_sort: dict[JustTypeRef, str] = field(default_factory=dict)
egg_sort_to_type_ref: dict[str, JustTypeRef] = field(default_factory=dict)

egg_rule_to_command_decl: dict[str, CommandDecl] = field(default_factory=dict)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead just use rule_name_to_command_decl, so we can remove this additional mapping and there is just one source? We will know which ones are named, because we can see if the CommandDecl has a name or not. We can also update it to be more specific and just go from str to RuleDecl | BiRewriteDecl | RewriteDecl I believe.

case _:
assert_never(schedule)

def translate_rule_key(self, egglog_key: str) -> CommandDecl | str:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we remove this, and instead store in the rule_name_to_command_decl version for <= and => when adding a bi-rewrite? Then that structure should always include all egglog rules we output, so we can do a lookup and if it's missing the exception just percolates up, avoiding a silent failure?

Comment on lines +88 to +89
search_and_apply_time_per_rule: dict[CommandDecl | str, timedelta] = field(default_factory=dict)
num_matches_per_rule: dict[CommandDecl | str, int] = field(default_factory=dict)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we just store CommandDecl's here regardless of if it has a name or not, then just change the repr/str to display it the name as a string if has one, otherwise pretty print the full command?

search_and_apply_time_per_rule={
state.translate_rule_key(k): v for k, v in report.search_and_apply_time_per_rule.items()
},
num_matches_per_rule={state.translate_rule_key(k): v for k, v in report.num_matches_per_rule.items()},
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we build these dictionaries from the bindings, could we check for duplicate keys (either named or unnamed) and combine the values for them? So that for BiRewrite, we don't lose the first one?

Comment thread python/egglog/egraph.py
from .run_report import RunReport
from .runtime import *
from .thunk import *

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add RunReport to __all__

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pretty Run Report

2 participants