feat: add pretty run report #416
saulshanabrook left a comment
Thank you for this! Added a few comments. Could you also add this to the changelog file with a link to this PR?
Merging this PR will improve performance by 67.74%
Thanks for the fixes, I left a few small comments. There are also some mypy and formatting issues I think.
There is a bigger question about performance: if the CodSpeed report is correct, it looks like this slows things down by a ton!
It takes almost 40% of the time in a bigger benchmark just to translate bindings.
It makes me wonder about a different approach, where we give each rewrite and rule a manual name like 1, 2, 3, ... so that we don't have to do the name searching and mangling, and can just parse the name as an int and then look it up? And if it's a birewrite, just take off the <= or >=?
It would make the egglog file a bit more verbose, but makes parsing the reports more straightforward and more performant which seems like a good tradeoff?
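A minimal sketch of the numeric-name idea, for illustration only. `RuleStandIn` and `NameRegistry` are hypothetical names, not part of the codebase, and the direction markers stripped for bi-rewrites are an assumption about how the two directions are reported.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RuleStandIn:
    """Hypothetical stand-in for a rewrite or rule declaration."""

    description: str


@dataclass
class NameRegistry:
    """Hands out counter-based names and resolves the names reported back."""

    rules: list[RuleStandIn] = field(default_factory=list)

    def register(self, rule: RuleStandIn) -> str:
        # The name is the rule's position in the list, rendered as a string.
        self.rules.append(rule)
        return str(len(self.rules) - 1)

    def lookup(self, reported_name: str) -> RuleStandIn:
        # Assumption: each direction of a bi-rewrite is reported under the
        # chosen name with a direction marker appended.
        for marker in ("=>", "<="):
            reported_name = reported_name.removesuffix(marker)
        # No searching or demangling: parse the int and index directly.
        return self.rules[int(reported_name)]
```

With this shape, an unknown or malformed name fails loudly (`ValueError` or `IndexError`) rather than being passed through as a string.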
I was also going back and forth on whether the RunReport should store a RewriteOrRule or the decl? If we just store the RewriteOrRule it's easier to pretty print, can just use the builtin one, and it's easier for users to grab that off and compare it or use it... But most of the other exposed objects just store the decls, so I will leave it up to you!
EDIT: It looks like the docs failures also highlight some other exceptions from this. I imagine that naming the rules here might also help, since it seems like it's failing on looking up the string?
@saulshanabrook Thanks for the thorough review! I do agree that the performance looks concerning. The numeric name approach you mentioned would work for bindings with a "name" field - so not for,
Ah yeah, I kept forgetting about this! I just talked to some other folks on the egglog team and they said that sounds like a great feature to add, just something we hadn't gotten around to yet. I think it should also be relatively straightforward, so a good first PR to egglog core if you don't mind doing that... Then once that is merged, we should hopefully just be able to update the pin here and use that feature. I believe the version of egglog we depend on here is pretty recent, so hopefully there won't be other changes we have to adapt to.
Hey @saulshanabrook, thank you again for your feedback. The benchmark seems much better now. Let me know how it looks!
saulshanabrook left a comment
Thanks again for your continued updates on this!
I have some additional cleanup feedback, to try and keep the data structures a bit more minimal and specific, raise any errors earlier, and make sure bi-rewrite preserves both times.
    name = str(self.rule_name_counter)
    self.rule_name_counter += 1
Since we now support name for rewrite/birewrite, could we expose this to the user level as well? And then this logic here would be similar to the RuleDecl handling, where it checks for an explicit name and if it doesn't have one generates one. This would entail adding the name to pretty.py, declarations.py and egraph.py I believe.
This isn't strictly necessary for this PR though so if you don't feel like doing this here that's fine.
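A rough sketch of the "explicit name, else generated name" logic described above. `RewriteStandIn` and `NameAllocator` are hypothetical illustrations, not the actual declaration classes.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass(frozen=True)
class RewriteStandIn:
    """Hypothetical stand-in for a rewrite declaration with an optional name."""

    lhs: str
    rhs: str
    name: Optional[str] = None


class NameAllocator:
    """Returns the user's name when given, otherwise the next counter value."""

    def __init__(self) -> None:
        self._counter = count()

    def name_for(self, decl: RewriteStandIn) -> str:
        # An explicit user-level name wins over a generated one.
        if decl.name is not None:
            return decl.name
        return str(next(self._counter))
```

A real implementation would presumably also need to guard against a user-chosen name colliding with a generated one; this sketch does not.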
    type_ref_to_egg_sort: dict[JustTypeRef, str] = field(default_factory=dict)
    egg_sort_to_type_ref: dict[str, JustTypeRef] = field(default_factory=dict)

    egg_rule_to_command_decl: dict[str, CommandDecl] = field(default_factory=dict)
Can we instead just use rule_name_to_command_decl, so we can remove this additional mapping and there is just one source? We will know which ones are named, because we can see if the CommandDecl has a name or not. We can also update it to be more specific and just go from str to RuleDecl | BiRewriteDecl | RewriteDecl I believe.
            case _:
                assert_never(schedule)

    def translate_rule_key(self, egglog_key: str) -> CommandDecl | str:
What if we remove this, and instead store entries for both the <= and => versions in rule_name_to_command_decl when adding a bi-rewrite? Then that structure should always include all the egglog rules we output, so we can do a plain lookup, and if a name is missing the exception just percolates up, avoiding a silent failure?
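One way this could look, as a sketch only. `BiRewriteStandIn`, `register_birewrite`, and `translate` are hypothetical names, and the `=>` / `<=` suffixes are an assumption about how the two directions are named in the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiRewriteStandIn:
    """Hypothetical stand-in for a bi-rewrite declaration."""

    lhs: str
    rhs: str


def register_birewrite(
    mapping: dict[str, BiRewriteStandIn], name: str, decl: BiRewriteStandIn
) -> None:
    # Store one entry per direction so every reported rule name is a key.
    mapping[name + "=>"] = decl
    mapping[name + "<="] = decl


def translate(mapping: dict[str, BiRewriteStandIn], reported_name: str) -> BiRewriteStandIn:
    # Plain indexing: an unknown name raises KeyError instead of being
    # silently passed through as a string.
    return mapping[reported_name]
```

Because the lookup never falls back to returning the raw string, a missing entry surfaces as an exception at the point of translation.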
    search_and_apply_time_per_rule: dict[CommandDecl | str, timedelta] = field(default_factory=dict)
    num_matches_per_rule: dict[CommandDecl | str, int] = field(default_factory=dict)
What if we just store CommandDecls here regardless of whether they have a name, then change the repr/str to display the name as a string if there is one, and otherwise pretty-print the full command?
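A small sketch of that display rule. `RuleStandIn` and its `body` field are hypothetical; in the real code the fallback would be the existing pretty printer rather than a stored string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RuleStandIn:
    """Hypothetical stand-in for a command declaration."""

    body: str  # stands in for the pretty-printed form of the full command
    name: Optional[str] = None

    def __str__(self) -> str:
        # Show the name when there is one, otherwise the full command.
        return self.name if self.name is not None else self.body
```

The report keys then keep the full declaration for comparison, while printing stays compact for named rules.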
    search_and_apply_time_per_rule={
        state.translate_rule_key(k): v for k, v in report.search_and_apply_time_per_rule.items()
    },
    num_matches_per_rule={state.translate_rule_key(k): v for k, v in report.num_matches_per_rule.items()},
When we build these dictionaries from the bindings, could we check for duplicate keys (either named or unnamed) and combine the values for them? So that for BiRewrite, we don't lose the first one?
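A sketch of combining values for duplicate keys, under the assumption that both directions of a bi-rewrite translate to the same key. `merge_times` and the string keys in the usage example are hypothetical placeholders for the translated declarations.

```python
from collections import defaultdict
from datetime import timedelta


def merge_times(
    raw: dict[str, timedelta], name_to_key: dict[str, str]
) -> dict[str, timedelta]:
    """Sum the times of all reported names that translate to the same key."""
    merged: defaultdict[str, timedelta] = defaultdict(timedelta)
    for reported_name, elapsed in raw.items():
        # A dict comprehension would keep only the last value per key;
        # accumulating keeps both directions of a bi-rewrite.
        merged[name_to_key[reported_name]] += elapsed
    return dict(merged)
```

For example, two entries `"0=>"` and `"0<="` that both map to the same bi-rewrite would contribute their summed time to a single entry. The same accumulation works for match counts with `defaultdict(int)`.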
    from .run_report import RunReport
    from .runtime import *
    from .thunk import *
Can you add RunReport to __all__?
Resolves #398.
Here is some example code:
Output before:
Output after: