fix(interface): reject duplicate names within output_processors (#675) #697
…IA-NeMo#675) Signed-off-by: SAY-5 <say.apm35@gmail.com>
All contributors have signed the DCO ✍️ ✅
I have read the DCO document and I hereby sign the DCO.
Greptile Summary
This PR fixes a silent bug where two
| Filename | Overview |
|---|---|
| packages/data-designer/src/data_designer/interface/composite_workflow.py | Adds intra-list duplicate name detection to _validate_distinct_output_processors; logic is correct and the existing cross-list check is preserved. |
| packages/data-designer/tests/interface/test_composite_workflow.py | Adds a well-scoped regression test that verifies the new error message fires when two output processors share a name. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[add_stage called with output_processors] --> B[_validate_distinct_output_processors]
B --> C{Any duplicate names\nwithin output_processors?}
C -- Yes --> D[Raise DataDesignerWorkflowError\n'distinct within output_processors']
C -- No --> E{Any names clash with\nexisting stage processors?}
E -- Yes --> F[Raise DataDesignerWorkflowError\n'distinct from stage processor names']
E -- No --> G[Validation passes]
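The two checks in the flowchart can be sketched roughly as follows. This is a minimal, standalone illustration rather than the code in composite_workflow.py: the exception class is a local stand-in, the function takes plain name strings instead of processor objects, and the wording of the second error message is assumed from the flowchart label.

```python
class DataDesignerWorkflowError(Exception):
    """Local stand-in for the project's exception type, so the sketch runs alone."""


def validate_distinct_output_processors(output_names, stage_names):
    """Hypothetical helper mirroring the flowchart; operates on plain name strings."""
    # Check 1: names must be unique within output_processors itself.
    seen = set()
    duplicates = set()
    for name in output_names:
        if name in seen:
            duplicates.add(name)
        seen.add(name)
    if duplicates:
        raise DataDesignerWorkflowError(
            "Output processor names must be distinct within output_processors: "
            + ", ".join(sorted(duplicates)) + "."
        )

    # Check 2: names must not clash with the stage's existing processors.
    # (Message wording here is an assumption based on the flowchart label.)
    clashes = sorted(seen & set(stage_names))
    if clashes:
        raise DataDesignerWorkflowError(
            "Output processor names must be distinct from stage processor names: "
            + ", ".join(clashes) + "."
        )
```

Running the intra-list check first means a list such as `["a", "a"]` is rejected even when the stage has no processors of its own, which is the case the original cross-list check could not catch.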
Reviews (1): Last reviewed commit: "fix(interface): reject duplicate names w..."
Thanks for the quick turnaround on this, @SAY-5. Nice, tight, well-scoped fix.
Summary
Extends
Findings
Suggestions (worth considering)
from collections import Counter

counts = Counter(processor.name for processor in output_processors)
duplicates = sorted(name for name, count in counts.items() if count > 1)
if duplicates:
    raise DataDesignerWorkflowError(
        f"Output processor names must be distinct within output_processors: {', '.join(duplicates)}."
    )
Totally happy to leave the explicit loop if you prefer it; readability is on par.
with pytest.raises(DataDesignerWorkflowError, match=r"distinct within output_processors: drop_scratch"):
What Looks Good
Verdict
Ship it (with nits). Both suggestions above are optional. This review was generated by an AI assistant.
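For anyone who wants to try the two optional suggestions from the review in isolation, here is a small self-contained sketch. The exception class, the `check_intra_list_duplicates` helper, and the `SimpleNamespace` processors are stand-ins invented for illustration, not the project's real types. Since `pytest.raises(match=...)` applies `re.search` to the exception message, the tightened pattern is checked here with `re.search` directly so that the snippet needs only the standard library.

```python
import re
from collections import Counter
from types import SimpleNamespace


class DataDesignerWorkflowError(Exception):
    """Local stand-in for the project's exception type."""


def check_intra_list_duplicates(output_processors):
    """Hypothetical wrapper around the Counter-based variant from the review."""
    counts = Counter(processor.name for processor in output_processors)
    duplicates = sorted(name for name, count in counts.items() if count > 1)
    if duplicates:
        raise DataDesignerWorkflowError(
            f"Output processor names must be distinct within output_processors: {', '.join(duplicates)}."
        )


def error_message_for(names):
    """Return the error message for a list of names, or None if validation passes."""
    processors = [SimpleNamespace(name=n) for n in names]
    try:
        check_intra_list_duplicates(processors)
    except DataDesignerWorkflowError as exc:
        return str(exc)
    return None


message = error_message_for(["drop_scratch", "keep", "drop_scratch"])
# The tightened pattern pins the offending name, not just the generic phrase.
assert re.search(r"distinct within output_processors: drop_scratch", message)
assert error_message_for(["drop_scratch", "keep"]) is None
```

Matching on the duplicate name as well as the phrase guards against a regression where the error fires but reports the wrong processor.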
Issue #675 has been triaged. The linked issue check is being re-evaluated.
johnnygreco left a comment
🚢 Looks good – Thank you!!
📋 Summary
_validate_distinct_output_processors only compared output-processor names against the stage's existing processors, so two output processors with the same name silently overwrote one another's artifact directory. This patch raises DataDesignerWorkflowError when names repeat within output_processors.
🔗 Related Issue
Fixes #675
🔄 Changes
- Adds intra-list duplicate name detection to _validate_distinct_output_processors before the cross-list check.
- Adds a regression test in test_composite_workflow.py.
🧪 Testing
composite_workflow (44 tests) passes.
✅ Checklist