[CI-Nightly]: Validating the nightly Result with Previous Result #992

Open
abukhoy wants to merge 4 commits into
quic:mainfrom
abukhoy:nightly-ci-validate

@abukhoy abukhoy commented May 18, 2026

This PR validates the nightly run's performance numbers and generated tokens by comparing them against the results of the previous nightly run.

abukhoy added 4 commits May 18, 2026 04:05
Signed-off-by: Abukhoyer Shaik <abukhoye@qti.qualcomm.com>
Signed-off-by: Abukhoyer Shaik <abukhoye@qti.qualcomm.com>
Signed-off-by: Abukhoyer Shaik <abukhoye@qti.qualcomm.com>
Signed-off-by: Abukhoyer Shaik <abukhoye@qti.qualcomm.com>
    ],
    "validation_configs": {
        "default": {
            "percentage_tolerance": 50.0,
Can you please explain how these thresholds are decided?
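
For context on what a value like `50.0` would allow, here is a minimal sketch of a percentage-tolerance check. It assumes `percentage_tolerance` is the maximum allowed relative deviation, in percent, of the current value from the previous one; the PR does not confirm this interpretation, and `within_percentage_tolerance` is a hypothetical name used only for illustration.

```python
def within_percentage_tolerance(previous: float, current: float, percentage_tolerance: float) -> bool:
    """Return True if `current` deviates from `previous` by at most the given percentage.

    Assumption: the tolerance is symmetric and relative to the previous value.
    """
    if previous == 0:
        # A relative deviation is undefined against zero, so only an exact match passes.
        return current == 0
    deviation_percent = abs(current - previous) / abs(previous) * 100.0
    return deviation_percent <= percentage_tolerance
```

Under this reading, a tolerance of 50.0 would accept a metric that dropped from 100 to 50, which is why the basis for the threshold is worth documenting.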

"meta-llama/Llama-3.2-11B-Vision-Instruct",
"meta-llama/Llama-3.2-90B-Vision-Instruct",
"allenai/Molmo-7B-D-0924",
"Qwen/Qwen3-VL-30B-A3B-Instruct",
Why are we skipping the Qwen3-VL models?

def _numeric_mad(previous_value: Any, current_value: Any) -> float | str:
    previous_flat = _flatten_numeric_values(previous_value)
    current_flat = _flatten_numeric_values(current_value)
    common_length = min(len(previous_flat), len(current_flat))
I think we should also check whether the lengths differ. Consider a case where the current length is 200 but the previous length is 500: because the MAD is computed only over the first `common_length` values, we won't catch the mismatch if the first 200 tokens are the same in both runs.
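
A minimal sketch of what such a check could look like, assuming the caller is free to treat a non-zero length difference as a failure. The `_flatten_numeric_values` below is a hypothetical stand-in (the real helper is not shown in this thread), and `numeric_mad_with_length_check` is an illustrative name, not a function in the PR.

```python
from __future__ import annotations

from typing import Any


def _flatten_numeric_values(value: Any) -> list[float]:
    """Hypothetical stand-in: flatten nested lists/tuples of numbers into one list."""
    if isinstance(value, bool):
        return []  # bool is a subclass of int; do not treat it as numeric here
    if isinstance(value, (int, float)):
        return [float(value)]
    if isinstance(value, (list, tuple)):
        flat: list[float] = []
        for item in value:
            flat.extend(_flatten_numeric_values(item))
        return flat
    return []


def numeric_mad_with_length_check(previous_value: Any, current_value: Any) -> tuple[float | None, int]:
    """Return (MAD over the common prefix, current length minus previous length).

    The MAD is None when there is nothing to compare. A non-zero length delta
    signals a mismatch even when the MAD over the shared prefix is 0.
    """
    previous_flat = _flatten_numeric_values(previous_value)
    current_flat = _flatten_numeric_values(current_value)
    length_delta = len(current_flat) - len(previous_flat)
    common_length = min(len(previous_flat), len(current_flat))
    if common_length == 0:
        return None, length_delta
    # zip stops at the shorter sequence, i.e. after common_length pairs
    total_difference = sum(abs(cur - prev) for prev, cur in zip(previous_flat, current_flat))
    return total_difference / common_length, length_delta
```

With this shape, a run that produces 200 values against a previous 500 reports a length delta of -300 even if the MAD over the shared prefix is 0.0, so the validation can flag it separately.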

    total_difference = sum(abs(current_flat[index] - previous_flat[index]) for index in range(common_length))
    return total_difference / common_length


Please add docstrings to all the functions; you can use an agent to do that.

2 participants