Feature Type
It's important to test skills to ensure that they are validated and triggered properly in different scenarios
Description
I'm considering an LLM native testing framework like Gherkin or to implement something that is suitable for this use case.
Use Case
New skill added -> different depth and checklists are validated:
- skill practices (less than 500 lines, refrences, python only scripts etc)
- prompt or skill injection guardrails
- CI triggered
Optionally Periodic validations
Problem it solves:
ensures new added skills are high quality and consistent throughout the agent kit.
Willingness to Contribute
Feature Type
It's important to test skills to ensure that they are validated and triggered properly in different scenarios
Description
I'm considering an LLM native testing framework like Gherkin or to implement something that is suitable for this use case.
Use Case
New skill added -> different depth and checklists are validated:
Optionally Periodic validations
Problem it solves:
ensures new added skills are high quality and consistent throughout the agent kit.
Willingness to Contribute