
Explainable AI Test Prioritization Engine


An AI-driven, explainable test-prioritization framework that improves CI/CD pipeline efficiency by intelligently ranking high-risk test cases first and providing transparent explanations for each decision.

Why this project matters

Large regression suites are expensive to run for every build. Teams often execute too many tests, wait too long for feedback, and still struggle to focus on the most failure-prone areas. This project helps software teams rank tests using machine learning and explain why those tests should be run first.

Key features

  • AI-based risk prediction with a production-friendly random forest model
  • Explainable prioritization using SHAP
  • Domain-agnostic input format for real projects
  • FastAPI service for integration with internal tooling
  • GitHub Actions CI workflow
  • Docker support
  • Sample data, tests, and outputs included

Project structure

explainable-test-prioritizer/
├── .github/workflows/ci.yml
├── app.py
├── data/
├── docs/github_publish_steps.md
├── outputs/
├── src/explainable_test_prioritizer/
├── tests/
├── Dockerfile
├── prioritize.py
├── pyproject.toml
├── README.md
├── requirements.txt
└── train.py

Input schema

Your CSV should include these columns:

  • test_id
  • historical_defect_density
  • code_complexity
  • change_frequency
  • coverage_gap
  • execution_cost
  • module_criticality
  • recent_failure_count
  • dependency_volatility

Training data must also include:

  • is_high_priority
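Before training, it is worth validating that an export matches this schema. A minimal stdlib sketch (a hypothetical helper, not part of the package) that reports which required columns are missing from a CSV header:

```python
import csv
import io

# Columns every input CSV must provide (see the schema above).
REQUIRED_COLUMNS = {
    "test_id",
    "historical_defect_density",
    "code_complexity",
    "change_frequency",
    "coverage_gap",
    "execution_cost",
    "module_criticality",
    "recent_failure_count",
    "dependency_volatility",
}

def missing_columns(csv_text, training=False):
    """Return the set of required columns absent from the CSV header."""
    required = set(REQUIRED_COLUMNS)
    if training:
        required.add("is_high_priority")  # label column, training data only
    reader = csv.DictReader(io.StringIO(csv_text))
    return required - set(reader.fieldnames or [])

# A header with only three of the required columns:
sample = "test_id,historical_defect_density,code_complexity\nt1,0.4,12\n"
print(sorted(missing_columns(sample)))
```

Running this check in a pre-train script turns a cryptic training failure into an actionable "missing column" message.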

Quick start

git clone https://github.com/yourusername/explainable-test-prioritizer.git
cd explainable-test-prioritizer
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .
python train.py --data data/train.csv
python prioritize.py --input data/new_build.csv

This generates:

  • outputs/prioritized_tests.csv
  • outputs/explanations.json

Run as an API

uvicorn app:app --reload

Health check:

curl http://127.0.0.1:8000/health

POST example:

curl -X POST http://127.0.0.1:8000/prioritize \
  -H "Content-Type: application/json" \
  -d @sample_request.json
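The exact request body is defined by the Pydantic model in app.py; as a hedged sketch, one plausible sample_request.json mirrors the CSV feature columns (the "tests" wrapper key and all values below are assumptions for illustration):

```python
import json

# Hypothetical request body -- field names mirror the CSV schema;
# the "tests" wrapper key is an assumption, adjust to the actual model.
payload = {
    "tests": [
        {
            "test_id": "t_login_001",
            "historical_defect_density": 0.42,
            "code_complexity": 17,
            "change_frequency": 5,
            "coverage_gap": 0.18,
            "execution_cost": 2.5,
            "module_criticality": 0.9,
            "recent_failure_count": 3,
            "dependency_volatility": 0.6,
        }
    ]
}

# Write the file that the curl example above posts with -d @sample_request.json
with open("sample_request.json", "w") as fh:
    json.dump(payload, fh, indent=2)
```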

Run tests

pytest -q

Docker

docker build -t explainable-test-prioritizer .
docker run -p 8000:8000 explainable-test-prioritizer

How to use in a real project

  1. Export your historical test execution data into the required CSV schema.
  2. Train the model on your own project data:
    python train.py --data path/to/your_train.csv
  3. For each new build, generate a candidate test list and features:
    python prioritize.py --input path/to/new_build.csv
  4. Use recommended_bucket and priority_rank to select which tests run first in your CI pipeline.
  5. Review outputs/explanations.json to understand why the model ranked specific tests highly.
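As a sketch of step 4, a CI gate could read outputs/prioritized_tests.csv and run only the top-ranked tests. The priority_rank and recommended_bucket column names come from the steps above; the exact CSV layout and the inline sample are assumptions:

```python
import csv
import io

def select_tests(csv_text, top_n=2):
    """Return test_ids of the top-N tests by ascending priority_rank."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: int(r["priority_rank"]))  # rank 1 = run first
    return [r["test_id"] for r in rows[:top_n]]

# Inline stand-in for outputs/prioritized_tests.csv (hypothetical rows)
report = """test_id,priority_rank,recommended_bucket
t_checkout,2,run_first
t_login,1,run_first
t_styles,3,defer
"""
print(select_tests(report))
```

The selected IDs can then be passed to the test runner, e.g. `pytest t_login t_checkout`, while lower-ranked buckets run in a later, non-blocking stage.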

GitHub setup instructions

Create the repository with:

  • Repository name: explainable-test-prioritizer
  • Description: Explainable AI-driven test prioritization engine that optimizes CI/CD pipelines by predicting high-risk tests and providing transparent decision explanations using SHAP.
  • Visibility: Public

Recommended options:

  • Add a README file
  • Add .gitignore = Python
  • Add license = MIT

Topics to add:

  • ai-testing
  • test-automation
  • machine-learning
  • explainable-ai
  • software-quality
  • ci-cd

Push this code:

git init
git add .
git commit -m "Initial commit - Explainable AI Test Prioritization Engine"
git branch -M main
git remote add origin https://github.com/yourusername/explainable-test-prioritizer.git
git push -u origin main

If the repo already exists:

git clone https://github.com/yourusername/explainable-test-prioritizer.git
cd explainable-test-prioritizer
# copy extracted project files here
git add .
git commit -m "Add full project code and documentation"
git push origin main

Create the first release:

  • Tag: v1.0.0
  • Title: Initial Release - Explainable AI Test Prioritization Engine

Research contribution

This project extends explainable AI in software engineering from defect prediction into actionable test selection and prioritization for CI/CD pipelines.

Related publication:

S. Kavuri, "An Explainable Machine Learning Framework for Predicting Software Defects in Large-Scale Software Systems," 2026 IEEE 5th International Conference on AI in Cybersecurity (ICAIC), Houston, TX, USA, 2026, pp. 1-6, doi: 10.1109/ICAIC67076.2026.11395777.

Future enhancements

  • GitHub Actions workflow to auto-prioritize on pull requests
  • Dashboard visualizations for priority trends
  • Reinforcement learning for adaptive prioritization
  • PyPI publishing
  • DOI archival through Zenodo

License

MIT License
