Overview
Within the CCPBioSim organisation there is a wide range of software packages supporting different areas of biomolecular simulation, often developed independently and at different stages of maturity. While this reflects the breadth and strength of the community, it can also make it difficult to discover, understand, and keep track of the tools that are available.
This project is an opportunity to explore how we can bring more structure and visibility to this ecosystem by building a small federated system, where each repository can describe itself and automatically contribute to a central, up-to-date catalogue. The aim is to prototype an approach that could scale as more tools are developed, making it easier for researchers to find, understand, and use software across CCPBioSim.
Problem
In large research projects, software is often spread across many repositories. This can make it difficult to:
- know what tools exist
- find up-to-date information
- maintain consistent documentation
We want to explore a solution where:
- each repository describes itself using structured metadata
- this information is processed automatically
- a central system collects everything into a single, organised index
Your Task
You will work together to design and build a simplified prototype of this system.
The system should:
- allow a repository to define its own metadata
- process that metadata using a script
- run this process automatically using CI (GitHub Actions)
- collect and store the results in a central "registry" repository
Suggested Steps
You do not need to complete everything, but you should aim to build a working part of the system by breaking it into smaller tasks.
Examples of steps include:
- Create a metadata file in a repository (software.yml)
- Define basic fields such as name, version, and description
- Write a Python script to read and process the metadata
- Convert the metadata into JSON format
- Set up a GitHub Actions workflow to run the script
- Create a central file (assets.json) to store aggregated data
- Add data from one repository into the central registry
- (Optional) integrate this with the CCPBioSim Developer Organisation
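The middle steps above (read the metadata, convert it to JSON, add it to the central file) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example package are hypothetical, the registry is assumed to be a JSON object keyed by package name, and the tiny `key: value` parser is a stand-in that handles flat fields only. In practice a YAML library (for example PyYAML's `yaml.safe_load`) would be the usual way to read software.yml.

```python
import json
import tempfile
from pathlib import Path


def parse_flat_metadata(text: str) -> dict:
    """Parse simple `key: value` lines (a small, flat subset of YAML).

    Stand-in for a real YAML parser; nested structures are not supported.
    """
    metadata = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got: {line!r}")
        # Drop surrounding whitespace and optional quotes around the value.
        metadata[key.strip()] = value.strip().strip("\"'")
    return metadata


def update_registry(registry_path: Path, entry: dict) -> dict:
    """Insert or replace one entry in the registry file, keyed by its name."""
    if registry_path.exists():
        registry = json.loads(registry_path.read_text())
    else:
        registry = {}
    registry[entry["name"]] = entry
    registry_path.write_text(json.dumps(registry, indent=2, sort_keys=True) + "\n")
    return registry


# Hypothetical contents of a software.yml file, inlined so the sketch runs anywhere.
EXAMPLE_METADATA = """\
name: example-tool
version: 0.1.0
description: Hypothetical package used only for illustration
"""

with tempfile.TemporaryDirectory() as tmp:
    registry_file = Path(tmp) / "assets.json"
    update_registry(registry_file, parse_flat_metadata(EXAMPLE_METADATA))
    print(registry_file.read_text())
```

Keying the registry by name means that re-running the script for the same repository replaces its entry rather than duplicating it, which suits a workflow that runs on every push. A list of entries would work too; this is a design choice for the team to make.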
Tools You Will Use
- Git and GitHub
- Python
- YAML and JSON
- GitHub Actions (CI workflows)
How You Will Work
- You will work together on this as a shared project
- You will break the work into smaller, manageable tasks
- You are encouraged to experiment and try different approaches
- You will work in a sandbox GitHub organisation, so there is no risk to real systems
What Success Looks Like
You are not expected to build a complete system.
A successful outcome could include:
- a working script that reads and converts metadata
- a CI workflow that runs automatically
- a basic central registry file (assets.json)
- or a combination of these working together
You should aim to build something that you can clearly explain and demonstrate at the end of the week.
Notes
- This is a prototype project, not a finished product
- Focus on making things work, rather than making them perfect
- Try to understand how the different parts connect together
- Ask questions and work through problems as a team
Stretch Ideas (Optional)
If you have time, you could explore:
- validating metadata automatically
- improving the structure of the registry
- generating documentation from your data
- supporting multiple repositories
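As a starting point for the first and third ideas, the sketch below checks a metadata record for required fields and renders a registry as a plain-text listing. It is an illustration rather than a prescribed design: the function names are hypothetical, the required fields are assumed to be the name, version, and description suggested earlier, and the registry is assumed to be a dictionary keyed by package name.

```python
REQUIRED_FIELDS = ("name", "version", "description")


def validate_metadata(metadata: dict) -> list[str]:
    """Return a list of problems found; an empty list means the record is valid."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None:
            errors.append(f"missing required field: {field}")
        elif not isinstance(value, str) or not value.strip():
            errors.append(f"field must be a non-empty string: {field}")
    return errors


def render_catalogue(registry: dict) -> str:
    """Render one line per package, sorted by name."""
    lines = []
    for name in sorted(registry):
        entry = registry[name]
        lines.append(f"- {name} ({entry['version']}): {entry['description']}")
    return "\n".join(lines)


# Hypothetical records used only to exercise the two functions.
good = {"name": "example-tool", "version": "0.1.0", "description": "Illustrative package"}
bad = {"name": "broken-tool", "version": 1.2}

print(validate_metadata(good))
print(validate_metadata(bad))
print(render_catalogue({good["name"]: good}))
```

Returning a list of messages, rather than raising on the first problem, lets a CI job report every issue in one run. The non-string check is worth keeping if you read the file with a YAML library, because an unquoted version such as 1.2 is loaded as a number rather than a string.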
Outcome
By the end of this project, you should have:
- experience working with real development tools
- an introduction to CI workflows and automation
- an understanding of how software can be organised across multiple repositories
- a piece of work you can demonstrate and discuss