Open-source biomedical informatics infrastructure for interoperable genomic, phenotypic, clinical, and AI-assisted research workflows.
We are a biomedical informatics research and software engineering initiative affiliated with the Biomedical Genomics Group at CNAG.
We develop practical, standards-driven tools for genomic, phenotypic, and clinical data interoperability. We build reusable infrastructure for translational research, ranging from sequencing pipelines and biomedical data harmonization to semantic phenotype analysis and AI-assisted biomedical workflows.
This work follows the evolution of computational biology itself, from structural bioinformatics and genome analysis to interoperable clinical data systems.
Our software ecosystem is written primarily in Python, Perl, JavaScript/React, and R.
Our work focuses on:
- 🔄 Biomedical data interoperability
- 🧬 Translational informatics
- 🤖 AI-assisted biomedical systems
- 🔓 Sustainable open infrastructure
We build practical tools that help researchers and clinicians work with heterogeneous biomedical data using transparent, standards-driven workflows.
Our work draws on more than two decades of experience in computational biology and bioinformatics, spanning successive sequencing technologies, biomedical standards, and computational systems.
We prioritize:
- 🔄 Interoperability over isolated silos
- 🧪 Reproducibility through reusable workflows
- 🏗️ Sustainable infrastructure instead of short-lived prototypes
- 📚 Open science and accessible software
- 🤖 Pragmatic AI adoption grounded in structured biomedical data
We believe robust biomedical AI depends on reliable infrastructure, interoperable standards, transparent data models, and sustainable software ecosystems.
Interconversion between biomedical and phenotypic data standards.
- Phenopackets interoperability
- REDCap, OMOP-CDM, and CDISC-ODM support
- CLI, API, and Web UI
- Dockerized deployment
Resources:
- https://github.com/CNAG-Biomedical-Informatics/convert-pheno
- https://github.com/CNAG-Biomedical-Informatics/convert-pheno-ui
- https://cnag-biomedical-informatics.github.io/convert-pheno
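To give a feel for what interconversion between data models involves, here is a minimal sketch in Python. It does not use Convert-Pheno or reproduce its mapping logic. The input field names (`record_id`, `sex`, `hpo_terms`), the sex coding, and the `ID|label` term convention are all assumptions made for illustration. The output follows the general shape of a GA4GH Phenopacket (`id`, `subject`, `phenotypicFeatures`) but is not validated against the schema.

```python
from typing import Any

# Hypothetical coding of the "sex" field in a flat, REDCap-style export row.
SEX_MAP = {"1": "MALE", "2": "FEMALE"}


def flat_record_to_phenopacket(record: dict[str, str]) -> dict[str, Any]:
    """Map a flat record to a minimal Phenopacket-like dictionary."""
    features = []
    # Assumed convention: terms are stored as "HP:0001250|Seizure;HP:...|..."
    for item in filter(None, record.get("hpo_terms", "").split(";")):
        term_id, _, label = item.partition("|")
        features.append({"type": {"id": term_id.strip(), "label": label.strip()}})
    return {
        "id": f"phenopacket-{record['record_id']}",
        "subject": {
            "id": record["record_id"],
            # Fall back to the Phenopackets "unknown" value for unmapped codes.
            "sex": SEX_MAP.get(record.get("sex", ""), "UNKNOWN_SEX"),
        },
        "phenotypicFeatures": features,
    }


row = {
    "record_id": "P001",
    "sex": "2",
    "hpo_terms": "HP:0001250|Seizure;HP:0001263|Global developmental delay",
}
packet = flat_record_to_phenopacket(row)
```

Real conversions also have to handle ontology mapping, units, dates, and missing values, which is where a dedicated toolkit becomes necessary.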
GA4GH Beacon v2 interoperability, validation, and ingestion tooling.
Validation workflows for OMOP-CDM CSV datasets.
Semantic comparison and ranking of interoperable phenotypic data.
- Semantic similarity workflows
- Cross-format interoperability
- Interactive Web UI
Resources:
- https://github.com/CNAG-Biomedical-Informatics/pheno-ranker
- https://github.com/CNAG-Biomedical-Informatics/pheno-ranker-ui
- https://cnag-biomedical-informatics.github.io/pheno-ranker
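The idea of comparing and ranking individuals by their phenotypic data can be illustrated with a small Python sketch. It is a simplification and does not reproduce Pheno-Ranker's implementation: each individual is reduced to a set of ontology term IDs and compared with a plain Jaccard index, which ignores ontology structure. The patient IDs and term sets are made up for the example.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index of two term sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_by_similarity(
    target: set[str], cohort: dict[str, set[str]]
) -> list[tuple[str, float]]:
    """Rank cohort members by similarity to the target, most similar first."""
    scores = [(pid, jaccard(target, terms)) for pid, terms in cohort.items()]
    # Sort by descending score, breaking ties by identifier for stable output.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical cohort: identifier -> set of HPO term IDs.
cohort = {
    "P001": {"HP:0001250", "HP:0001263"},
    "P002": {"HP:0001250"},
    "P003": {"HP:0000365"},
}
target = {"HP:0001250", "HP:0001263"}
ranking = rank_by_similarity(target, cohort)
# ranking == [("P001", 1.0), ("P002", 0.5), ("P003", 0.0)]
```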
Schema-driven biomedical identifier generation and validation.
Resources:
- https://github.com/CNAG-Biomedical-Informatics/clarid-tools
- https://cnag-biomedical-informatics.github.io/clarid-tools
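As a rough illustration of schema-driven identifier generation and validation, the Python sketch below builds and parses identifiers from an ordered list of field patterns. The schema, field names, codes, and separator are hypothetical. They are not the ClarID specification and do not reflect the clarid-tools interface.

```python
import re

# Hypothetical schema: ordered fields, each constrained by a regular expression.
SCHEMA = [
    ("project", r"[A-Z]{3,8}"),
    ("subject", r"\d{5}"),
    ("sample_type", r"(?:BLD|TUM|SAL)"),
]
SEPARATOR = "-"


def build_id(**values: str) -> str:
    """Assemble an identifier, validating each field against the schema."""
    parts = []
    for name, pattern in SCHEMA:
        value = str(values[name])
        if re.fullmatch(pattern, value) is None:
            raise ValueError(f"invalid value for {name!r}: {value!r}")
        parts.append(value)
    return SEPARATOR.join(parts)


def parse_id(identifier: str) -> dict[str, str]:
    """Split an identifier into named fields, validating each one."""
    parts = identifier.split(SEPARATOR)
    if len(parts) != len(SCHEMA):
        raise ValueError(f"expected {len(SCHEMA)} fields, got {len(parts)}")
    fields = {}
    for (name, pattern), value in zip(SCHEMA, parts):
        if re.fullmatch(pattern, value) is None:
            raise ValueError(f"invalid value for {name!r}: {value!r}")
        fields[name] = value
    return fields


sample_id = build_id(project="DEMO", subject="00042", sample_type="BLD")
fields = parse_id(sample_id)
```

Keeping the schema as data means generation and validation share one definition, so the two cannot drift apart.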
Configuration-driven genomic variant-calling workflows.
A new generation of patient-centric biomedical navigation tools integrating interoperable clinical data, semantic systems, and AI-assisted workflows.
- GA4GH Beacon v2
- GA4GH Phenopackets
- OMOP-CDM
- REDCap
- CDISC-ODM
- openEHR
- JSON Schema
- Python
- Perl
- JavaScript / React
- R
- Docker
- MongoDB
- Snakemake
- REST APIs
- Local LLMs
- MCP-compatible workflows
We actively collaborate within the ELIXIR community and participate in European initiatives focused on interoperable biomedical data, federated analytics, translational informatics, and precision medicine.
Large-scale European precision medicine initiative studying treatment response and molecular mechanisms across immune-mediated diseases.
Federated multimodal biomedical data integration platform for privacy-preserving clinical, genomic, and semantic analytics.
https://hereditary-project.eu/
Rueda M, Gut IG. ClarID: a human-readable and compact identifier specification for biomedical metadata integration. Journal of Biomedical Semantics (2026).
https://doi.org/10.1186/s13326-026-00349-6
Rueda M, et al. Beacon v2 Reference Implementation: a toolkit to enable federated sharing of genomic and phenotypic data. Bioinformatics (2022).
https://doi.org/10.1093/bioinformatics/btac568
Rueda M, et al. Enhancing Semantic Interoperability in Precision Medicine: Converting OMOP CDM to Beacon v2 in the Spanish IMPaCT-Data Project. medRxiv (2024).
https://doi.org/10.1101/2024.12.25.24319606
Rueda M, et al. Convert-Pheno: a software toolkit for the interconversion of standard data models for phenotypic data. Journal of Biomedical Informatics (2023).
https://doi.org/10.1016/j.jbi.2023.104558
Rueda M, et al. Pheno-Ranker: a toolkit for comparison of phenotypic data stored in GA4GH standards and beyond. BMC Bioinformatics (2024).
https://doi.org/10.1186/s12859-024-05993-2
We believe sustainable biomedical infrastructure benefits from open collaboration across research, clinical, and engineering communities.
We welcome:
- scientific collaborations
- interoperability initiatives
- standards-related projects
- open-source contributions
- bug reports and feature requests
- https://github.com/CNAG-Biomedical-Informatics
- https://cnag-biomedical-informatics.github.io/convert-pheno
- https://cnag-biomedical-informatics.github.io/pheno-ranker
- https://cnag-biomedical-informatics.github.io/clarid-tools
Building sustainable biomedical informatics infrastructure for interoperable and AI-assisted research.