Co-leads:
- Carissa Bleker - Department of Biotechnology and Systems Biology, National Institute of Biology, Slovenia
- Sebastian Lobentanzer - Computational Health Center, Helmholtz Munich, Germany
SBML (Systems Biology Markup Language) and SBGN-ML (Systems Biology Graphical Notation Markup Language) enable structured, reusable models of biological systems. However, these XML-based formats are hard for humans to read and hard to query, require specialized tools for processing and visualization, and offer limited interoperability. In contrast, knowledge graphs excel at traversing complex relationships and provide dynamic interfaces for exploration. Graph databases also serve as a platform for enriching models with information and metadata not encoded in SBML or SBGN-ML, drawn from resources such as KEGG, Open Targets, OmniPath, ENCODE/IGVF, and ontologies in OWL format. Making systems biology models machine-accessible in an enriched knowledge graph format will also enable integration with AI technologies such as large language models (LLMs), building on existing advances in generative AI such as Retrieval-Augmented Generation (RAG) and the Model Context Protocol (MCP), and so make the models accessible to a wide range of users.
This project will bring together expertise in systems biology, knowledge management, and LLMs to develop a common labelled property graph schema and bidirectional protocols for transforming systems biology models into human- and AI-accessible knowledge graphs. We will review how well existing schemas support AI approaches, and consolidate and enhance ongoing efforts (e.g. stonpy, skm, neo4jsbml) to transform biological models into labelled property graphs. We will focus specifically on developing a framework that uses Neo4j and the BioCypher infrastructure.
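To make the intended transformation concrete, the following is a minimal sketch of how an SBML file might be turned into labelled property graph records using only the Python standard library. The toy model, the node labels (`Species`, `Reaction`) and the edge labels (`REACTANT_OF`, `PRODUCES`) are illustrative assumptions and not the schema the project will produce. The tuple shapes, (id, label, properties) for nodes and (id, source, target, label, properties) for edges, are chosen here on the assumption that they resemble what a BioCypher adapter would yield.

```python
import xml.etree.ElementTree as ET

SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Toy document made up for illustration; it omits attributes that a
# fully valid SBML Level 3 model would require.
TOY_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfSpecies>
      <species id="glucose" name="Glucose" compartment="cytosol"/>
      <species id="g6p" name="Glucose 6-phosphate" compartment="cytosol"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="hexokinase" reversible="false">
        <listOfReactants><speciesReference species="glucose"/></listOfReactants>
        <listOfProducts><speciesReference species="g6p"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def sbml_to_graph(sbml_text):
    """Return (nodes, edges) extracted from an SBML string.

    Labels and property names are hypothetical placeholders.
    """
    root = ET.fromstring(sbml_text.strip())
    nodes, edges = [], []

    # Each species becomes a node carrying its name and compartment.
    for sp in root.iterfind(".//sbml:species", SBML_NS):
        props = {"name": sp.get("name"), "compartment": sp.get("compartment")}
        nodes.append((sp.get("id"), "Species", props))

    # Each reaction becomes a node linked to its reactants and products.
    for rx in root.iterfind(".//sbml:reaction", SBML_NS):
        rid = rx.get("id")
        nodes.append((rid, "Reaction", {"reversible": rx.get("reversible") == "true"}))
        for ref in rx.iterfind("sbml:listOfReactants/sbml:speciesReference", SBML_NS):
            sid = ref.get("species")
            edges.append((f"{sid}->{rid}", sid, rid, "REACTANT_OF", {}))
        for ref in rx.iterfind("sbml:listOfProducts/sbml:speciesReference", SBML_NS):
            sid = ref.get("species")
            edges.append((f"{rid}->{sid}", rid, sid, "PRODUCES", {}))
    return nodes, edges


nodes, edges = sbml_to_graph(TOY_SBML)
```

Modelling reactions as nodes rather than edges is one possible design choice: it keeps reactions with several reactants, products or modifiers representable without hyperedges.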
Biohackathon outcomes:
- A common (and extensible) labelled property graph schema for systems biology models, built on existing standard ontologies (e.g. Biolink, SBO, EDAM, KiSAO),
- BioCypher adapters for SBML and SBGN-ML, using the common schema,
- SBML and SBGN-ML export functionalities, and
- One or more example applications, using either participant-provided use cases or models provided in BioModels (e.g. disease maps, metabolic maps, signalling networks, ODE models, Boolean models, GEMs).
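The export direction listed above could be sketched as follows, again with the standard library only. The sketch assumes graph records shaped as (id, label, properties) tuples for nodes and (id, source, target, label, properties) tuples for edges. The labels `Species`, `Reaction`, `REACTANT_OF` and `PRODUCES` and the function `graph_to_sbml` are hypothetical, and the output is a skeleton rather than a fully valid SBML document.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SBML_URI = "http://www.sbml.org/sbml/level3/version1/core"
SPECIES_ATTRS = ("name", "compartment")  # node properties written back as XML attributes


def _q(tag):
    """Qualify a tag name with the SBML namespace."""
    return f"{{{SBML_URI}}}{tag}"


def graph_to_sbml(model_id, nodes, edges):
    """Serialise graph records to a minimal SBML string (illustrative only)."""
    ET.register_namespace("", SBML_URI)  # emit SBML as the default namespace

    # Group species by the reaction they participate in.
    reactants, products = defaultdict(list), defaultdict(list)
    for _edge_id, source, target, label, _props in edges:
        if label == "REACTANT_OF":
            reactants[target].append(source)
        elif label == "PRODUCES":
            products[source].append(target)

    root = ET.Element(_q("sbml"), {"level": "3", "version": "1"})
    model = ET.SubElement(root, _q("model"), {"id": model_id})
    species_list = ET.SubElement(model, _q("listOfSpecies"))
    reaction_list = ET.SubElement(model, _q("listOfReactions"))

    for node_id, label, props in nodes:
        if label == "Species":
            attrs = {"id": node_id}
            attrs.update({k: str(props[k]) for k in SPECIES_ATTRS if props.get(k) is not None})
            ET.SubElement(species_list, _q("species"), attrs)
        elif label == "Reaction":
            rx = ET.SubElement(reaction_list, _q("reaction"), {"id": node_id})
            # Reactants are written before products.
            for list_tag, members in (
                ("listOfReactants", reactants[node_id]),
                ("listOfProducts", products[node_id]),
            ):
                if members:
                    holder = ET.SubElement(rx, _q(list_tag))
                    for sid in members:
                        ET.SubElement(holder, _q("speciesReference"), {"species": sid})
    return ET.tostring(root, encoding="unicode")


example_nodes = [
    ("glucose", "Species", {"name": "Glucose", "compartment": "cytosol"}),
    ("g6p", "Species", {"name": "Glucose 6-phosphate", "compartment": "cytosol"}),
    ("hexokinase", "Reaction", {}),
]
example_edges = [
    ("e1", "glucose", "hexokinase", "REACTANT_OF", {}),
    ("e2", "hexokinase", "g6p", "PRODUCES", {}),
]
sbml_text = graph_to_sbml("toy_model", example_nodes, example_edges)
```

A bidirectional protocol would additionally need to preserve whatever SBML content has no counterpart in the graph (units, kinetic laws, annotations); this sketch does not attempt that.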
Stretch goals and future work:
- A user application that can load an SBML or SBGN-ML model into Neo4j,
- A BioChatter service for systems biology models, and
- Tools for merging models in Neo4j.
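As a rough illustration of loading and merging, one option is to generate parameterised Cypher `MERGE` statements keyed on a shared identifier, so that an entity appearing in two models resolves to a single node. The helper names, labels and the `id` property below are hypothetical, and the sketch only builds the statements; it does not connect to a database.

```python
import re

# Labels and relationship types cannot be passed as Cypher parameters here,
# so they are validated before being interpolated into the query text.
_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name):
    if not _NAME_RE.match(name):
        raise ValueError(f"unsafe label or relationship type: {name!r}")
    return name


def node_merge_query(label):
    """MERGE creates the node only if no node with this label and id exists."""
    return f"MERGE (n:{_check(label)} {{id: $id}}) SET n += $props"


def edge_merge_query(rel_type):
    return (
        "MATCH (a {id: $source}), (b {id: $target}) "
        f"MERGE (a)-[r:{_check(rel_type)}]->(b) SET r += $props"
    )


def build_statements(nodes, edges):
    """Return (query, parameters) pairs; node statements come first."""
    statements = [
        (node_merge_query(label), {"id": node_id, "props": props})
        for node_id, label, props in nodes
    ]
    statements += [
        (edge_merge_query(label), {"source": source, "target": target, "props": props})
        for _edge_id, source, target, label, props in edges
    ]
    return statements


# Two toy models that both mention glucose; MERGE on the shared id means
# the second statement updates the existing node instead of duplicating it.
model_a = [("glucose", "Species", {"name": "Glucose"})]
model_b = [("glucose", "Species", {"compartment": "cytosol"})]
statements = build_statements(model_a + model_b, [])
```

Each pair could then be executed with the official Neo4j Python driver, for instance via `session.run(query, params)`. Merging on a bare `id` only works if identifiers are harmonised across models, which in practice would likely require mapping to shared ontology or database identifiers first.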
We welcome participants familiar with one or more of the following:
- Python programming,
- Graph databases and knowledge graphs, and
- Systems biology standards and ontologies.