(Bio)Schemas4NFDI, lightweight domain metadata (not only) for NFDI consortia

Bioschemas is a community effort to improve the FAIRness of resources in the life sciences by defining specific metadata schemas as JSON-LD and exposing that metadata from resources that have adopted it. To this end, it offers some tailored types (e.g., protein, chemical substance) as well as recommendations, known as profiles, on how to use generic types in the sciences (e.g., dataset, training material, application tools and workflows) that are readily applicable in many neighboring disciplines. In addition to types and profiles, Bioschemas offers validation and harvesting tools, making it easier to comply with the specifications and to consume the markup.

A simple and direct approach to FAIRness is a key element across different NFDI consortia. Getting Bioschemas adopted and adapted across NFDI consortia, i.e., Schemas4NFDI, would facilitate an initial lightweight and common approach to implementing the FAIR principles on the corresponding web pages and (possibly) beyond. In this BioHackathon Germany, we propose to bring together metadata experts from Bioschemas and NFDI consortia to adopt and adapt Bioschemas to NFDI use cases. Several NFDI consortia have already expressed their interest in this effort, e.g., NFDI4DataScience, NFDI4Microbiota, and NFDI4Chem, and we expect more to join. We expect de.NBI / ELIXIR to greatly benefit from this proposal because it furthers the adoption of Bioschemas, and many NFDI consortia have a significant overlap with de.NBI service centers, associated de.NBI partners and ELIXIR communities. Members of ELIXIR-DE are already using Bioschemas, e.g., in the Galaxy Training Network (GTN), MassBank and RDMKit.

Several hacking topics have already been identified by prospective participants. On the provider side, we will work on adapting and creating JSON-LD schema metadata exports from participants' resources. We will start by understanding and using generic profiles, including dataset and computational tool (extending or adjusting the profiles for machine/deep learning applications, NFDI4DataScience), training material (NFDI4Microbiota), some of the life science types such as chemical substances, molecules and reactions (e.g., Chemotion, nmrXiv, or the de.NBI Service MassBank from CIBI/NFDI4Chem), and a cross-walk from the metadata schema developed in GHGA. Our approach will be “bring your own data and mark it up with Bioschemas”, providing tooling support for efficient implementations and active discussion on what additional support, e.g., types and profiles, is needed in the NFDI consortia.
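To give a flavor of what such a provider-side export could look like, here is a minimal sketch in Python that builds a schema.org Dataset description as JSON-LD and wraps it in the script element commonly used to embed markup in a web page. The function names (dataset_jsonld, as_script_tag) and all example values are hypothetical, and the selection of properties is only illustrative; the applicable Bioschemas profile should be consulted for the properties it actually requires.

```python
import json


def dataset_jsonld(name, description, url, identifier, keywords, license_url):
    """Build a schema.org Dataset description as a JSON-LD dictionary.

    The chosen properties are an illustrative assumption, not a
    statement of what any particular Bioschemas profile requires.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": identifier,
        "keywords": keywords,
        "license": license_url,
    }


def as_script_tag(doc):
    """Serialize a JSON-LD dictionary into an HTML script element."""
    # Escape "</" so that a string value cannot close the script element early;
    # "\/" is a valid JSON escape, so the content still parses to the same data.
    body = json.dumps(doc, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values for illustration only.
example = dataset_jsonld(
    name="Example spectra collection",
    description="A placeholder dataset used to illustrate the markup.",
    url="https://example.org/datasets/1",
    identifier="example-dataset-1",
    keywords=["mass spectrometry", "example"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)
print(as_script_tag(example))
```

The resulting snippet can be placed in the head of the resource's landing page, where harvesters can pick it up.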
On the consumer side, participants will work on ingesting this metadata into 1) a generic OAI-PMH metadata server, 2) generic knowledge graphs, and 3) a chemistry-specific CKAN repository.
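As a rough illustration of the first step any such consumer needs, the following Python sketch extracts embedded JSON-LD from a web page using only the standard library. The class and function names (JsonLdExtractor, extract_jsonld) and the sample page are hypothetical; mapping the extracted records to OAI-PMH, a knowledge graph or CKAN is deliberately left out, since it depends on the target system.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the content of all application/ld+json script elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed markup instead of failing the harvest
            # A script element may hold one object or a list of objects.
            if isinstance(parsed, list):
                self.documents.extend(parsed)
            else:
                self.documents.append(parsed)


def extract_jsonld(html):
    """Return all JSON-LD documents embedded in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical page content for illustration only.
page = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset",
 "name": "Example dataset", "url": "https://example.org/datasets/1"}
</script>
</head><body></body></html>"""

datasets = [d for d in extract_jsonld(page)
            if isinstance(d, dict) and d.get("@type") == "Dataset"]
for d in datasets:
    print(d["name"], d["url"])
```

A real harvester would additionally fetch pages, e.g., from a sitemap, and would likely use a full JSON-LD processor to resolve contexts rather than matching on "@type" directly.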

We expect participants to exchange preliminary work (what data already exists, which APIs or interfaces are available, and which import/export formats can already be handled) by email, Google Docs, etc. prior to the hackathon, and to use, e.g., GitHub issues to track progress beyond the hackathon as well.

Project Lead: Steffen Neumann and Leyla Jael Castro
