MiCoDa

Making the data from the world's largest publicly accessible microbiome database reusable in the long term – that is the goal of the Use Case project Microbial Community Database (MiCoDa).

“MiCoDa is a rescue mission for metabarcoding data.”

Stephanie Jurburg, Department of Applied Microbial Ecology, Helmholtz Centre for Environmental Research (UFZ)

About MiCoDa

MiCoDa – the Microbial Community Database – was developed by the Helmholtz Centre for Environmental Research (UFZ), the German Centre for Integrative Biodiversity Research (iDiv) and the Friedrich Schiller University Jena (FSU) to improve the discoverability, interoperability and reusability of sequence data. MiCoDa is a public, curated, searchable and interoperable 16S rRNA gene metabarcoding database. Its first version hosts more than 35,000 processed microbiome samples, which researchers can search and download for their own use. MiCoDa is the largest database of its kind in the world.

MiCoDa retains INSDC accession numbers as data IDs and uses the Earth Microbiome Project Ontology (EMPO) and the MIxS (Minimum Information about any (x) Sequence) standards. The database also links sequence data to the publications (DOIs) in which they were first presented, facilitating automatic integration with bibliometric data. Through careful curation and bioinformatic processing, all sequences in MiCoDa cover the same segment of the 16S rRNA gene, which serves as a permanent taxon/species identifier. This enables cross-study comparisons of specific bacterial taxa as well as comparisons with other databases (e.g. via NCBI's Nucleotide BLAST).
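As a minimal sketch of why a shared gene segment makes cross-study comparison straightforward, the following Python example treats the sequence itself as the taxon identifier and looks up which sequences occur in more than one study. The accession numbers, sequences, data layout and function names are hypothetical placeholders for illustration; they do not reflect MiCoDa's actual export format or any MiCoDa API.

```python
from collections import defaultdict

# Hypothetical per-study tables: sample accession -> {sequence: read count}.
# Accessions and sequences are made-up placeholders, not real MiCoDa records.
study_a = {
    "SRR0000001": {"ACGTACGT": 120, "TTGACCGA": 30},
}
study_b = {
    "ERR0000002": {"ACGTACGT": 45, "GGCATTAC": 10},
}


def samples_per_sequence(*studies):
    """Map each sequence to the set of sample accessions in which it occurs."""
    index = defaultdict(set)
    for study in studies:
        for accession, counts in study.items():
            for sequence, count in counts.items():
                if count > 0:
                    index[sequence].add(accession)
    return dict(index)


def observed_sequences(study):
    """Collect every sequence with a non-zero count in any sample of a study."""
    return {
        sequence
        for counts in study.values()
        for sequence, count in counts.items()
        if count > 0
    }


def shared_sequences(study_x, study_y):
    """Return the sequences observed in both studies."""
    return observed_sequences(study_x) & observed_sequences(study_y)


if __name__ == "__main__":
    print(shared_sequences(study_a, study_b))
    print(samples_per_sequence(study_a, study_b))
```

Because every sequence covers the same gene segment, a plain string match is enough to identify the same taxon in both studies; no study-specific clustering or renaming step is needed in this sketch.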

MiCoDa collects data in two ways: automatically, using validated text-parsing algorithms (Jurburg et al. 2020), and through direct, targeted collaboration with data collectors, especially those from biodiversity blind spots, with the aim of involving data producers in archiving and reuse.

At annual events held in different parts of the world, so-called datathons, participants are trained to archive sequences in INSDC databases using dedicated guidelines (available in Spanish, for example) and are encouraged to deposit data and then reuse it themselves. These efforts have already created a broad network of disciplinary users.

MiCoDa aims to improve the reusability of data by means of the following measures:

  • Involvement of data collectors in data archiving and reuse through annual datathons
  • Improving the accessibility and comparability of bacterial amplicon sequence data by providing the data in a ready-to-use format that uses a standardized and universal taxon definition
  • Enrichment of existing metadata associated with metabarcoding from linked literature

The Use Case MiCoDa

The MiCoDa use case aims to align the data portal with the technical standards of NFDI4Biodiversity so that its data can be integrated seamlessly into the Research Data Commons (RDC), the data infrastructure being built in NFDI4Biodiversity. This is intended to increase the visibility, accessibility and reusability of the data in MiCoDa. In addition, training and education on MiCoDa are intended to promote exchange between the research data management community and the microbiome data community; members of the consortium and interested external parties are cordially invited.

Better standardization and harmonization of data formats and metadata are intended to improve the availability and reusability of the data. Biodiversity research will also benefit, as researchers and other interest groups will have easy access to high-quality microbiome data. With iDiv joining NFDI4Biodiversity as a partner institution in the consortium's second funding phase from 2025, the connection to a strong network of biodiversity researchers will be further strengthened, and the project will be able to benefit from a dedicated community and its resources. In addition, MiCoDa's training program will be complemented by training courses with a focus on data processing.

Further information

Contact

Would you like to find out more about the MiCoDa use case? You are welcome to contact the people involved.

Use Case Manager (NFDI4Biodiversity)

Sarah Fischer (fischer.sarah@fbn-dummerstorf.de)

Use case partner (MiCoDa)

Stephanie Jurburg (stephanie.jurburg@ufz.de)

News articles on the topic

The consortium is growing!

New use case project launched: "MiCoDa is meant to...

New additions to the NFDI4Biodiversity use case projects! In an interview, Stephanie Jurburg talks about how she brought the metabarcoding database MiCoDa to...