Over the years, we have had the pleasure of collaborating with a number of superb human beings. Any list presented here would necessarily be incomplete, so we list a number of projects that should cover most of the scientific, infrastructural and political themes that drive our collaborations. In alphabetical order:

In the following, we outline further involvements in international collaborative projects:

Computer-Assisted Structure Elucidation

With Prof. Emma Schymanski at the University of Luxembourg and our joint PhD student Adelene Lai, we are investigating cheminformatics approaches to identify unknowns in mixtures and biological systems.

Standards

NMReData Initiative

We are a member of the NMReData Initiative supporting a light-weight data format to report NMR data, structures and assignments for small molecules in articles.

Ontologies

We are actively involved in several aspects of the development, adoption and dissemination of ontologies as standards for the annotation of life-science data. Ontologies are structured controlled vocabularies that have several features that make them ideal for the standardisation of annotations, including hierarchical organisation for flexible aggregation, semantics-free stable identifiers, and a plug-in architecture without dependence on a fixed database schema. We developed the ChEBI database and ontology for chemical entities of biological interest [1]. ChEBI is the chemical ontology of choice for many life science data annotation projects, and has been adopted by the OBO Foundry as the reference ontology for chemical entities. ChEBI is also used by the Gene Ontology to identify chemicals in chemical-involving processes and functions.
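As an illustration of the hierarchical aggregation described above, the following is a minimal sketch using a toy ontology. The identifiers (`EX:0001` and so on), the labels in the comments and the helper names are invented for this example and are not real ChEBI entries or part of any ChEBI API. The point is only that annotations attached to specific terms can be rolled up to any ancestor term, and that the identifiers themselves carry no meaning.

```python
from collections import Counter

# Hypothetical toy ontology: each term maps to its parent terms ("is a" links).
# The identifiers are invented and deliberately semantics-free.
IS_A = {
    "EX:0004": ["EX:0002"],  # e.g. a specific monosaccharide
    "EX:0005": ["EX:0002"],  # e.g. another specific monosaccharide
    "EX:0002": ["EX:0001"],  # e.g. the class "monosaccharide"
    "EX:0003": ["EX:0001"],  # e.g. the class "disaccharide"
    "EX:0001": [],           # e.g. the root class "carbohydrate"
}


def ancestors(term):
    """Return the term itself plus every term reachable via is-a links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, []))
    return seen


def aggregate(annotations):
    """Count each annotation under its own term and under all ancestors."""
    totals = Counter()
    for term in annotations:
        for t in ancestors(term):
            totals[t] += 1
    return totals


# Four annotated records, each tagged with one term identifier.
annotations = ["EX:0004", "EX:0004", "EX:0005", "EX:0003"]
totals = aggregate(annotations)

for term in sorted(totals):
    print(term, totals[term])
```

Because the counts are computed from the is-a links rather than from a fixed schema, the same annotations can be summarised at whichever level of the hierarchy a given analysis needs.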

We have developed the CHEMINF ontology for chemical information entities [2], such as descriptors, algorithms and toolkits, for use in providing provenance and disambiguation for the properties of chemical entities being made available as open data in the context of the Semantic Web.

Metabolomics

Our group led the COSMOS effort, Coordination of Standards in MetabOlomicS [3], aiming to drive forward the definition and adoption of standards for data exchange and annotation in the field of metabolomics. Metabolomics is an important phenotyping technique for molecular biology and medicine. It assesses the molecular state of an organism or collections of organisms through the comprehensive quantitative and qualitative analysis of all small molecules in cells, tissues, and body fluids. Metabolic processes are at the core of physiology. Consequently, metabolomics is ideally suited as a medical tool to characterise disease states in organisms and as a tool for assessing the suitability of organisms for, for example, renewable energy production or biotechnological applications in general.

We are now seeing the emergence of metabolomics databases and repositories in various subareas of metabolomics and the emergence of large general e-infrastructures in the life sciences. In particular, the BioMedBridges project is set to link a variety of European Strategy Forum on Research Infrastructures (ESFRI) projects, such as ELIXIR and BBMRI. Metabolomics generates large and diverse sets of analytical data and therefore imposes significant challenges for the above-mentioned e-infrastructures. The COSMOS effort is designed to develop standards and policies to ensure that metabolomics data are:

  • Encoded in open standards to allow barrier-free and wide-spread analysis.
  • Tagged with a community-agreed, complete set of metadata (minimum information standard).
  • Supported by a communally developed set of open source data management and capturing tools.
  • Disseminated in open-access databases adhering to the above standards.
  • Supported by vendors and publishers, who require deposition upon publication.
  • Properly interfaced with data in other biomedical and life-science e-infrastructures (such as ELIXIR, BioMedBridges, EU-Openscreen).
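The idea of a minimum information standard in the list above can be sketched as a simple completeness check on a dataset's metadata. This is an illustration only: the field names and the `missing_metadata` helper are hypothetical and do not reproduce any checklist defined by COSMOS or a related initiative.

```python
# Hypothetical minimum set of metadata fields; a real community standard
# would define its own, much richer checklist.
REQUIRED_FIELDS = {
    "study_title",
    "organism",
    "sample_type",
    "instrument",
    "data_format",
}


def missing_metadata(record):
    """Return the sorted names of required fields that are absent or empty."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return sorted(missing)


# An example submission with one empty and one absent field.
submission = {
    "study_title": "Example study",
    "organism": "Arabidopsis thaliana",
    "sample_type": "",
    "data_format": "mzML",
}

problems = missing_metadata(submission)
if problems:
    print("Incomplete metadata:", ", ".join(problems))
else:
    print("Metadata complete")
```

A repository could run a check of this kind at deposition time, so that incomplete submissions are flagged before the data are published.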

COSMOS brought together leading European groups in metabolomics and interfaced with all interested players in the metabolomics community and beyond, worldwide.

Standards in Chemical Biology

Our group was a partner in the EU-OPENSCREEN effort, the European Infrastructure of Open Screening Platforms for Chemical Biology, which aims to integrate high-throughput screening platforms, chemical libraries, chemical resources for hit discovery and optimisation, bio- and cheminformatics support, and a database containing screening results, assay protocols, and chemical information. We led the Standardisation work package, tasked with defining a core set of representational and transfer data standards for open data sharing and reproducible analysis in European chemical biology. As part of this effort, we are collaborating closely with the PubChem team on chemical data standardisation and with the BioAssay Ontology team on the standardisation of biological assay descriptions. We have also contributed to the development of the Minimum Information About a Bioactive Entity (MIABE) project.

Standards for Chemical Markup — CML and CMLSpect

Chemical Markup Language (CML) is an XML language designed to facilitate the creation, interchange, and deposition of chemical information. CML covers many areas of mainstream chemistry including:

  • Molecules – structures and properties
  • Reactions, including properties and reaction schemes
  • Spectra, especially as found in chemical publications (CMLSpect)
  • Crystallography, especially the interplay of structure and chemistry
  • Computational chemistry

The Steinbeck group has been closely involved in the development of CML, especially CMLSpect. CMLSpect is heavily used in Bioclipse to handle spectral information.
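To give a feel for what a CML document looks like, here is a minimal sketch that parses a small, simplified CML-style molecule with Python's standard library. The fragment is hand-written for illustration and is an assumption about a typical minimal document, not an excerpt from the CML schema or from Bioclipse; it uses only a few common elements (`molecule`, `atomArray`, `atom`, `bondArray`, `bond`).

```python
import xml.etree.ElementTree as ET

# Namespace commonly used for CML documents.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Simplified, hand-written example: methanol with implicit hydrogens.
CML_DOC = """<molecule xmlns="http://www.xml-cml.org/schema" id="methanol">
  <atomArray>
    <atom id="a1" elementType="C" hydrogenCount="3"/>
    <atom id="a2" elementType="O" hydrogenCount="1"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
  </bondArray>
</molecule>"""

root = ET.fromstring(CML_DOC)

# Map each atom id to its element symbol.
atoms = {
    atom.get("id"): atom.get("elementType")
    for atom in root.findall("cml:atomArray/cml:atom", CML_NS)
}

# Each bond refers to the two atoms it connects by their ids.
bonds = [
    tuple(bond.get("atomRefs2").split())
    for bond in root.findall("cml:bondArray/cml:bond", CML_NS)
]

print("Molecule:", root.get("id"))
for first, second in bonds:
    print(f"Bond: {atoms[first]}-{atoms[second]}")
```

Because CML is plain XML, any standard XML toolchain can read it; dedicated cheminformatics toolkits add chemistry-aware validation on top of this.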

References

  1. Hastings, Janna and Owen, Gareth and Dekker, Adriano and Ennis, Marcus and Kale, Namrata and Muthukrishnan, Venkatesh and Turner, Steve and Swainston, Neil and Mendes, Pedro and Steinbeck, Christoph (2015): ChEBI in 2016: Improved services and an expanding collection of metabolites. In: Nucleic Acids Research, vol. 44, no. D1, pp. D1214–D1219, 2015.
  2. Hastings, Janna and Chepelev, Leonid and Willighagen, Egon and Adams, Nico and Steinbeck, Christoph and Dumontier, Michel (2011): The chemical information ontology: provenance and disambiguation for chemical data on the biological semantic web. In: PLoS ONE, vol. 6, no. 10, pp. e25513, 2011.
  3. Salek, Reza M and Neumann, Steffen and Schober, Daniel and Hummel, Jan and Billiau, Kenny and Kopka, Joachim and Correa, Elon and Reijmers, Theo and Rosato, Antonio and Tenori, Leonardo and Turano, Paola and Marin, Silvia and Deborde, Catherine and Jacob, Daniel and Rolin, Dominique and Dartigues, Benjamin and Conesa, Pablo and Haug, Kenneth and Rocca-Serra, Philippe and O'Hagan, Steve and Hao, Jie and van Vliet, Michael and Sysi-Aho, Marko and Ludwig, Christian and Bouwman, Jildau and Cascante, Marta and Ebbels, Timothy and Griffin, Julian L and Moing, Annick and Nikolski, Macha and Oresic, Matej and Sansone, Susanna-Assunta and Viant, Mark R and Goodacre, Royston and Günther, Ulrich L and Hankemeier, Thomas and Luchinat, Claudio and Walther, Dirk and Steinbeck, Christoph (2015): COordination of Standards in MetabOlomicS (COSMOS): facilitating integrated metabolomics data access. In: Metabolomics, vol. 11, no. 6, pp. 1–11, 2015.