CaosDB—Research Data Management for complex, changing, and automated research workflows

Abstract

We present CaosDB, a Research Data Management System (RDMS) designed to ensure seamless integration of inhomogeneous data sources and repositories of legacy data in a FAIR way. Its primary purpose is the management of data from biomedical sciences, both from simulations and experiments, during the complete research data lifecycle. An RDMS for this domain faces particular challenges: research data arise in huge amounts, from a wide variety of sources, and traverse a highly branched path of further processing. To be accepted by its users, an RDMS must be built around the workflows and practices of the scientists and thus support changes in workflow and data structure. Nevertheless, it should encourage and support the development and observation of standards and furthermore facilitate the automation of data acquisition and processing with specialized software. The storage data model of an RDMS must reflect these complexities with appropriate semantics and ontologies while offering simple methods for finding, retrieving, and understanding relevant data. We show how CaosDB responds to these challenges and give an overview of its data model, the CaosDB Server, and its easy-to-learn CaosDB Query Language. We briefly discuss the status of the implementation, how we currently use CaosDB, and how we plan to use and extend it.

Publication
Data 4(2): 83