Governments, funders and publishers expect greater Findability, Accessibility, Interoperability and Reusability of the (meta)data that supports research findings, according to the widely accepted FAIR Principles (doi:10.1038/sdata.2016.18), which we helped author. The use of community-developed standards to identify and describe the (meta)data, and its deposition in trusted repositories, underpin FAIRness and reproducible research.
Since 2007, our group has helped many communities to tackle these requirements via our open-source ISA Tools (isa-tools.org; isacommons.org), enabling standards-compliant description, deposition and publication of a variety of experiment types. Since 2011, our group has run FAIRsharing (fairsharing.org), guiding researchers, journals, publishers and other communities to discover, select and use repositories and community-developed standards with confidence.
The user will be guided to provide (semi)structured descriptions of the experimental design and of the post-processed data, which will be used to generate, respectively, the Methods section and a set of statements to populate the Results section of a manuscript. Datascriptor will work (i) as a stand-alone tool, for anyone to use, implementing generic metadata models such as the W3C Data Catalog Vocabulary (DCAT); and (ii) as a component of the ISA Tools and the InterMine data warehouse, for their user communities, implementing the ISA metadata model.
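As an illustration of what a generic, DCAT-based description might look like, the following minimal sketch builds a JSON-LD record for a dataset and one distribution. The function name `dataset_description`, the example values and the example.org URL are hypothetical and are not part of Datascriptor; only the DCAT and Dublin Core terms (`dcat:Dataset`, `dcat:Distribution`, `dcat:keyword`, `dcat:downloadURL`, `dcat:mediaType`, `dct:title`, `dct:description`) come from the published vocabularies.

```python
import json


def dataset_description(title, description, keywords, download_url, media_type_uri):
    """Return a JSON-LD dictionary describing a dataset with DCAT terms.

    Hypothetical helper for illustration only; it is not Datascriptor's API.
    """
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:keyword": list(keywords),
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            "dcat:downloadURL": {"@id": download_url},
            "dcat:mediaType": {"@id": media_type_uri},
        },
    }


# Example values are invented for this sketch.
record = dataset_description(
    title="Post-processed measurements",
    description="Summary table of measurements per study group.",
    keywords=["example", "post-processed data"],
    download_url="https://example.org/data/measurements.csv",
    media_type_uri="https://www.iana.org/assignments/media-types/text/csv",
)

print(json.dumps(record, indent=2))
```

A record of this kind is one possible form for the (semi)structured input; the ISA metadata model would replace it when the tool runs as a component of the ISA Tools or InterMine.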
To output short sentences from the (semi)structured input, we will evaluate a mixed data-to-text approach that combines template-based and neural (i.e. machine-learning) methods. To further enrich the content of the manuscript, Datascriptor will connect to existing authoring systems, including Substance, Texture, Stenci.la and Manuscripts, and export the result in JATS format. We also plan to support export as a DAR file and in LaTeX format.
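The template-based half of such a data-to-text approach can be sketched as follows. This is a minimal illustration using only the Python standard library; the field names, the template wording and the example design are assumptions made for this sketch, not Datascriptor's actual templates or data model, and the neural component is not shown.

```python
from string import Template

# Hypothetical structured description of an experimental design.
design = {
    "n_subjects": 24,
    "organism": "Mus musculus",
    "n_groups": 3,
    "factor": "diet",
    "levels": ["control", "high-fat", "high-sugar"],
}

# Hypothetical sentence template for a Methods section.
METHODS_TEMPLATE = Template(
    "A total of $n_subjects $organism subjects were assigned to "
    "$n_groups groups according to $factor ($levels)."
)


def join_levels(levels):
    """Join factor levels into a readable English list."""
    if len(levels) <= 1:
        return "".join(levels)
    return ", ".join(levels[:-1]) + " and " + levels[-1]


def render_methods(design):
    """Fill the template with values from the structured description."""
    fields = dict(design, levels=join_levels(design["levels"]))
    return METHODS_TEMPLATE.substitute(fields)


sentence = render_methods(design)
print(sentence)
# A total of 24 Mus musculus subjects were assigned to 3 groups according to diet (control, high-fat and high-sugar).
```

Templates of this kind give predictable, verifiable wording, which is why they are a natural baseline against which to evaluate a neural generator.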