Data Management Plan
Good research data management applies to the entire data lifecycle and should include a Data Management Plan (DMP), which sets out which data the researcher will work with and how they will collect, process, organise, analyse, store, share and reuse them. The DMP is a "live" document that needs to be updated continuously to reflect what has actually happened with the data. Since 2022, Czech law has established the DMP as a mandatory document that complements the interim and annual reports of publicly funded scientific projects (amendment to Act No. 130/2002 Sb., Section 12a Access to Research Data). The preparation and subsequent implementation of a DMP are therefore now also required by funders and are becoming a common part of research projects.
Content of the DMP
A good DMP assumes that researchers have thought through how they will proceed with their research. The DMP contains information about the primary and secondary data that will be used in the research, describes how the data will be obtained, how they will be protected, how and where they will be stored, and under what conditions they will be made available for further use.
The law does not prescribe what a DMP should look like. Its form therefore depends on the requirements of the funder, on the specific project and on the field of science, as research data and their management vary significantly from one discipline to another (e.g., sensitive data in medicine or sociology require specific protection, while data in physics or biology are demanding to store and back up due to their large volume).
In general, the DMP should include:
1. Administrative data
You should give background information that puts the DMP in context – basic information about the research (e.g., project title, name, contact details and ID of the principal investigator, funder, and project partners), a short description of the research to which the data relate (abstract), and the regulations, measures or guidelines governing data retention.
2. Data description
You should define what data will be collected or generated and how – the type and estimated volume of data, use of existing data, what formats the data will be stored in.
3. Documentation and metadata
You should provide information that is needed for reading and interpreting data in the future – what metadata will be used to describe the data, what standards will be followed for formatting and documenting the data, persistent identifiers used (e.g., DOI), how data quality control will be ensured (methods for verifying data accuracy, completeness and consistency).
4. Ethical and legal issues
You should consider ethical and legal issues – whether consent is required for storing and sharing data, how sensitive and personal data will be protected, how data will be protected from unauthorised access, loss or misuse (anonymisation/pseudonymisation of data, etc.).
5. Data storage and backup
You should consider where the data will be stored and how they will be backed up, including access and security – sufficient space for data storage (data repository), cost of the storage space, volume of data, determination of responsibility for data backup and recovery, potential risk to data security including solutions, secure access to data by co-investigators.
6. Archiving and long-term storage
You should determine which data are suitable for long-term preservation and the best way to store them – existing contractual/legal conditions for data storage, selection of data for long-term storage, time and financial costs of preparing data for long-term storage and sharing.
7. Data sharing and availability
You should think about what data you will share, how you will share them (e.g., open repositories), who you will share them with and under what conditions (licence), when the data will be available, and how potential users will know about the data.
8. Roles and responsibilities
You should identify roles and responsibilities for all data activities within the research – who will be responsible for data management and implementation of the DMP, how responsibilities will be divided among the project members, etc.
9. Funding and costs
You should think about the costs associated with data storage, backup, sharing and archiving (equipment, expertise, use of software and hardware, additional financial and human resources) and how these costs will be financed (e.g., grant funding, institutional support, etc.).
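Because a DMP is a "live" document, it can help to track which of the nine sections above still lack an answer. The following is only a minimal sketch of such a checklist: the section titles come from the list above, while the class name, field names and example answers are hypothetical and do not correspond to any funder's template or tool.

```python
from dataclasses import dataclass, field

# The nine section titles are taken from the list above; everything else
# (class name, field names, example answers) is hypothetical.
DMP_SECTIONS = [
    "Administrative data",
    "Data description",
    "Documentation and metadata",
    "Ethical and legal issues",
    "Data storage and backup",
    "Archiving and long-term storage",
    "Data sharing and availability",
    "Roles and responsibilities",
    "Funding and costs",
]


@dataclass
class DataManagementPlan:
    project_title: str
    # Maps a section title to the free-text answer written so far.
    answers: dict[str, str] = field(default_factory=dict)

    def missing_sections(self) -> list[str]:
        """Return the sections that have no answer yet, in list order."""
        return [s for s in DMP_SECTIONS if not self.answers.get(s, "").strip()]


plan = DataManagementPlan(
    project_title="Example project",
    answers={
        "Administrative data": "PI, funder and partners are listed here.",
        "Data description": "Survey responses stored as CSV.",
    },
)

# Print the sections that still need to be filled in.
for section in plan.missing_sections():
    print("To do:", section)
```

Re-running such a check whenever the plan is updated gives a quick view of what remains open before an interim or annual report is due.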
Examples of published DMPs for inspiration
Tools for creating DMPs
The DMP can be in the form of a shared document, or online tools can be used to help create the DMP and guide the researcher to specific answers through relevant questions, from which they then generate the required document. These tools include specialised software such as DMPonline or Data Stewardship Wizard. They also have the advantage of supporting not only the creation of the DMP document and its ongoing updating, but also the research data management process itself and collaboration between researchers. The form of the plan in these online tools can be adapted to the requirements of individual projects or to institutional and disciplinary specificities.
Shared document
- Google Docs, Office 365 Word, Overleaf, etc.
Specialised software
- ARGOS (OpenAIRE) – https://argos.openaire.eu/splash/
- DMPTool (University of California) – https://dmptool.org/
- DMPonline (Digital Curation Centre) – https://dmponline.dcc.ac.uk/
- Data Stewardship Wizard (ELIXIR) – https://ds-wizard.org/
- FAIR Wizard – https://fair-wizard.com/
DMPonline
DMPonline enables the creation of DMPs using templates prepared for the requirements of specific funders, and it supports collaboration with colleagues and public sharing of the resulting plan. A user who is not preparing a DMP for a specific funder can choose a generic Digital Curation Centre template. In addition to the pre-set template, the tool includes a help section that assists the user in completing each part of the plan. The finished plan can then be downloaded in various formats (e.g., csv, html, docx, pdf, or json).
Data Stewardship Wizard
Data Stewardship Wizard (DSW, DS-Wizard) is a freely available web-based tool that guides the author intuitively through the creation of a DMP using guiding questions. It presents the different areas of research data management and, because its knowledge models are structured as tree questionnaires, it displays only the questions that are relevant given the previous answers. The tool also has the advantage of integrating the DMP creation process with the FAIR principles, thus allowing the data to be processed in accordance with Open Science. DS-Wizard also provides help links and supports collaboration and sharing of team projects. The final DMP can be exported in several available templates (Machine-Actionable DMP, Horizon 2020, Horizon Europe, and Science Europe) and formats (e.g., pdf, docx, html, LaTeX, and json).
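The idea of a tree questionnaire, in which follow-up questions appear only after a particular answer, can be illustrated with a short sketch. This is not how Data Stewardship Wizard is implemented and it does not use its knowledge model format; the `Question` class, the `relevant_questions` function and the example questions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    # Maps an answer to the follow-up questions that this answer unlocks.
    follow_ups: dict[str, list["Question"]] = field(default_factory=dict)


def relevant_questions(question: Question, answers: dict[str, str]) -> list[str]:
    """Walk the tree depth-first, descending only along answered branches."""
    shown = [question.text]
    given = answers.get(question.text)
    for child in question.follow_ups.get(given, []):
        shown.extend(relevant_questions(child, answers))
    return shown


# Hypothetical mini questionnaire on personal data.
root = Question(
    "Will you collect personal data?",
    {
        "yes": [
            Question(
                "Will the data be anonymised?",
                {"no": [Question("How will access be restricted?")]},
            ),
            Question("Is informed consent obtained?"),
        ]
    },
)

answers = {
    "Will you collect personal data?": "yes",
    "Will the data be anonymised?": "no",
}

for text in relevant_questions(root, answers):
    print(text)
```

With no answers recorded, only the root question is returned; each answer given opens the branch beneath it, which is why the questionnaire stays short for projects that do not need the deeper questions.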
Instructions for working with DSW and creating a DMP