What Is a Data Repository and What Are Its Benefits?

A data repository is an information system designed for the secure, long-term storage, preservation, and publication of datasets while protecting their integrity and authenticity.

A trusted repository should provide at least the following functions:

  • Providing open access to datasets.
  • Assigning a persistent identifier to uploaded content.
  • Attaching standardized, machine-readable metadata to each uploaded dataset. The depositor supplies this metadata by filling out a form during upload.
  • Allowing the depositor to assign the required license to a dataset.
  • Ensuring the long-term preservation of authentic content.
  • Not serving as a substitute for ordinary remote storage: once content is published, it cannot be hidden again except in justified cases.
  • Versioning corrected datasets: when dataset content is corrected, a new version is created and the previous version remains available.
  • Allowing registration with an academic identity.

Which repository should you choose?

CTU does not have an institutional data repository. We therefore recommend choosing a repository as follows:

  1. If a trusted discipline-specific repository exists for your field, use it. Such repositories offer greater flexibility in accommodating discipline-specific metadata, potentially higher visibility of your research within the academic community, and the ability to meet specific disciplinary requirements. Examples include HEPData for high-energy physics and DANTEc for materials science, which is being developed as part of the National Repository Platform (NRP).
  2. Otherwise, use a multidisciplinary repository, which is a very common choice. Multidisciplinary repositories are user-friendly, but they may not meet certain discipline-specific requirements.

The following are two examples of multidisciplinary repositories:

Zenodo – the most widely used repository, operated by CERN. It is a “catch-all” repository: in addition to research data, users can also upload articles, software, or presentations.

It offers 50 GB of storage per dataset, assigns a DOI as a persistent identifier, and provides licensing options, including Creative Commons licenses. Metadata follows the DataCite schema, but Zenodo also supports other schemas (e.g., Dublin Core) if needed.
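To illustrate what standardized, machine-readable metadata means in practice, the following minimal Python sketch checks that a record contains the six properties the DataCite schema treats as mandatory (identifier, creator, title, publisher, publication year, resource type). The dictionary keys, the `missing_fields` helper, and the example values (including the DOI) are hypothetical and simplified; they do not reproduce the exact DataCite serialization or any repository's upload form.

```python
# Simplified, illustrative key names for the mandatory DataCite properties.
# These are assumptions for this sketch, not the official JSON/XML field names.
REQUIRED_FIELDS = (
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
)


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical example record; every value is made up for illustration.
example_record = {
    "identifier": "10.1234/example-doi",
    "creators": ["Example, Author"],
    "titles": ["Example measurement dataset"],
    "publisher": "Example Repository",
    "publicationYear": 2026,
    "resourceType": "Dataset",
    "rights": "CC-BY-4.0",  # optional here: the license chosen by the depositor
}

if __name__ == "__main__":
    print(missing_fields(example_record))
```

In a real deposit, the repository's upload form performs this kind of check for you; the sketch only shows why a fixed set of required fields makes records comparable across repositories.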

National Repository – a multidisciplinary repository established by the CESNET association as part of the National Data Infrastructure. It offers 500 GB of storage per dataset, assigns a DOI as a persistent identifier, and provides licensing options, including Creative Commons licenses. Metadata follows the new Czech CCMM (Czech Core Metadata Model) schema.
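As a rough aid for comparing the two per-dataset limits quoted above, the following Python sketch lists which of these repositories could hold a dataset of a given size. It assumes decimal gigabytes (1 GB = 10^9 bytes) and uses a hypothetical helper, `repositories_that_fit`; the actual limits and how they are measured should be confirmed with each repository before depositing.

```python
GB = 10**9  # assumption: decimal gigabytes; a repository may count differently

# Per-dataset limits as stated in the text above.
PER_DATASET_LIMIT_BYTES = {
    "Zenodo": 50 * GB,
    "National Repository": 500 * GB,
}


def repositories_that_fit(dataset_size_bytes: int) -> list[str]:
    """Return the repositories whose per-dataset limit covers the given size."""
    return [
        name
        for name, limit in PER_DATASET_LIMIT_BYTES.items()
        if dataset_size_bytes <= limit
    ]


if __name__ == "__main__":
    # A 120 GB dataset exceeds the 50 GB limit but fits within 500 GB.
    print(repositories_that_fit(120 * GB))
```

Size is only one criterion; discipline-specific requirements, metadata schema, and licensing options should be weighed as well.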

Repositories can also be searched using the Re3Data and OpenDOAR repository databases; however, not every repository listed there is trustworthy.

Repositories that hold one of the following certifications can be considered trustworthy: CoreTrustSeal, Nestor Seal, or ISO 16363. However, a repository without certification can still be trustworthy. For example, Zenodo has no certification, yet it is trustworthy. The Methodological Center of the CTU Central Library is also happy to help you select a repository.



Last change: 27.05.2026