This lesson is still being designed and assembled (Pre-Alpha version)

Open-access repositories

Overview

Teaching: 45 min
Exercises: 0 min
Questions
  • What is an open-access repository?

  • What is the difference between a generic and a domain-specific data repository?

  • What are examples of open-access repositories?

Objectives
  • Understand where generic open-access repositories stand in the research data life cycle.

  • Be able to upload a dataset to one of the open-access repositories.

1. Introduction

In this section, we will see how to preserve the research data generated during the course of your work and how to support access to it. This episode focuses on steps 5 and 6 of the Research Data Life Cycle.

Open Access often refers to free access to research publications. In this episode, Open Access refers strictly to research outputs that are not publications themselves: the additional resources, such as datasets, collected to support the research conclusions and related publications.

1.1 Definition of an open-access repository

An open-access repository is a digital hosting service for the safe storage, annotation and retrieval of research outputs. Open Access here means that datasets are made freely accessible either immediately or after an embargo period, that is, a fixed delay after which the data become freely available.
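The embargo rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `is_freely_accessible` and the parameter `embargo_until` are hypothetical and do not come from any repository's API.

```python
from datetime import date
from typing import Optional


def is_freely_accessible(on: date, embargo_until: Optional[date] = None) -> bool:
    """Return True if a dataset can be freely accessed on the day `on`.

    `embargo_until` is assumed to be the first day of free access;
    None means the dataset was published without an embargo.
    """
    return embargo_until is None or on >= embargo_until


# No embargo: accessible straight away.
print(is_freely_accessible(date(2024, 1, 15)))
# Embargo ends on 1 June 2024: not yet accessible in January.
print(is_freely_accessible(date(2024, 1, 15), date(2024, 6, 1)))
# On the day the embargo ends, the dataset becomes accessible.
print(is_freely_accessible(date(2024, 6, 1), date(2024, 6, 1)))
```

Real repositories may define the end of an embargo slightly differently (for example, exclusive versus inclusive dates), so check the documentation of the repository you use.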

Question

Do you already know examples of open-access repositories?

Answer

Later in this lesson, you will see that Zenodo, FigShare and Dryad are examples of open-access repositories.

1.2 Open and public data

Open access does not necessarily imply that all datasets are made public directly after upload to the data repository. Rather, these datasets should be made accessible and should state the conditions under which others can use them (i.e. a license).

Taken from the EU FOSTER portal
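To make the distinction between "accessible" and "public" concrete, here is a minimal sketch of a dataset record that states its access level and conditions of use. The class `DatasetRecord`, the access levels and the example title are all hypothetical, chosen for illustration; they are not the metadata schema of any particular repository.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical access levels, chosen for illustration only.
ACCESS_LEVELS = ("open", "embargoed", "restricted")


@dataclass
class DatasetRecord:
    title: str
    access: str                       # one of ACCESS_LEVELS
    license: Optional[str] = None     # e.g. "CC-BY-4.0"
    conditions: Optional[str] = None  # who may request access, and how


def missing_information(record: DatasetRecord) -> List[str]:
    """List what a record still needs before others know how they may use it."""
    problems = []
    if record.access not in ACCESS_LEVELS:
        problems.append(f"unknown access level: {record.access}")
    if record.license is None:
        problems.append("no license stated")
    if record.access == "restricted" and not record.conditions:
        problems.append("restricted data needs access conditions")
    return problems


# A made-up record: accessible on request, but not public.
clinical = DatasetRecord(title="Patient cohort measurements", access="restricted")
print(missing_information(clinical))
```

The point of the sketch is that a restricted dataset is still described: a reader can find it, see that it exists, and learn under which conditions access may be granted.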

Question

What would be a real-life example of a dataset that should be made accessible (open) but not public (kept private)?
Can you name a few examples?

Answer

Clinical studies that collect human patient data are one example: such research data should be accessible (to other clinical researchers, for instance) but should not be made public. DNA sequencing is also treated

2. Choosing a license

2.1 Why license your dataset?

2.2 Available licenses and their differences

3. Real-life example with Zenodo

In this section, you will mimic a real-life example. You have just obtained statistics on

  1. Download a test dataset. Description of the dataset
  2. If not already done, create an account on Zenodo.

Sandbox: Zenodo has a “sandbox” website made for testing data uploads: https://sandbox.zenodo.org
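Uploads can also be scripted. The sketch below only builds, without sending, the HTTP request that would create a new, empty deposition on the Zenodo sandbox through its REST API. It assumes the sandbox is served at https://sandbox.zenodo.org, that you have created a personal access token there, and that the title, description and creator shown are placeholders. The helper name `build_new_deposition_request` is hypothetical.

```python
import json
import urllib.request

# Assumption: base URL of the Zenodo sandbox REST API.
SANDBOX_API = "https://sandbox.zenodo.org/api"


def build_new_deposition_request(token: str, title: str,
                                 description: str, creator: str) -> urllib.request.Request:
    """Prepare (but do not send) a request that creates a draft deposition."""
    payload = {
        "metadata": {
            "upload_type": "dataset",
            "title": title,
            "description": description,
            "creators": [{"name": creator}],
        }
    }
    return urllib.request.Request(
        url=f"{SANDBOX_API}/deposit/depositions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder values; replace them with your own token and metadata.
request = build_new_deposition_request(
    token="YOUR-SANDBOX-TOKEN",
    title="Test dataset",
    description="Upload test for the lesson.",
    creator="Doe, Jane",
)
print(request.get_method(), request.full_url)
```

To actually create the draft, you would pass `request` to `urllib.request.urlopen` with a valid token. Nothing is published at this stage: files still have to be added and the deposition explicitly published, which is why the sandbox is the right place to practise.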

3.1 Zenodo

3.2 FigShare

3.3 Dryad

4. Real-life exercise

5. Resources

5.1 Open-access repositories

5.2 Open data

Key Points

  • Next-Generation Sequencing techniques are massively parallel cDNA sequencing.