This lesson is still being designed and assembled (Pre-Alpha version)

Version control your analysis with git

Overview

Teaching: 45 min
Exercises: 15 min
Questions
  • What is version control?

  • WHow can I be sure of the script and data I used to generate my manuscript figure?

  • What are popular version control systems?

Objectives
  • Understand the importance of version control for tracability of results.

  • Be able to display the differences between two (small) datasets or two workflows.

  • Be able to upload a dataset to one of the open-access repositories.

1. Table of contents

1. Introduction

1.1 What is version control?

1.2 How is it valuable for my research and Research Data Management?

1.3 What is the git software?

2. Version control your next publication

2.1 Exercise 1: keeping track of a figure.

Make a folder. Respect the previous folder structure and naming convention that you’ve choosen before.

Add a dataset (link)

Open RStudio and create a new script. Add a comment (using the # sign followed by your comment.)

Open the Shell and navigate (cd) into the folder that you have created previously. Type git init

Type git status

Create a plot:

Type git status

(untracked files)

git add git commit

git tree

You are not so happy with this figure. You could perhaps add some color.

git add git commit

git tree

2.2 Tracking a processed dataset

Your figure is produced from a relatively small files (X megabytes). You have noticed that some numbers need to be corrected because this measurement was mixed up.

3. Moving online

3.1 GitHub

Provides hosting for software development and version control using Git. Free for Open Source project.

3.2 GitLab

3.3. BitBucket

Key Points

  • Version control allows you to retrieve the complete history of your processed data and analytical workflows.