Workshop Introduction

This website provides supporting materials for part 2 of the workshop “Projects’ workflow for reproducibility and replicability using R”, held at the Second Rostock Open Science Workshop organized by the Max Planck Institute for Demographic Research in Rostock, Germany. This tutorial is licensed under CC0. Part 1 of the workshop is available in this GitHub repository.

You can get the slides in PDF and PPTX (with some animations).

This tutorial includes the following sections:

  1. “R packages version management with renv”, which covers {renv} for managing R package versions;
  2. “Building analysis pipelines with targets”, which covers the {targets} R package for structuring analysis code in a clean and modular way;
  3. “Containerizing R and R Packages for Ultimate Reproducibility”, which covers Docker (or Docker-compatible tools) for building containers that bundle all operating system files, the R installation, any additional system dependencies and the R packages into a single file that can be used to run the analysis even if some software becomes outdated or unavailable for installation in the future.

To participate in the workshop, we recommend that you have a GitHub account and a Docker Hub account (the latter is optional and only needed for the last section on Docker).

For section 3 on Docker, you may also want to install Docker on your computer (available for Windows, macOS or Linux). For macOS, we also highly recommend the free version of OrbStack as a complete and lightweight Docker Desktop replacement.

Expected learning outcomes

  • Able to explain why it is important to track the versions of software that are used for the analysis

  • Able to use {renv} to set up a project directory with specific R package versions

  • Able to explain the advantages of structuring the analysis code in a modular way

  • Able to use {targets} to set up a modular analysis pipeline

  • Able to explain the concept of containers and their role in computational reproducibility and discuss further reproducibility challenges

  • Able to create a Docker container with RStudio and a specific version of R, and install into it the R packages previously recorded with {renv}

  • (optional) Able to create a GitHub repository that can be executed in the cloud using Binder