Reproducible R Environment Using Guix

Mehrad Mahmoudian published on
7 min, 1338 words

Abstract

It is essential for a good analytical project to be 100% reproducible. Reproducibility issues in projects typically stem from three sources: 1) the data, 2) the software, and 3) random number generation. On the software side, problems can come either from the code that the analyst writes, or from the packages, libraries, and software they use in their analysis. In this article I propose a simple solution for the latter without needing to change your Linux distro.

Why should we even bother

I have been working with a friend to come up with an all-inclusive solution for a containerized, declarative, reproducible Guix-based R development environment, but that project still has some rough edges. So in this article I'm presenting a declarative Guix-based approach that can be used for any programming language and any IDE, but without containerizing the environment. This approach uses Guix for package management and direnv for preparing the environment. You can substitute Guix with Nix, but in this article I only discuss Guix, as I like Guile Scheme more than the Nix language. :)

The reason that we need to use Guix/Nix is that they:

  1. allow installation of multiple versions of the same software independently along with their independent dependencies
  2. avoid package and dependency conflicts
  3. allow the user to declare the environment exactly as they wish
  4. provide a very high level of project reproducibility due to the way they construct derivations and profiles, and store packages in their respective stores

One of the biggest advantages of Guix compared to Nix is that it lets you pin a particular point in history, and all the software you ask for will be installed as if you had gone back in time (this is also possible in Nix, but that's a discussion for another day). In other words, if you specify a commit hash from December 2nd 2024 and ask it to install R and Python, it will install the versions that were available at that point in time. This feature is available through guix time-machine. We will get into this in practice later in this article, but you can read more about it here and here.
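
To make the idea a bit more concrete, here is a minimal sketch of a one-off trip with guix time-machine. It assumes Guix is already installed and reuses the commit hash from the channels.scm shown later in this article; the first run can take a long time because Guix has to build or download that revision.

```shell
# Commit hash taken from the channels.scm shown later in this article.
commit="a2590694ae0350f9d7400f6f6f41fdbac2fa5340"

if command -v guix > /dev/null 2>&1; then
    # Ask the Guix revision at that commit for a shell with R and print R's version.
    guix time-machine --commit="$commit" -- shell r-minimal -- R --version
else
    echo "guix is not installed; nothing to run"
fi
```

Changing the commit hash should change the R version that gets reported, without touching anything installed by your distro.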

Installation of required software

For the sake of this article, all you need is to have Guix and direnv installed on your computer. Everything else will be handled by Guix itself.

Guix

Just to clarify, Guix is a package manager, and although the project also offers a full operating system (Guix System), you can use Guix on any distro. In this article I assume you are on a conventional distro like Ubuntu, Arch, Debian, etc. and you want to set up the environment without installing a new distro.

You can go to the official Guix website and use the "GNU Guix Binary" option, which is the one suitable for having Guix on other Linux distros. I would suggest following the official Guix binary installation instructions. It takes about 1 minute to give the installer the information it needs, and then about 10 minutes to get fully installed on your computer. Just make sure that your computer has 50GB of free disk space. At first Guix will not consume 50GB, but after a year of use it will grow, as you will install more and more software with it. The installation guideline at the time of writing this article (2025-11-06T12:32:47+02:00) is:

# Go to the /tmp folder (generally this will be wiped when you reboot)
cd /tmp
# Download the installation script
wget -O guix-install.sh https://guix.gnu.org/install.sh
# Make the installation script executable
chmod +x guix-install.sh
# Execute the installation script
./guix-install.sh
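
Once the installer finishes, a quick sanity check can save some confusion later. The following is only a sketch: it checks that guix is reachable from the current shell and reports the free disk space on the filesystem holding /gnu (falling back to / if that folder does not exist yet). The variable names are made up for this example.

```shell
# Report whether guix is reachable from the current shell.
if command -v guix > /dev/null 2>&1; then
    guix --version | head -n 1
else
    echo "guix not found in PATH; open a new terminal or re-check the installation"
fi

# Free space (in GiB) on the filesystem that holds /gnu, falling back to /.
store_dir=/gnu
[ -d "$store_dir" ] || store_dir=/
free_gib=$(( $(df -Pk "$store_dir" | awk 'NR == 2 { print $4 }') / 1024 / 1024 ))
echo "Free space on ${store_dir}: ${free_gib} GiB"
```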

direnv

For direnv you have to do two things:

  1. Install direnv. It is already available in the package repositories of many Linux distros. You can find more information about this on the official direnv website.
  2. Add the direnv hook to your shell

For adding the hook, there is a very nice explanation page on direnv website, but it can be simplified by adding the following to your .bashrc or .zshrc:

# if direnv is installed, run the hook
if hash direnv 2> /dev/null; then
    # get the shell name (e.g. bash or zsh) from the login shell path
    tmp_shell="$(basename "$SHELL")"
    # add the hook
    eval "$(direnv hook "${tmp_shell}")"
    unset tmp_shell
fi
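
To check that the hook was actually picked up, you can paste something like the following into a new interactive terminal. It is a sketch that relies on the hook defining a shell function called _direnv_hook, which is what direnv does for bash and zsh as far as I know; the hook_loaded variable is just a name I picked for this example.

```shell
# Is the direnv binary reachable?
if command -v direnv > /dev/null 2>&1; then
    direnv version
else
    echo "direnv not found in PATH"
fi

# The hook defines a shell function named _direnv_hook in bash and zsh.
if typeset -f _direnv_hook > /dev/null 2>&1; then
    hook_loaded=yes
else
    hook_loaded=no
fi
echo "direnv hook loaded: ${hook_loaded}"
```

If this prints "no" in a fresh terminal, the snippet in your .bashrc or .zshrc is most likely not being read.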

Workflow

Apart from Guix and direnv, you can install the IDE you want. In this article I'm going to discuss both Emacs and Positron, but I think RStudio, VS Code and other IDEs would work the same way.

Bare-minimum workflow

Let's create a toy project. In this project we will have an R file called test.R in which we use my own package: varhandle as an example. So let's create a folder to add the files:

# create the project folder
mkdir /path/to/the/project    #change this based on your preference

# go to the project folder
cd /path/to/the/project       #remember to change this accordingly too

# create the R file
touch test.R

Now open the test.R file with whatever text editor you want and add some code. For example:

# -*- mode: ess-r; fill-column: 80; -*-

#-- some code to demonstrate that the package is loaded ------------------------
{
    # just to show we can load a package
    library("varhandle")
    # get some data
    my_iris <- iris
    # randomly set cells in columns 1-3 to NA (260 rounds, 2 rows per round)
    for(i in 1:260){
        my_iris[sample(1:nrow(my_iris), 2), sample(c(1,2,3,1,3,3,3), 1)] <- NA
    }
    # now we can inspect the NAs
    inspect.na(my_iris)
}


# get the session info
sessionInfo()

# check the path of the R executable used in this session
Sys.which("R")

We also need at least two more files:

  1. manifest.scm which is the Guix package declaration file
    ;; -*- mode: scheme; -*-
    
    (specifications->manifest
     (list
      "r-minimal"
      "r-varhandle"
      ))
    
  2. .envrc
    # -*- mode: sh; -*-
    
    # this is required to create and load a guix profile in the current directory
    GUIX_PROFILE="$PWD/.envrc.guix-profile"
    
    # create guix profile
    eval $(guix package -p "$GUIX_PROFILE"  --manifest=manifest.scm)
    eval $(guix package -p "$GUIX_PROFILE" --search-paths)
    
    # this is required to have guix and common tools (less, man etc.) in PATH 
    PATH="/bin:/usr/bin:/gnu/remote/bin:$PATH"
    

Now that we have all the files, we can have the most minimal workflow:

  1. Open a new terminal (it is important to be new so that we load a fresh environment)
  2. Navigate to the project folder (remember to change the path according to what you made above)
    cd /path/to/the/project
    
  3. The moment you enter the folder, direnv should ask you to allow running and reading the .envrc. You should allow it (by running direnv allow), and it will then install the packages listed in manifest.scm and create a symlink called .envrc.guix-profile. This might take a while depending on how many packages you have declared there, but it is only slow the first time; every subsequent run will be quick since the software is already installed. The folder should now look like this:
    ❯ ls -alh
    total 24K
    drwxr-xr-x 2 mehrad mehrad 4.0K Nov 12 12:52 ./
    drwxr-xr-x 4 mehrad mehrad 4.0K Nov 10 14:50 ../
    -rw-r--r-- 1 mehrad mehrad  400 Nov 10 15:21 .envrc
    lrwxrwxrwx 1 mehrad mehrad   26 Nov 11 11:46 .envrc.guix-profile -> .envrc.guix-profile-1-link/
    lrwxrwxrwx 1 mehrad mehrad   51 Nov 11 11:46 .envrc.guix-profile-1-link -> /gnu/store/ra2vyx3c213gq64sgdzpmk3nqbvsq78q-profile/
    -rw-r--r-- 1 mehrad mehrad   94 Nov 11 12:34 manifest.scm
    -rw-r--r-- 1 mehrad mehrad  201 Nov 10 14:52 README.md
    -rw-r--r-- 1 mehrad mehrad  601 Nov 11 12:33 test.R
    
  4. When guix is done, run positron, emacs, or any IDE you like from that terminal. You can also open R directly in the terminal if you don't want to include an IDE in your workflow for now.
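
Before starting an IDE, it can be reassuring to look at what direnv and Guix have set up. The following sketch is meant to be run from the project folder and assumes the profile name used in the .envrc above:

```shell
profile="$PWD/.envrc.guix-profile"

if [ -e "$profile" ]; then
    # Where the profile symlink ends up inside the Guix store.
    readlink -f "$profile"
    # Packages that the manifest put into the profile.
    guix package -p "$profile" --list-installed
    # R should resolve to a path under the profile rather than /usr/bin.
    command -v R
else
    echo "No profile at ${profile}; run 'direnv allow' in the project folder first"
fi
```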

Now, for educational purposes, let's confirm a few things in the R console:

  1. if the varhandle package is installed:
    is.element("varhandle", row.names(installed.packages()))
    
  2. the R version, by running version (note that version is a variable, not a function, so it does not need ()). At the time of writing this, I get this output:
    > version
                   _
    platform       x86_64-unknown-linux-gnu
    arch           x86_64
    os             linux-gnu
    system         x86_64, linux-gnu
    status
    major          4
    minor          5.0
    year           2025
    month          04
    day            11
    svn rev        88135
    language       R
    version.string R version 4.5.0 (2025-04-11)
    nickname       How About a Twenty-Six
    
  3. We can also use the equivalent of the shell's which command in R and see where R is pointed to:
    Sys.which("R")
    
    The output should be in your project folder.
  4. You can also get the path to the R binary itself, which is a much more robust approach than step 3:
    file.path(R.home("bin"), "R")
    
    This should result in something like this (the hash can be different for you, but the rest should match):
    > file.path(R.home("bin"), "R")
    [1] "/gnu/store/8ahzimkjx5xhdgir21z5rbj811q972qq-r-minimal-4.5.0/lib/R/bin/R"
    

As the final step, let's also run the test.R:

source("test.R")
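
If you prefer to stay in the terminal, the same script can be run non-interactively. This is a small sketch that assumes you are inside the project folder with the direnv environment loaded; ran_script is only a helper variable for this example.

```shell
if command -v Rscript > /dev/null 2>&1 && [ -f test.R ]; then
    # Rscript prints top-level results such as sessionInfo(),
    # which a plain source() call does not print by default.
    Rscript test.R
    ran_script=yes
else
    echo "Rscript or test.R not found; enter the project folder first"
    ran_script=no
fi
```

Inside the R console, source("test.R", echo = TRUE) should give a similar effect, since source() does not auto-print top-level values unless asked to.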

Using Guix time-machine

Now that we have the minimum setup covered, let's improve it and use guix time-machine to make sure the software we have in the project is pinned to a specific point in history. The guix time-machine is in my opinion the most interesting and handiest tool Guix provides. Anyway, let's try something fun.

To use the time-machine and travel to a point in time, we need an additional file, channels.scm, which defines the repositories (channels) and their respective commit hashes. Let's use the following as the content of channels.scm:

(list (channel
        (name 'guix)
        (url "https://codeberg.org/guix/guix.git")
        (branch "master")
        (commit
          "a2590694ae0350f9d7400f6f6f41fdbac2fa5340")
        (introduction
          (make-channel-introduction
            "9edb3f66fd807b096b48283debdcddccfea34bad"
            (openpgp-fingerprint
              "BBB0 2DDF 2CEA F6A8 0D1D  E643 A2A0 6DF2 A33A 54FA")))))

Note that "a2590694ae0350f9d7400f6f6f41fdbac2fa5340" is the commit hash that pins the point in time (it dates back to Oct 12, 2025, 09:09 PM GMT+3), and "9edb3f66fd807b096b48283debdcddccfea34bad" is the channel introduction commit, which was signed with the GPG key A2A06DF2A33A54FA (the last 16 characters of the fingerprint above).
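
If you would rather pin the revision of Guix you are running right now instead of copying a hash by hand, guix describe can print the current channels in the same format. The sketch below writes to a scratch file (channels.current.scm is a name I made up) so that an existing channels.scm is not overwritten:

```shell
# Hypothetical scratch file name, chosen to avoid overwriting channels.scm.
out="channels.current.scm"

if command -v guix > /dev/null 2>&1; then
    # Print the channels of the guix in use, pinned to their current commits.
    guix describe --format=channels > "$out"
    cat "$out"
else
    echo "guix is not installed; nothing to describe"
fi
```

After reviewing the output, you can copy it over channels.scm to pin the project to that state.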

We also need to update the .envrc file to instruct Guix to use the channel file:

# -*- mode: sh; -*-

## this is required to create and load a guix profile in the current directory
export GUIX_PROFILE="$PWD/.guix-profile"
eval $(guix time-machine \
            --channels='channels.scm' \
            -- package \
               --substitute-urls='https://ci.guix.gnu.org' \
               --manifest='manifest.scm' \
               --profile="${GUIX_PROFILE}")
eval $(guix package -p "${GUIX_PROFILE}" --search-paths)


## this is required to have guix and common tools (less, man etc.) in PATH 
export PATH="/bin:/usr/bin:/gnu/remote/bin:$PATH" 
export R_LIBS_USER="${GUIX_PROFILE}/site-library/"

After direnv reloads the environment, the project folder should look like this:

lrwxrwxrwx 1 mehrad mehrad   20 Nov 12 12:40 .guix-profile -> .guix-profile-1-link
lrwxrwxrwx 1 mehrad mehrad   51 Nov 12 12:40 .guix-profile-1-link -> /gnu/store/5nl8mg32ab57qpa9mjqvs6h6ncxdjriz-profile
-rw-r--r-- 1 mehrad mehrad  943 Nov 12 12:29 channels.scm
-rw-r--r-- 1 mehrad mehrad  617 Nov 12 12:33 .envrc
-rw-r--r-- 1 mehrad mehrad   94 Nov 11 12:34 manifest.scm
-rw-r--r-- 1 mehrad mehrad  181 Nov 12 12:31 README.md
-rw-r--r-- 1 mehrad mehrad  601 Nov 11 12:33 test.R
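
To double-check that the pinning works, you can ask the time machine what it resolves to, and run the script once in a throwaway environment that does not touch the profile. This is a sketch that assumes channels.scm, manifest.scm, and test.R are in the current folder:

```shell
channels_file="channels.scm"
manifest_file="manifest.scm"

if command -v guix > /dev/null 2>&1 && [ -f "$channels_file" ] && [ -f "$manifest_file" ]; then
    # Show which channel commits the time machine resolves to.
    guix time-machine --channels="$channels_file" -- describe
    # Run the script once in a temporary environment built from the manifest.
    guix time-machine --channels="$channels_file" -- \
        shell --manifest="$manifest_file" -- Rscript test.R
else
    echo "guix, ${channels_file}, or ${manifest_file} is missing; nothing to run"
fi
```

The commit reported by the first command should match the one in channels.scm.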

Final notes

This approach can be extended in a very simple way. For example, not all R packages are packaged in the official Guix channel, so it makes sense to add other channels that provide the software you need. There is a good search tool to find out which channel has packaged the software you want: https://toys.whereis.social
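
Before adding a channel, it may be worth checking from the terminal whether the package is already available in the channels you currently use. A minimal sketch, where the pattern is only an example and should be replaced with the package you are after:

```shell
# Example pattern; replace it with the package you need.
pattern='^r-varhandle$'

if command -v guix > /dev/null 2>&1; then
    # List matching packages (name, version, outputs, location) in the current channels.
    guix package --list-available="$pattern"
else
    echo "guix is not installed; nothing to list"
fi
```

An empty result suggests that the package has to come from an additional channel.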

For example, the channels.scm can be extended to be:

;; -*- mode: scheme; -*-

(list
 ;; Guix main channel
 (channel (name 'guix)
  (url "https://codeberg.org/guix/guix.git")
  (branch "master")
  (commit
   "a2590694ae0350f9d7400f6f6f41fdbac2fa5340")
  (introduction
   (make-channel-introduction
    "9edb3f66fd807b096b48283debdcddccfea34bad"
    (openpgp-fingerprint
     "BBB0 2DDF 2CEA F6A8 0D1D  E643 A2A0 6DF2 A33A 54FA"))))
 
 ;; The CRAN (official R package repository)
 (channel (name 'guix-cran)
  (url "https://github.com/guix-science/guix-cran.git")
  (branch "master")
  (commit
   "8e31deefce41c4f2d4c83ac6271dda2dc553c957"))
 
 ;; The Guix Science channel
 (channel (name 'guix-science)
  (url "https://codeberg.org/guix-science/guix-science.git")
  (branch "master")
  (commit
   "d78e1d5763e44705ee901f8c7c47b3aedd565ed6")
  (introduction
   (make-channel-introduction
    "b1fe5aaff3ab48e798a4cce02f0212bc91f423dc"
    (openpgp-fingerprint
     "CA4F 8CF4 37D7 478F DA05  5FD4 4213 7701 1A37 8446")))))