Chapter 1 Prerequisites

1.1 Hardware and Operating System

The pipeline was developed and tested on Ubuntu 20.04.3 LTS running the GNU/Linux 5.4.0-88-generic x86_64 kernel. The output of the uname and neofetch commands is provided below to detail our configuration.

$ uname -a
Linux machine329a7396-059f-41aa-94c7-4c41b4ec8290 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

$ neofetch
            .-/+oossssoo+/-.               ubuntu@machine3f9ae3dc-6a4d-46b7-8131-04f00a1be146 
        `:+ssssssssssssssssss+:`           -------------------------------------------------- 
      -+ssssssssssssssssssyyssss+-         OS: Ubuntu 20.04.3 LTS x86_64 
    .ossssssssssssssssssdMMMNysssso.       Host: OpenStack Compute 18.2.1-1.el7 
   /ssssssssssshdmmNNmmyNMMMMhssssss/      Kernel: 5.4.0-88-generic 
  +ssssssssshmydMMMMMMMNddddyssssssss+     Uptime: 13 hours, 14 mins 
 /sssssssshNMMMyhhyyyyhmNMMMNhssssssss/    Packages: 719 (dpkg), 4 (snap) 
.ssssssssdMMMNhsssssssssshNMMMdssssssss.   Shell: bash 5.0.17 
+sssshhhyNMMNyssssssssssssyNMMMysssssss+   Theme: Adwaita [GTK3] 
ossyNMMMNyMMhsssssssssssssshmmmhssssssso   Icons: Adwaita [GTK3] 
ossyNMMMNyMMhsssssssssssssshmmmhssssssso   Terminal: /dev/pts/0 
+sssshhhyNMMNyssssssssssssyNMMMysssssss+   CPU: Intel (Haswell, no TSX, IBRS) (16) @ 2.294GHz 
.ssssssssdMMMNhsssssssssshNMMMdssssssss.   GPU: 00:02.0 Cirrus Logic GD 5446 
 /sssssssshNMMMyhhyyyyhdNMMMNhssssssss/    Memory: 635MiB / 64323MiB 
  +sssssssssdmydMMMMMMMMddddyssssssss+
   /ssssssssssshdmNNNNmyNMMMMhssssss/                              
    .ossssssssssssssssssdMMMNysssso.                               
      -+sssssssssssssssssyyyssss+-
        `:+ssssssssssssssssss+:`
            .-/+oossssoo+/-.

This configuration was a virtual machine hosted on Biosphere’s RAINBio, a cloud service maintained by the French Institute of Bioinformatics (Institut Français de Bioinformatique).
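
For readers who want to compare their own machine with the one described above and do not have neofetch installed, the following is a minimal sketch that prints the kernel, architecture, and distribution name. The function name describe_host is hypothetical; the script assumes a Linux system that provides /etc/os-release and falls back to a placeholder otherwise.

```shell
#!/usr/bin/env bash
# Sketch: print the basic facts needed to compare a machine with the one above.
# Uses only uname and /etc/os-release; neofetch is not required.
# describe_host is a hypothetical helper name, not part of the pipeline.

describe_host() {
    echo "Kernel: $(uname -sr)"
    echo "Architecture: $(uname -m)"
    if [ -r /etc/os-release ]; then
        # PRETTY_NAME is a standard os-release field, e.g. "Ubuntu 20.04.3 LTS"
        echo "OS: $(. /etc/os-release && echo "$PRETTY_NAME")"
    else
        echo "OS: unknown (no /etc/os-release)"
    fi
}

describe_host
```

The values can then be checked by eye against the Kernel and OS lines of the neofetch output.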

1.2 BioPipes: a Biosphere-commons app

The instance of the virtual machine we used is called BioPipes. It provides the most notable bioinformatics pipeline tools, which are listed in Section 1.3.

1.3 Main Tools

The versions of the main tools are specified to maximise reproducibility:

$ conda --version
conda 4.11.0
$ nextflow -v
nextflow version 21.10.0.5640
$ docker --version
Docker version 20.10.11, build dea9396
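
To keep such a record alongside the results of a run, the three version commands above can be collected into a file. This is a minimal sketch, not part of the pipeline itself: the function name record_versions and the file name tool_versions.txt are hypothetical, and a tool that is not installed is reported as such instead of aborting the script.

```shell
#!/usr/bin/env bash
# Sketch: write the versions of the main tools to a plain-text file.
# record_versions and tool_versions.txt are hypothetical names.

record_versions() {
    local out="$1"
    local entry tool flag
    : > "$out"                          # start from an empty file
    for entry in "conda:--version" "nextflow:-v" "docker:--version"; do
        tool="${entry%%:*}"             # text before the colon
        flag="${entry##*:}"             # text after the colon
        if command -v "$tool" >/dev/null 2>&1; then
            "$tool" "$flag" >> "$out" 2>&1
        else
            echo "$tool: not installed" >> "$out"
        fi
    done
}

record_versions tool_versions.txt
cat tool_versions.txt
```

Comparing this file between two machines with diff gives a quick indication of whether the tool versions match.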

Detailed information about our development Docker installation:

$ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.6.3-docker)
  scan: Docker Scan (Docker Inc., v0.9.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 4
 Server Version: 20.10.11
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-88-generic
 Operating System: Ubuntu 20.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 62.82GiB
 Name: machine3f9ae3dc-6a4d-46b7-8131-04f00a1be146
 ID: XT4Y:2HUL:HXEA:CDXV:ERC7:Z7JZ:YYRU:WZBT:ERCU:6GGA:OBZ6:QLXE
 Docker Root Dir: /mnt/docker-data
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false