
Persistent Nextflow JuiceFS (AWS)

Sateesh Peri


Introduction to JuiceFS

JuiceFS is an open-source, high-performance distributed file system designed specifically for cloud environments. It offers unique features, such as:

  • Separation of Data and Metadata: JuiceFS stores files in chunks within object storage like Amazon S3, while metadata can be stored in various databases, including Redis.
  • Performance: Achieves millisecond-level latency and nearly unlimited throughput, depending on the object storage scale.
  • Easy Integration with MMCloud: MMCloud provides pre-configured Nextflow head node templates with JuiceFS already set up, simplifying deployment.
  • Comparison with S3FS: For a detailed comparison between JuiceFS and S3FS, see JuiceFS vs. S3FS. JuiceFS typically offers better performance and scalability.
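
To make the data/metadata split concrete, here is a minimal sketch of creating and mounting a JuiceFS volume by hand with the JuiceFS CLI, assuming a reachable Redis instance for metadata (on port 6868, as used in this setup) and an existing S3 bucket; all names are placeholders. The MMCloud template described below automates these steps for you:

# Format a volume: file data is chunked into S3, metadata lives in Redis
juicefs format \
--storage s3 \
--bucket https://<bucket-name>.s3.<bucket-region>.amazonaws.com \
--access-key <BUCKET_ACCESS_KEY> \
--secret-key <BUCKET_SECRET_KEY> \
redis://<redis-host>:6868/1 \
jfs-volume

# Mount the volume in the background; reads/writes go to S3, lookups to Redis
juicefs mount -d redis://<redis-host>:6868/1 /mnt/jfs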

Pre-requisites for Using JuiceFS with Nextflow

Before you begin, ensure you meet the following pre-requisites:

  • A VPC security group with an inbound rule for port 6868.
    • Navigation: AWS EC2 console -> Network & Security -> Security Groups
    • Inbound rules should include: Custom TCP on port 6868, the port used by the Redis metadata server in this JuiceFlow setup.
  • A new S3 bucket to serve as JuiceFS's backend storage.
    • Create the bucket and note its region; both will be needed later.
    • Navigation: AWS S3 console -> Create Bucket (or use the AWS CLI, as sketched below)
    • (Screenshot: S3 bucket creation)
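
If you prefer the command line, here is a minimal sketch using the AWS CLI (bucket name and region are placeholders; regions other than us-east-1 require the LocationConstraint shown):

aws s3api create-bucket \
--bucket <bucket-name> \
--region us-west-2 \
--create-bucket-configuration LocationConstraint=us-west-2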

Deployment Steps for Individual Users on MMCloud

Float Login

Ensure you are using the latest version of the float CLI:

sudo float release sync

Log in to your MMCloud OpCenter:

float login -a <opcenter-ip-address> -u <user>

After entering your password, verify that you see Login succeeded!

Float Secret

Set your AWS credentials as secrets in float:

float secret set AWS_BUCKET_ACCESS_KEY <BUCKET_ACCESS_KEY>
float secret set AWS_BUCKET_SECRET_KEY <BUCKET_SECRET_KEY>

To verify the secrets:

float secret ls

Expected output:

+-----------------------+
|           NAME        |
+-----------------------+
| AWS_BUCKET_ACCESS_KEY |
| AWS_BUCKET_SECRET_KEY |
+-----------------------+

Deploy Nextflow Head Node

Deploy the Nextflow head node using the nextflow:jfs template:

float submit -n <head-node-name> \
--template nextflow:jfs \
--securityGroup <security-group-id> \
--env BUCKET=https://<bucket-name>.s3.<bucket-region>.amazonaws.com \
--env BUCKET_ACCESS_KEY={secret:AWS_BUCKET_ACCESS_KEY} \
--env BUCKET_SECRET_KEY={secret:AWS_BUCKET_SECRET_KEY}

Note: Replace <head-node-name>, <security-group-id>, <bucket-name>, and <bucket-region> with your specific details. The nextflow:jfs template comes pre-configured with the JuiceFS setup.
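
For example, with hypothetical values filled in (the head node name, security group ID, bucket name, and region below are placeholders, not real resources):

float submit -n jfs-head-1 \
--template nextflow:jfs \
--securityGroup sg-0123456789abcdef0 \
--env BUCKET=https://my-jfs-bucket.s3.us-east-1.amazonaws.com \
--env BUCKET_ACCESS_KEY={secret:AWS_BUCKET_ACCESS_KEY} \
--env BUCKET_SECRET_KEY={secret:AWS_BUCKET_SECRET_KEY}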


Overriding Template Defaults

Customizing CPU and Memory

To override the template's default CPU and memory settings, append the following to the float submit command:

--overwriteTemplate "*" -c <number-of-cpus> -m <memory-in-gb>

Example: to request 8 CPUs and 32 GB of memory:

--overwriteTemplate "*" -c 8 -m 32

Specifying a Subnet

To deploy in a specific AWS subnet, append:

--overwriteTemplate "*" --subnet <SUBNET-ID>

Mounting S3 Buckets as Data Volumes

You can mount an input data bucket with S3FS as a data volume on the Nextflow head node and worker nodes by appending the following to the float submit command:

--dataVolume [mode=r,accesskey=xxx,secret=xxx,endpoint=s3.REGION.amazonaws.com]s3://BUCKET_NAME:/staged-files

Incremental Snapshot Feature (From v2.4)

Incremental snapshots speed up checkpointing at the cost of additional storage. Enable them by appending:

--overwriteTemplate "*" --dumpMode incremental

Advantages:

  • Supports larger workloads
  • Lower impact on job running time
  • No need for periodic snapshot interval configuration

Disadvantages:

  • Requires larger storage for delta saves
  • Final snapshot is necessary for restore

Checking Head Node Deployment Status

float list

Example Output:

+-----------------------+----------------+------------------------------------+---------+-----------+----------------------+------------+-------------+
|          ID           |      NAME      |            WORKING HOST            |  USER   |  STATUS   |     SUBMIT TIME      |  DURATION  |    COST     |
+-----------------------+----------------+------------------------------------+---------+-----------+----------------------+------------+-------------+
| NlygkM1dA0qIucncPwjgD | jfs-head-1     | 54.211.194.151 (2Core4GB/OnDemand) | sateesh | Executing | 2023-11-02T17:40:06Z | 6m54s      | 0.0049 USD  |
+-----------------------+----------------+------------------------------------+---------+-----------+----------------------+------------+-------------+
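
To inspect a single job in more detail, you can pass its ID to float show (assuming the show subcommand available in recent float CLI versions):

float show -j <job-id>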

SSH into Head Node

  • Locate the public IP address of the head node in the Working Host column.
  • Retrieve the SSH key from Float's secret manager:
float secret get <job-id>_SSHKEY > <head-node-name>-ssh.key

Note: If you encounter a Resource not found error, wait a few more minutes for the head node and SSH key to initialize.

  • Set the appropriate permissions for the SSH key:
chmod 600 <head-node-name>-ssh.key

SSH to Nextflow JFS Head Node

SSH into the Nextflow head node using the provided SSH key, username, and the head node's public IP address:

ssh -i <head-node-name>-ssh.key <user>@<head-node-public-ip-address>

Note: Use the username nextflow to log in as admin.

MMC nf-float configuration

Editing the configuration file

  1. Copy the template and edit the configuration file:
cp mmcloud.config.template mmc-jfs.config
vi mmc-jfs.config

Note: If you're new to using vi, check out this Beginner's Guide to Vi for basic instructions.

  2. The mmc-jfs.config file copied from the template is pre-filled with the OpCenter IP address and the PRIVATE IP address of the Nextflow head node. You only need to provide your OpCenter username and password and your AWS access and secret keys in the config:
plugins {
  id 'nf-float'
}

workDir = '/mnt/jfs/nextflow'

process {
    executor = 'float'
    errorStrategy = 'retry'
    extra ='  --dataVolume [opts=" --cache-dir /mnt/jfs_cache "]jfs://172.31.41.33:6868/1:/mnt/jfs --dataVolume [size=120]:/mnt/jfs_cache'
    /*
    For special tasks like Qualimap that generate many very small IO requests, adding -o writeback_cache can help with performance. For example:
    withName: "QUALIMAP_RNASEQ" {
      extra ='  --dataVolume [opts=" --cache-dir /mnt/jfs_cache  -o writeback_cache"]jfs://172.31.41.33:6868/1:/mnt/jfs --dataVolume [size=120]:/mnt/jfs_cache'
    }
    */
}

podman.registry = 'quay.io'

float {
    address = '172.31.42.11:443'
    username = '<your_user_name>'
    password = '<your_password>'
}

// AWS access info if needed
aws {
  client {
    maxConnections = 20
    connectionTimeout = 300000
  }
  accessKey = '<bucket_access_key>'
  secretKey = '<bucket_secret_key>'
}
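
If you would rather not keep plaintext keys in the config file, Nextflow can also resolve AWS credentials through its default credential chain, for example from the standard AWS environment variables exported before launching the pipeline:

export AWS_ACCESS_KEY_ID=<bucket_access_key>
export AWS_SECRET_ACCESS_KEY=<bucket_secret_key>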

Using Tmux

Start a tmux session named nextflow:

tmux new -s nextflow

To attach to an existing tmux session:

tmux attach -t nextflow
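
To detach from the session while leaving Nextflow running inside it, press Ctrl-b followed by d, or run:

tmux detach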

Tip: If you're new to tmux, here's a handy Tmux Cheat Sheet.

Nextflow Version Check

Check the Nextflow version and update if necessary:

nextflow -v

Example Output:

nextflow version 23.10.0.5889
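
To update Nextflow to the latest stable release, you can run:

nextflow self-update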

Launch Nextflow

Launch a Nextflow pipeline (for example, an nf-core pipeline) by providing the MMC config file:

nextflow run nf-core/<pipeline> \
-profile test_full \
-c mmc-jfs.config \
--outdir s3://nextflow-work-dir/<pipeline>
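
For example, a hypothetical launch of nf-core/rnaseq against its full-size test profile (the output bucket path is a placeholder):

nextflow run nf-core/rnaseq \
-profile test_full \
-c mmc-jfs.config \
--outdir s3://<bucket-name>/rnaseq-results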

Head Node Management

  • In this persistent head node setup, the user is responsible for disposing of the head node. To cancel the head node job, click the Cancel button for the job in the OpCenter UI, or cancel it from the command line as sketched below.
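
A minimal sketch of cancelling from the command line, assuming the float CLI's cancel subcommand and the job ID shown by float list:

float cancel -j <job-id>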

FAQ

Q: Why does nf-core/rnaseq fail at the QUALIMAP step?

A: The QUALIMAP process in nf-core/rnaseq often fails due to its high-frequency, small-size write requests, leading to timeouts. Enabling -o writeback_cache consolidates these requests and improves performance significantly. However, it turns sequential writes into random writes, affecting sequential write performance. Use this setting only in scenarios with intensive random writes.

Add the following in the process {} scope of your config:

withName: "QUALIMAP_RNASEQ" {
    extra ='  --dataVolume [opts=" --cache-dir /mnt/jfs_cache  -o writeback_cache"]jfs://<head-node-private-ip>:6868/1:/mnt/jfs --dataVolume [size=120]:/mnt/jfs_cache'
}

Additional Reading

Data Volumes

For jobs that generate file system I/O, specifying data volumes is essential. The OpCenter supports a variety of data volume types. Learn more about configuring data volumes in MMCloud in the MMCloud Data Volumes Guide.

Allow / Deny Instance Types

Configure which instance types are allowed or denied in your setup. See the detailed docs on Controlling Virtual Machine Types.