Benchmarking nf-core rnaseq on AWS with MemVerge's nf-float Nextflow plugin

Ashley Tung, Sateesh Peri 2023-10-13

By: Ashley Tung, Software Engineer, MemVerge

      Sateesh Peri, Developer Advocate, MemVerge

Running Nextflow on AWS Batch

In 2017, Nextflow added initial support for AWS Batch, which opened the door to a batch scheduler with (theoretically) infinite capacity. Since then, AWS Batch has become the workhorse for cloud-native Nextflow pipelines and users of Seqera Platform (formerly known as Nextflow Tower). It also presented a new trade-off to consider: cost. Whereas traditional HPC systems provide fixed allocations of CPU-hours or dedicate certain nodes to certain projects, AWS Batch lets you pay as you go.

The primary aspect of cost is the choice of on-demand vs spot instances. On-demand instances provide a guaranteed level of service at a fixed price. Spot instances, on the other hand, can be preempted or “reclaimed” at any time, but can be as much as 90% cheaper than on-demand instances depending on current demand. The details of this pricing model have evolved over the years, but the core trade-off remains the same. Do you pay a premium for guaranteed instances, and relax knowing that your jobs will complete? Or do you play the game, try to deal with the spot reclamations, and hope you still come out ahead?

Fortunately, Nextflow already has error handling for this built in. By default, tasks that fail due to spot reclamation are retried up to 5 times (see the `aws.batch.maxSpotAttempts` config option) before they are reported to Nextflow as task failures. Even then, you can add further error handling, such as redirecting the task to an on-demand queue:

process.queue = { task.attempt == 1 ? 'my-spot-queue' : 'my-ondemand-queue' }
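
The queue switch only takes effect if failed tasks are actually resubmitted. Here is a minimal sketch of the surrounding configuration, assuming the same two queue names and an illustrative retry limit of 2 (the limit is an assumption for this example, not a recommendation from a tested setup):

```groovy
// nextflow.config (sketch)
aws {
    batch {
        // Batch-level retries for jobs interrupted by a spot reclamation;
        // 5 is the default mentioned above, written out here for clarity.
        maxSpotAttempts = 5
    }
}

process {
    // Resubmit a failed task instead of terminating the run.
    errorStrategy = 'retry'
    // Illustrative limit: at most two resubmissions per task.
    maxRetries    = 2
    // First attempt runs on spot, later attempts fall back to on-demand.
    queue = { task.attempt == 1 ? 'my-spot-queue' : 'my-ondemand-queue' }
}
```

With this in place, `task.attempt` only increases once AWS Batch has exhausted its own spot retries and reported the job to Nextflow as failed.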

This strategy works well for many users, but still has some shortcomings. For one thing, the reclaimed task must be retried from the beginning, which is a waste of time and money, not to mention carbon emissions, which many organizations are now trying to minimize in order to be more sustainable. But there is a broader issue, which is that tasks can also fail due to insufficient resources. A typical strategy is to have dynamic resources based on the task attempt, similar to the dynamic queue:

process.cpus = { 2 * task.attempt }
process.memory = { 8.GB * task.attempt }

However, by solving one problem we have created another: if a task on a spot instance runs out of memory, it will be retried with double the memory, but also on an on-demand instance… the combined upgrade could make the task up to 20x more expensive! The fundamental issue is, how do we tell whether a task failed due to spot reclamation or insufficient resources? At the time of writing, we can’t, at least not with Nextflow alone.
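
To make the interaction concrete, here is a sketch that combines the two dynamic settings in one config. The retry settings are assumptions added for illustration; the comments trace what happens on the second attempt, taking the "as much as 90% cheaper" figure quoted earlier as the worst case.

```groovy
process {
    errorStrategy = 'retry'   // assumed: resubmit on any failure
    maxRetries    = 1         // assumed: one resubmission

    queue  = { task.attempt == 1 ? 'my-spot-queue' : 'my-ondemand-queue' }
    cpus   = { 2 * task.attempt }
    memory = { 8.GB * task.attempt }
}

// Attempt 1: spot queue,      2 CPUs,  8 GB
// Attempt 2: on-demand queue, 4 CPUs, 16 GB
//
// Worst case: spot is 90% cheaper, so on-demand costs 1 / (1 - 0.9) = 10x
// as much for the same resources. Doubling the resources multiplies that by 2:
//   10 x 2 = 20x the hourly cost of the first attempt.
```

Because every directive keys off the same `task.attempt` counter, Nextflow applies both upgrades together, whatever the cause of the failure.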

Introducing Memory Machine Cloud by MemVerge

With that problem in mind, we introduce a service that seems purpose-built to solve it. Memory Machine Cloud (MMCloud) is a container-native compute platform that streamlines the way applications are executed in the cloud. You can think of it as a drop-in replacement for AWS Batch, but with several key enhancements, which we’ll get into shortly. MMCloud can currently be deployed on AWS, Google Cloud, Alibaba Cloud and Baidu Cloud.

Let’s dig into the main features of MMCloud:

Float is MMCloud’s job management service. You can use the Float CLI to interact with MMCloud like any other HPC scheduler.

WaveWatcher is MMCloud’s resource and cost observability service. It provides an accurate and real-time accounting of the resource usage (CPU, memory, storage I/O, network), cost, and even the carbon footprint of each job.

SpotSurfer is MMCloud’s checkpointing and recovery service. When a spot instance is about to be reclaimed, MMCloud saves a snapshot of the instance to disk and restores the snapshot to a new spot instance with the same or similar size based on vCPU and memory, allowing the job to continue as if nothing happened.

WaveRider is MMCloud’s continuous rightsizing service. When a job is over- or under-utilizing its resources, MMCloud takes a snapshot (using the same mechanism as SpotSurfer) and restores it to a new instance with adjusted resources. For example, when a job’s memory usage is at least 90% for 1 minute, MMCloud can pause the job and restore it on a new instance with double the memory. This feature is highly configurable by the user.

As you can see, MMCloud has the potential to solve many of our problems at once. It can handle spot reclamations and insufficient resource failures at the same time, and it can do so much more efficiently. Whereas Nextflow can restart a job after it fails, MMCloud can save and restore a job before it fails!

But how does it work? The key component that makes MMCloud possible is the two-minute warning that AWS gives a spot instance when it is about to be reclaimed. This warning gives MMCloud just enough time to save the instance state before it is destroyed. Of course, large-memory instances might take too long to save, and other cloud providers might not be as lenient. Google Cloud, for example, only gives a 30-second warning. MMCloud can handle these cases by checkpointing the instance periodically, which is not as efficient but still better than a complete restart.

Integrating with Nextflow via the nf-float plugin

Running your Nextflow pipelines with MMCloud is about as easy as any other cloud executor. All you need is the `float` CLI and a bit of Nextflow configuration.

Example: Running Nextflow pipelines on AWS EC2 with MMCloud

Here is an example Nextflow config file that uses S3 as the work directory:

plugins {
    id 'nf-float'
}

workDir = 's3://my-bucket/work'

float {
    address = '...'
    username = '...'
    password = '...'
}

process {
    executor = 'float'
}

aws {
    accessKey = '...'
    secretKey = '...'
}

Make sure to provide the address of your MMCloud endpoint, as well as your username and password. You can also provide them as environment variables or secrets so that they aren’t exposed in your config. You can provide your AWS credentials in the usual ways. See the nf-float documentation for details.
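
As one way to keep credentials out of the file, a Nextflow config can read values from the environment with plain Groovy. The sketch below assumes you have exported three variables before launching Nextflow; the variable names are placeholders chosen for this example, so check the nf-float documentation for any names the plugin recognizes on its own.

```groovy
float {
    // Placeholder variable names; export them in the shell before launching Nextflow.
    address  = System.getenv('MMC_ADDRESS')
    username = System.getenv('MMC_USERNAME')
    password = System.getenv('MMC_PASSWORD')
}
```

The config file can then be committed or shared without exposing the password.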

You can then run your pipeline as usual. Any Nextflow pipeline you've run on AWS Batch, for example, will probably run on MMCloud.

In addition to S3 buckets, MMCloud supports JuiceFS, a high-performance, reliable, and secure distributed file system optimized for managing large volumes of data, whose technical approach differs from that of other FUSE-based systems such as S3FS. MMCloud offers a template for launching Nextflow head nodes with JuiceFS mounted, which can be submitted via the CLI as follows:

float submit --template nextflow:jfs \
--securityGroup <vpc-sg> \
-e BUCKET= \
--env BUCKET_ACCESS_KEY={secret:your_secret_name} \
--env BUCKET_SECRET_KEY={secret:your_secret_key}

You can also use MMCloud with Fusion:

wave.enabled = true
fusion.enabled = true
fusion.exportAwsAccessKeys = true

And you can use it with Wave and Conda:

wave.enabled = true
wave.strategy = 'conda'
process.conda = '...'

Fusion cannot be used yet with the VM checkpoint/restore feature of MMCloud, but the MemVerge team is working hard to make that happen and we hope to support it soon. Until then, you can still use Fusion with MMCloud via our Nextflow plugin and benefit from MMCloud’s other features such as monitoring and cost reporting.

Finally, you can also monitor your runs with Seqera Platform (formerly Nextflow Tower) by passing `-with-tower` as usual, along with your Tower access token. Stay tuned for more updates regarding the Seqera Platform integration!
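
If you prefer configuration over the command-line flag, the same monitoring can be switched on in the config file. This is a minimal sketch using Nextflow's `tower` scope; it assumes the token has been exported as `TOWER_ACCESS_TOKEN`, which Nextflow also picks up by itself when no token is set in the config.

```groovy
tower {
    enabled     = true
    // Reading the token from the environment keeps it out of the file.
    accessToken = System.getenv('TOWER_ACCESS_TOKEN')
}
```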

nf-core rnaseq cost comparison: AWS Batch vs MMCloud

To demonstrate the cost savings of MMCloud over a conventional AWS Batch setup, we took the nf-core/rnaseq pipeline with the `test_full` profile and ran it with a variety of different setups:

  1. AWS Batch vs MMCloud
  2. AWS CLI vs JuiceFS vs Fusion
  3. On-Demand EC2 vs EC2 Spot

Note: when not using Fusion, Nextflow uses the AWS CLI with AWS Batch jobs, while MMCloud uses JuiceFS to mount S3 buckets into the task containers. MMCloud can also be used with S3FS.

We then compared the resource usage and cost of these runs. The results are shown below:

Based on these results, MMCloud emerges as a more cost-effective solution in both EC2 On-Demand and EC2 Spot pricing models, demonstrating lower average costs and cost per CPU-hour metrics across the assessed workloads. Additionally, MMCloud exhibits a tendency towards greater time efficiency, particularly in the EC2 On-Demand pricing model, where it outperforms AWS Batch in average walltime. 

Seqera’s Fusion file system is also particularly impressive: it outperforms plain S3 access across the board, corroborating Seqera’s earlier Fusion benchmark, even though Fusion + EC2 Spot runs cannot yet use MMCloud’s VM recycling. We look forward to the combined benefits of Fusion and MMCloud’s VM recycling.

Pricing and Final Thoughts

MemVerge offers a free trial and a pay-as-you-go license that costs $0.02 per core-hour at list price. MemVerge (the company behind MMCloud) estimates that running workloads on Spot compute managed by MMCloud saves $2,500 per 50,000 core-hours, and that the MMCloud license fee ranges from 15% to 25% of those savings. Special pricing is available for non-profit and higher-education organizations. Alternatively, the "Enterprise" tier delivers custom SLAs and unlimited usage for a fixed annual rate.

This article is merely a preview of the value that MMCloud and its nf-float Nextflow plugin can bring to Nextflow pipelines. The MemVerge team plans to perform a larger benchmark over the course of several weeks, in order to account for variability in spot pricing and provide a better picture of the potential savings and performance optimization. For more information about the MMCloud Nextflow solution click here.


Special thanks to Ben Sherman at Seqera Labs for helping us with the Nextflow plugin and the nf-core rnaseq benchmark discussions.