## Known Issues-1: AWS Rebalance Recommendation Signal

1. **Frequent Rebalance Recommendation Signals**:

**Issue**: AWS may send a [Rebalance Recommendation signal](https://docs.memverge.com/MMCloud/latest/User%20Guide/Exploring%20MMCloud%20Features/SpotSurfer/?h=rebala#aws-rebalance-recommendation-signal) if it determines that a Spot Instance has an increased likelihood of being reclaimed. This signal can be intercepted by the OpCenter, which uses a rules-based approach to decide whether to maintain the current Spot Instance or proactively capture a snapshot and transition to a new instance.
 
**Impact**: Frequent rebalance signals may lead to numerous interruptions, which can cause jobs to fail with checkpoint errors due to multiple spot reclaims.
 
**Example**: In the job example screenshot below, the OpCenter encountered numerous rebalance signals, resulting in 17 interruptions over a short period. These interruptions reset the buffer cache, causing processing delays and timeouts.

![](https://baywatch-api.memverge.com/media/upload/202408/240809162426315505.png)


2. **Default Rebalance Threshold Settings**:

**Issue**: The rebalance signal threshold on a new OpCenter is set to 64GB by default. This setting might trigger unnecessary rebalance actions that don't lead to actual spot reclaims.

**Impact**: This can cause jobs that require substantial buffer cache to fail or slow down significantly due to frequent cache resets and timeouts.
 
**Resolution**: Increasing the rebalance threshold to 125GB can help by ensuring that jobs are only checkpointed on actual spot reclaim signals rather than on every rebalance signal.
 
> **Note**: Changes to the rebalance threshold will only apply to newly launched jobs and not to those already in progress.

# Setup MMC OpCenter on AWS


## Setup & Download EC2 Key-Pair for OpCenter Access

* Log in to `AWS`, navigate to the `EC2` console, and choose the appropriate region

![](https://baywatch-api.memverge.com/media/upload/202406/240626233908180703.png)

* Navigate to `Network & Security` from the left navigation bar and click on `Key Pairs`.

![](https://baywatch-api.memverge.com/media/upload/202406/240626234021897622.png)

* Create a key-pair if you don't have one and save this key. **NOTE: You will need this key only to SSH into the opcenter instance.** 

![](https://baywatch-api.memverge.com/media/upload/202408/240807081513238257.png)
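
If you prefer the command line, the same key pair can be created with the AWS CLI. This is a minimal sketch, assuming the AWS CLI is installed and configured for the region chosen above. The function name `create_key_pair` and the `<key-name>.pem` file naming are illustrative and not part of MMCloud.

```bash
# Hypothetical helper: creates an EC2 key pair and saves the private key locally.
# Usage: create_key_pair <key-name> [output-dir]
create_key_pair() {
  local key_name="$1" out_dir="${2:-.}"
  local key_file="$out_dir/$key_name.pem"

  # create-key-pair returns the private key only once, so write it straight to disk.
  aws ec2 create-key-pair \
    --key-name "$key_name" \
    --query 'KeyMaterial' \
    --output text > "$key_file" || return 1

  # SSH refuses private keys that other users can read.
  chmod 400 "$key_file"
  printf '%s\n' "$key_file"
}
```

Calling `create_key_pair my-opcenter-key ~/.ssh` prints the path of the saved key, which you can later pass to `ssh -i`.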

---

## AWS Marketplace - Memory Machine Cloud Subscription

* Next, navigate to `AWS Marketplace` and search for `memory machine cloud` and click on `Continue to Subscribe`

![](https://baywatch-api.memverge.com/media/upload/202408/240807071354851639.png)

![](https://baywatch-api.memverge.com/media/upload/202408/240807071450108428.png)

* Next click on `Continue to Configuration`

![](https://baywatch-api.memverge.com/media/upload/202408/240807071628691257.png)

* Make sure the region is correct and leave the rest to defaults and click on `Continue to Launch`

![](https://baywatch-api.memverge.com/media/upload/202408/240807071801687942.png)

* Next choose `Launch CloudFormation` from the drop down box and click on the `Launch` button

![](https://baywatch-api.memverge.com/media/upload/202408/240807071930819017.png)

* You will be re-directed to a new window to create a stack. Use the defaults on this page for templates and click `Next`

![](https://baywatch-api.memverge.com/media/upload/202408/240807072107784401.png)

* Next, specify a stack name

![](https://baywatch-api.memverge.com/media/upload/202408/240807072307600844.png)

* Next, select an instance type for MMCE OpCenter under `OpCenterType`

![](https://baywatch-api.memverge.com/media/upload/202408/240807072442790376.png)

| Type | CPU | Memory(GB) | Price/hour | Guidance |
| - | - | - | - | - |
| POC | 2 | 2 | $0.0188 | All historic jobs < 13K| 
| Small | 2 | 4 | $0.0765 | All historic jobs < 26K |
| Medium | 4 | 8 | $0.1530 | All historic jobs < 80K |
| Large | 4 | 16 | $0.2016 | All historic jobs < 185K |
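
The guidance column can be read as a simple threshold lookup. The sketch below encodes the table, assuming `K` means thousands of historic jobs. The function name `recommend_opcenter_type` is hypothetical, and the table gives no guidance at or above 185K jobs, so the function reports that case as an error rather than guessing.

```bash
# Hypothetical helper: maps a historic job count to an OpCenterType from the table above.
recommend_opcenter_type() {
  local jobs="$1"

  # Reject anything that is not a plain non-negative integer.
  case "$jobs" in
    ''|*[!0-9]*)
      echo "expected a non-negative integer job count" >&2
      return 2
      ;;
  esac

  if [ "$jobs" -lt 13000 ]; then
    echo "POC"
  elif [ "$jobs" -lt 26000 ]; then
    echo "Small"
  elif [ "$jobs" -lt 80000 ]; then
    echo "Medium"
  elif [ "$jobs" -lt 185000 ]; then
    echo "Large"
  else
    # The table stops here; a customized instance type may be needed.
    echo "no table entry for $jobs jobs" >&2
    return 1
  fi
}
```

For example, `recommend_opcenter_type 20000` selects `Small`.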

* If you prefer a different instance type for the MMCE OpCenter, specify it under `CustomizedInstanceType`. Note that this field is **OPTIONAL**.

![](https://baywatch-api.memverge.com/media/upload/202408/240807072958088594.png)

* Next specify disk type for MMCE OpCenter. Available types are `gp2` & `gp3`. Choose default `gp3` for better performance.

![](https://baywatch-api.memverge.com/media/upload/202408/240807073248757580.png)


* Next, select an EC2 Key Pair for all instances

![](https://baywatch-api.memverge.com/media/upload/202408/240807073435594906.png)

* Next, select a web-accessible VPC to run MMCE service

![](https://baywatch-api.memverge.com/media/upload/202408/240807073559749009.png)

* Next, select a subnet in the VPC to run OpCenter instance

![](https://baywatch-api.memverge.com/media/upload/202408/240807073712437070.png)

* Next, select the Availability Zone in which the subnet resides

![](https://baywatch-api.memverge.com/media/upload/202408/240807073830333880.png)

* Next, select `True` if you want to assign public IP to MMCE service

![](https://baywatch-api.memverge.com/media/upload/202408/240807073938118039.png)

* Next, define the IP address range to access MMCE service under `ExternalAccessCidr` & `SshCidr`.

> `0.0.0.0/0` to allow all access

![](https://baywatch-api.memverge.com/media/upload/202408/240807074118592777.png)
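
Both fields expect an IPv4 range in CIDR notation, for example `203.0.113.7/32` to allow a single address. The sketch below is one way to sanity-check a value before pasting it into the form. The function name `is_ipv4_cidr` is hypothetical, and the check is purely syntactic.

```bash
# Hypothetical helper: succeeds when the argument is a well-formed IPv4 CIDR block.
is_ipv4_cidr() {
  local cidr="$1" ip prefix octet old_ifs

  # Must contain an address part and a prefix part.
  case "$cidr" in
    */*) ;;
    *) return 1 ;;
  esac
  ip="${cidr%/*}"
  prefix="${cidr#*/}"

  # Prefix length: digits only, between 0 and 32.
  case "$prefix" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$prefix" -le 32 ] || return 1

  # Address: digits and dots only, no empty octets.
  case "$ip" in
    *[!0-9.]*|.*|*.|*..*) return 1 ;;
  esac

  # Split on dots and check that there are four octets, each at most 255.
  old_ifs="$IFS"
  IFS=.
  set -- $ip
  IFS="$old_ifs"
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "$octet" -le 255 ] || return 1
  done
}
```

`is_ipv4_cidr 0.0.0.0/0` succeeds, while a value such as `10.0.0.0/33` is rejected.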

* Next, select `True` if you want to enable worker nodes to access other nodes internally. (Default is `False`)

![](https://baywatch-api.memverge.com/media/upload/202408/240807074355912386.png)

* Next, on the `Configure stack options` page, keep all defaults and click `Next`

![](https://baywatch-api.memverge.com/media/upload/202408/240807074626143844.png)

* Next on the `Review and create` page, review all the parameters and finally select the check-box to allow CloudFormation to create resources and click on `Submit`

![](https://baywatch-api.memverge.com/media/upload/202408/240807074750728081.png)

* Next, you will see the stack creation in progress

![](https://baywatch-api.memverge.com/media/upload/202408/240807075129129926.png)

* Successful creation of stack will look as shown below with status `CREATE_COMPLETE`

![](https://baywatch-api.memverge.com/media/upload/202408/240807075426259908.png)
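
The stack status can also be checked from a terminal. This is a sketch, assuming the AWS CLI is installed and configured for the same account and region. The function names `stack_status` and `stack_is_ready` are illustrative.

```bash
# Hypothetical helper: prints the current status of a CloudFormation stack.
stack_status() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query 'Stacks[0].StackStatus' \
    --output text
}

# Hypothetical helper: succeeds only when the stack has finished creating.
stack_is_ready() {
  [ "$(stack_status "$1")" = "CREATE_COMPLETE" ]
}
```

For example, `stack_is_ready <stack-name> && echo "OpCenter stack is ready"` prints the message only once the status is `CREATE_COMPLETE`.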

* Click on the `Resources` tab and click on the `mvOpCenter` id to re-direct to the EC2 instance assigned to the MMC OpCenter

![](https://baywatch-api.memverge.com/media/upload/202408/240807075508326039.png)

* Next, click on the `Instance ID` for more details of the instance

![](https://baywatch-api.memverge.com/media/upload/202408/240807075659076667.png)

* Next, click on the open address link of `Public IPv4 address` to be re-directed to the OpCenter

![](https://baywatch-api.memverge.com/media/upload/202408/240807075735012857.png)

> NOTE: When logging in for the first time to the opcenter, you might see a warning about invalid security certificate. Click on `Advanced` -> `Accept the Risk and Continue`

![](https://baywatch-api.memverge.com/media/upload/202408/240807075838231038.png)

* The MMCloud OpCenter landing page will look as below.

> The opcenter can now be accessed via its public IP address: `https://<public-ipv4-address>/#/login`
> Bookmark this link for easy access

![](https://baywatch-api.memverge.com/media/upload/202408/240807080247340972.png)

> NOTE:
> default username: `admin`
> default password: `memverge`

---

# Change Admin Password

* We recommend changing the default username (`admin`) and password (`memverge`) for the opcenter after logging in for the first time.

* Click on `Users & Groups` from the left navigation menu and for the admin user click on `Actions` and change the password as necessary.

![](https://baywatch-api.memverge.com/media/upload/202408/240807080723259525.png)

![](https://baywatch-api.memverge.com/media/upload/202408/240807080855526736.png)

---

# Activate MMCloud License

* Next, activate your MMCloud license by clicking on the star icon in the top-right corner and entering your MMCloud account credentials

> Not registered? Visit www.mmcloud.io

![](https://baywatch-api.memverge.com/media/upload/202408/240807081016728068.png)

---

# Download FLOAT CLI

* The float CLI binary helps connect a user from any local machine to the MMCloud opcenter. To download the CLI, click on the terminal icon towards the top right corner of the opcenter and choose the version of CLI tool based on the operating system

![](https://baywatch-api.memverge.com/media/upload/202402/240207015753981576.png)

* From macOS 10.15 (Catalina) onwards, zsh is the default shell. In zsh, float is a reserved word. To use the float CLI with macOS, either change the shell to bash or alias the word float to the float binary, as follows:

```bash=
alias float=/path/to/float_binary/float
```
> where `/path/to/float_binary/` is the path to where you placed the float binary.

* To change the macOS default shell from `zsh` to `bash`, see the following [guide](https://phoenixnap.com/kb/change-zsh-to-bash-mac)
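
An alias typed at the prompt lasts only for the current session. To keep it, the alias line can be appended to your shell startup file. The sketch below does this without creating duplicates. The function name `add_float_alias` is hypothetical, and `~/.zshrc` in the usage example is assumed to be your zsh startup file.

```bash
# Hypothetical helper: appends the float alias to a startup file unless it is already there.
# Usage: add_float_alias <rc-file> <path-to-float-binary>
add_float_alias() {
  local rc_file="$1" float_bin="$2"
  local alias_line="alias float='$float_bin'"

  # Create the file if it does not exist yet.
  touch "$rc_file"

  # -x matches the whole line and -F treats it as a fixed string, so reruns are no-ops.
  grep -qxF -- "$alias_line" "$rc_file" || printf '%s\n' "$alias_line" >> "$rc_file"
}
```

For example, run `add_float_alias ~/.zshrc /path/to/float_binary/float` and open a new terminal for the alias to take effect.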

---

# Juiceflow - AWS

## Introduction

Juiceflow combines JuiceFS and Nextflow on MMCloud, offering a powerful, scalable solution for managing and executing workflows in the cloud.

<details>
<summary>Expand for more details on JuiceFS</summary>

[JuiceFS](https://juicefs.com/docs/community/introduction/) is an open-source, high-performance distributed file system designed specifically for cloud environments. It offers unique features, such as:

* Separation of Data and Metadata: JuiceFS stores files in chunks within object storage like Amazon S3, while metadata can be stored in various databases, including Redis.
* Performance: Achieves millisecond-level latency and nearly unlimited throughput, depending on the object storage scale.
* Easy Integration with MMCloud: MMCloud provides pre-configured nextflow head node templates with JuiceFS setup, simplifying deployment.
* Comparison with S3FS: For a detailed comparison between JuiceFS and S3FS, see [JuiceFS vs. S3FS](https://juicefs.com/docs/community/comparison/juicefs_vs_s3fs/). JuiceFS typically offers better performance and scalability.


</details>

---

## Pre-requisites

- A VPC Security Group with inbound-rule for port `6868`.
 - **Navigation:** `AWS EC2 console -> Network & Security -> Security Groups`

![](https://baywatch-api.memverge.com/media/upload/202406/240626212749399541.png)

> **Inbound rules should include:**
 - Custom TCP rule on port `6868`, used by the Redis server in this juiceflow setup.

- AWS S3 buckets for Nextflow work and output directories.
- AWS S3 access, secret keys & MMC OpCenter Credentials stored as float secrets
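
The inbound rule can also be added with the AWS CLI instead of the console. This is a sketch, assuming the AWS CLI is configured and that you know which source range should reach the Redis port (for example, your VPC CIDR). The function name `allow_redis_port` is hypothetical.

```bash
# Hypothetical helper: opens TCP port 6868 on a security group for one source range.
# Usage: allow_redis_port <security-group-id> <source-cidr>
allow_redis_port() {
  local group_id="$1" source_cidr="$2"

  aws ec2 authorize-security-group-ingress \
    --group-id "$group_id" \
    --protocol tcp \
    --port 6868 \
    --cidr "$source_cidr"
}
```

A narrow source range is usually safer here than `0.0.0.0/0`, since the port exposes the JuiceFS metadata server.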

---

## Overview of the Setup

This solution leverages three scripts:
- `transient_JFS_AWS.sh`: Formats the work directory S3 bucket to JuiceFS format.
- `hostTerminate_AWS.sh`: Allows the workflow to exit gracefully from the Nextflow head node.
- `job_submit_AWS.sh`: Contains Nextflow input parameters and MMC config.

---

## Steps

### Download Scripts

* Host-init script (you don't have to edit this script, but you'll need it later)

```bash!
wget https://mmce-data.s3.amazonaws.com/juiceflow/v1/aws/transient_JFS_AWS.sh
```

* Host-terminate script (you don't have to edit this script, but you'll need it later)

```bash!
wget https://mmce-data.s3.amazonaws.com/juiceflow/v1/aws/hostTerminate_AWS.sh
```

* Job-submit script (Download the template or create one locally based on the template below with your Nextflow inputs and run configurations)

```bash!
wget https://mmce-data.s3.amazonaws.com/juiceflow/v1/aws/job_submit_AWS.sh
```

<details>
<summary>Expand to view a sample `job_submit_AWS.sh` script</summary>

```bash!
#!/bin/bash

# ---- User Configuration Section ----
# These configurations must be set by the user before running the script.

# ---- Optional Configuration Section ----
# These configurations are optional and can be customized as needed.

# JFS (JuiceFS) Private IP: Retrieved from the WORKER_ADDR environment variable.
jfs_private_ip=$(echo $WORKER_ADDR)

# Work Directory: Defines the root directory for working files. Optional suffix can be added.
workDir_suffix=''
workDir='/mnt/jfs/'$workDir_suffix
mkdir -p $workDir # Ensures the working directory exists.
cd $workDir # Changes to the working directory.
export NXF_HOME=$workDir # Sets the NXF_HOME environment variable to the working directory.

# ------------------------------------------
# ---- vvv DO NOT EDIT THIS SECTION vvv ----
# ------------------------------------------

function install_float {
 # Install float
 local address=$(echo "$FLOAT_ADDR" | cut -d':' -f1)
 wget https://$address/float --no-check-certificate --quiet
 chmod +x float
}

function get_secret {
 input_string=$1
 local address=$(echo "$FLOAT_ADDR" | cut -d':' -f1)
 secret_value=$(./float secret get $input_string -a $address)
 if [[ $? -eq 0 ]]; then
 # Have this secret, will use the secret value
 echo $secret_value
 return
 else
 # Don't have this secret, will still use the input string
 echo $1
 fi
}

# Set Opcenter credentials
install_float 
access_key=$(get_secret AWS_BUCKET_ACCESS_KEY)
secret_key=$(get_secret AWS_BUCKET_SECRET_KEY)
export AWS_ACCESS_KEY_ID=$access_key
export AWS_SECRET_ACCESS_KEY=$secret_key

opcenter_ip_address=$(get_secret OPCENTER_IP_ADDRESS)
opcenter_username=$(get_secret OPCENTER_USERNAME)
opcenter_password=$(get_secret OPCENTER_PASSWORD)

# ------------------------------------------
# ---- ^^^ DO NOT EDIT THIS SECTION ^^^ ----
# ------------------------------------------

# ---- Nextflow Configuration File Creation ----
# This section creates a Nextflow configuration file with various settings for the pipeline execution.

# Use cat to create or overwrite the mmc.config file with the desired Nextflow configurations.
# NOTE: S3 keys and OpCenter information will be concatted to the end of the config file. No need to add them now

# Additionally, please add your STAGE MOUNT BUCKETS here
cat > mmc.config << EOF
// enable nf-float plugin.
plugins {
 id 'nf-float'
}

// Process settings: Executor, error strategy, and resource allocation specifics.
process {
 executor = 'float'
 errorStrategy = 'retry'
 extra = '--dataVolume [opts=" --cache-dir /mnt/jfs_cache "]jfs://${jfs_private_ip}:6868/1:/mnt/jfs --dataVolume [size=120]:/mnt/jfs_cache --vmPolicy [spotOnly=true,retryLimit=10,retryInterval=300s]'
}

// Directories for Nextflow execution.
workDir = '${workDir}'
launchDir = '${workDir}'

// OpCenter connection settings.
float {
 address = '${opcenter_ip_address}'
 username = '${opcenter_username}'
 password = '${opcenter_password}'
}

// AWS S3 Client configuration.
aws {
 client {
 maxConnections = 20
 connectionTimeout = 300000
 }
 accessKey = '${access_key}'
 secretKey = '${secret_key}'
}

EOF

# ---- Data Preparation ----
# Use this section to copy essential files from S3 to the working directory.

# For example, copy the sample sheet and params.yml from S3 to the current working directory.
# aws s3 cp s3://nextflow-input/samplesheet.csv .
# aws s3 cp s3://nextflow-input/scripts/params.yml .

# ---- Nextflow Command Setup ----
# Important: The -c option appends the mmc config file and soft overrides the nextflow configuration.

# Assembles the Nextflow command with all necessary options and parameters.
nextflow_command='nextflow run <nextflow-pipeline> \
-r <revision-number> \
-c mmc.config \
-params-file params.yml \
--input samplesheet.csv \
--outdir 's3://nextflow-output/rnaseq/' \
-resume '

# ---------------------------------------------
# ---- vvv DO NOT EDIT BELOW THIS LINE vvv ----
# ---------------------------------------------
# The following section contains functions and commands that should not be modified by the user.

# Create side script to tag head node - exits when properly tagged
cat > tag_nextflow_head.sh << EOF
#!/bin/bash

runname="\$(cat .nextflow.log 2>/dev/null | grep nextflow-io-run-name | head -n 1 | grep -oP '(?<=nextflow-io-run-name:)[^ ]+')"
workflowname="\$(cat .nextflow.log 2>/dev/null | grep nextflow-io-project-name | head -n 1 | grep -oP '(?<=nextflow-io-project-name:)[^ ]+')"

while true; do

 # Runname and workflowname will be populated at the same time
 # If the variables are populated and not tagged it, tag the head node
 if [ ! -z \$runname ]; then
 ./float modify -j "$(echo $FLOAT_JOB_ID)" --addCustomTag run-name:\$runname 2>/dev/null
 ./float modify -j "$(echo $FLOAT_JOB_ID)" --addCustomTag workflow-name:\$workflowname 2>/dev/null
 break
 fi

 runname="\$(cat .nextflow.log 2>/dev/null | grep nextflow-io-run-name | head -n 1 | grep -oP '(?<=nextflow-io-run-name:)[^ ]+')"
 workflowname="\$(cat .nextflow.log 2>/dev/null | grep nextflow-io-project-name | head -n 1 | grep -oP '(?<=nextflow-io-project-name:)[^ ]+')"

 sleep 1s

done
EOF

# Start tagging side-script
chmod +x ./tag_nextflow_head.sh
./tag_nextflow_head.sh &

# Start Nextflow run
$nextflow_command

if [[ $? -ne 0 ]]; then
 echo $(date): "Nextflow command failed."
 exit 1
else 
 echo $(date): "Nextflow command succeeded."
 exit 0
fi

```

</details>

### Job-Submit Script Adjustments

* Modify the `process.extra` within the `mmc.config` section to customize the `vmPolicy` for individual nextflow processes (Default policy is `spotOnly` in the job submit script. Adjust as needed e.g., `onDemand`, `spotFirst`). You may find more options when calling `float submit -h`:

```bash
--vmPolicy [spotOnly=true,retryLimit=10,retryInterval=300s]
```

For handling nextflow code repository, samplesheets and params-file, you have two options: download them or create them directly in the script. Here’s how to do both:

### Downloading Samplesheet and Params File

* Add download commands to the job-submit script to fetch the samplesheet and params file for Nextflow, replacing the example S3 paths below with your own:

```bash
# Download samplesheet
aws s3 cp s3://nextflow-input/samplesheet.csv .

# Download params file
aws s3 cp s3://nextflow-input/params.yml .
```

### Creating Samplesheet and Params File Directly in the Script

* Alternatively, users can create these files directly within the script using the `cat` command as shown below.

```bash
# Create samplesheet
cat > samplesheet.csv << EOF
sample,fastq_1,fastq_2,strandedness
AL_TO_rep01,s3://nextflow-input/Sample_1_L007_R1_001.fastq.gz,s3://nextflow-input/Sample_1_L007_R2_001.fastq.gz,auto
AL_TO_rep01,s3://nextflow-input/Sample_2_L008_R1_001.fastq.gz,s3://nextflow-input/Sample_2_L008_R2_001.fastq.gz,auto
AL_TO_rep01,s3://nextflow-input/Sample_3_L014_R1_001.fastq.gz,s3://nextflow-input/Sample_3_L014_R2_001.fastq.gz,auto
AL_TO_rep01,s3://nextflow-input/Sample_4_L009_R1_001.fastq.gz,s3://nextflow-input/Sample_4_L009_R2_001.fastq.gz,auto
EOF

# Create params file
cat > params.yml << EOF
multiqc_title: "rnaseq_multiqc"
fasta: "s3://nextflow-input/reference/Caenorhabditis_elegans.WBcel235.dna.toplevel.fa.gz"
gtf: "s3://nextflow-input/reference/Caenorhabditis_elegans.WBcel235.111.gtf.gz"
save_reference: true
remove_ribo_rna: true
skip_alignment: true
pseudo_aligner: "salmon"
EOF
```

---

* Download your nextflow code repository using `git clone` or copying from S3

> NOTE: if using private git repositories you can export your GITHUB_TOKEN before the nextflow_command

* Finally, ensure you customize your `nextflow_command` with specific pipeline requirements and save the changes:

```bash
nextflow_command='nextflow run <nextflow-pipeline> \
-r <revision-number> \
-c mmc.config \
-params-file params.yml \
--input samplesheet.csv \
--outdir 's3://nextflow-output/rnaseq/' \
-resume '
```

> Remember to replace placeholders with your specific pipeline details. 

---

## Float Submit

* Login to your MMCloud opcenter:

```bash!
float login -a <opcenter-ip-address> -u <user>
```
> Enter your password at the next prompt and you should see `Login succeeded!`

* Ensure you are using the latest version of OpCenter & float CLI:

```bash
float release upgrade --sync
```

* Make sure the following `float secret` names are set exactly as shown:

```bash
+-----------------------+
| NAME |
+-----------------------+
| OPCENTER_IP_ADDRESS |
| OPCENTER_USERNAME |
| OPCENTER_PASSWORD |
| AWS_BUCKET_ACCESS_KEY |
| AWS_BUCKET_SECRET_KEY |
+-----------------------+
```

* Useful `float secret` commands:

```bash
# to list stored secrets
float secret ls
# to set a secret
float secret set OPCENTER_IP_ADDRESS 192.0.1.2
# to unset a secret
float secret unset OPCENTER_IP_ADDRESS
```
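
Before submitting, it can help to confirm that none of the five secrets is missing. The sketch below reads the output of `float secret ls` on standard input and prints the names it cannot find. It assumes the table layout shown above, and the names `REQUIRED_SECRETS` and `missing_secrets` are hypothetical.

```bash
# Secret names expected by the job-submit script.
REQUIRED_SECRETS="OPCENTER_IP_ADDRESS OPCENTER_USERNAME OPCENTER_PASSWORD AWS_BUCKET_ACCESS_KEY AWS_BUCKET_SECRET_KEY"

# Hypothetical helper: prints each required secret name that is absent from the listing on stdin.
missing_secrets() {
  local listing name
  listing="$(cat)"

  for name in $REQUIRED_SECRETS; do
    # -w matches whole words only, so a longer name cannot satisfy a shorter one.
    printf '%s\n' "$listing" | grep -qw -- "$name" || printf '%s\n' "$name"
  done
}
```

Run it as `float secret ls | missing_secrets`. Empty output means all five names are present.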

### Float Submit Command

* Replace the placeholders `<work-bucket>`, `<region>`, `<security-group>`, and `<job-name>` with your specific values and execute the float submit command:

```bash!
float submit \
--hostInit transient_JFS_AWS.sh \
--hostTerminate hostTerminate_AWS.sh \
-i docker.io/memverge/juiceflow \
--vmPolicy '[onDemand=true]' \
--migratePolicy '[disable=true]' \
--dataVolume '[size=60]:/mnt/jfs_cache' \
--dirMap /mnt/jfs:/mnt/jfs \
-c 2 -m 4 \
-n <job-name> \
--securityGroup <security-group> \
--env BUCKET=https://<work-bucket>.s3.<region>.amazonaws.com \
-j job_submit_AWS.sh
```

Here's a brief explanation of the parameters used in the `float submit` command:

| Parameter | Brief Description |
|-----------|-------------------|
| `--hostInit transient_JFS_AWS.sh` | Shell script to run on the host before the job starts. |
| `--hostTerminate hostTerminate_AWS.sh` | Shell script to run on the host after the job has been cancelled |
| `-i docker.io/memverge/juiceflow` | Docker image for the job's software environment. |
| `--vmPolicy '[onDemand=true]'` | Uses on-demand VM instance for head-node execution. |
| `--migratePolicy '[disable=true]'` | Disables head-node migration to different hosts/VMs. |
| `--dataVolume '[size=60]:/mnt/jfs_cache'` | Attaches a 60GB data volume at `/mnt/jfs_cache` in the container. |
| `--dirMap /mnt/jfs:/mnt/jfs` | Maps a host directory to a container directory for data sharing. |
| `-c 2` | Allocates 2 CPU cores to the job. |
| `-m 4` | Allocates 4GB of memory to the job. |
| `-n <job-name>` | Assigns a name to the job for identification. |
| `--securityGroup <security-group>` | Applies a security group to the job's VM for network rules. |
| `--env BUCKET=https://<work-bucket>.s3.<region>.amazonaws.com` | Sets an environment variable for the S3 bucket URL. |
| `-j job_submit_AWS.sh` | Specifies the job script or command to run inside the container. |

---

### Float Submit Script

* To simplify job submission, create a shell submit script named `submit_nf_float_job.sh` with the float submit command shown above

```
JOB_SCRIPT=$1
JOB_NAME=$2
JUICEFS_BUCKET=$3
PREVIOUS_JOB_ID=$4

if [ -z "$JOB_SCRIPT" ]; then
 echo "JOB_SCRIPT is not set"
 exit 1
fi

if [ -z "$JOB_NAME" ]; then
 echo "JOB_NAME is not set"
 exit 1
fi

if [ -z "$JUICEFS_BUCKET" ]; then
 echo "JUICEFS_BUCKET is not set"
 exit 1
fi

if [ -z "$PREVIOUS_JOB_ID" ]; then
 SHOULD_RESUME=""
else
 SHOULD_RESUME="--env PREVIOUS_JOB_ID="$PREVIOUS_JOB_ID
fi


float submit \
 --hostInit transient_JFS_AWS.sh \
 --hostTerminate hostTerminate_AWS.sh \
 -i docker.io/memverge/juiceflow:v2 \
 --vmPolicy '[onDemand=true]' \
 --migratePolicy '[disable=true]' \
 --dirMap /mnt/jfs:/mnt/jfs \
 --dataVolume '[size=60]:/mnt/jfs_cache' \
 -c 4 -m 8 \
 --securityGroup <security-group> \
 --env BUCKET=https://${JUICEFS_BUCKET}.s3.<s3-bucket-region>.amazonaws.com \
 $SHOULD_RESUME \
 -n $JOB_NAME \
 -j $JOB_SCRIPT

```

* Submit command

```
./submit_nf_float_job.sh <job-submission-script.sh> <job-name> <s3-bucket-name>
```

* Submit command to resume with previous job-id

```
./submit_nf_float_job.sh <job-submission-script.sh> <job-name> <s3-bucket-name> <previous-job-id>
```

---


### Mounting S3 buckets as Data Volumes

* You can mount the input data bucket using S3FS as a data volume on the Nextflow head node and worker nodes as follows:

```
--storage <storage-name>
```

* Read more on [Juiceflow Performance Optimization](https://www.mmcloud.io/resources/docs/juiceflow-performance-optimization-aws)


---

## Important Considerations

- **JuiceFS Bucket Requirements**: JuiceFS requires formatting at the root level of a storage bucket. It cannot be formatted on a sub-directory within a bucket. Ensure the root directory of the bucket, or the entire bucket, is specified in the command line interface (CLI) command.
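
One way to guard against passing a sub-directory by mistake is to build the `BUCKET` value from a bare bucket name and a region. This is a sketch under that assumption. The function name `juicefs_bucket_url` is hypothetical, and the URL pattern is the one used in the `float submit` commands in this guide.

```bash
# Hypothetical helper: builds the BUCKET URL and rejects anything that looks like a path.
# Usage: juicefs_bucket_url <bucket-name> <region>
juicefs_bucket_url() {
  local bucket="$1" region="$2"

  if [ -z "$bucket" ] || [ -z "$region" ]; then
    echo "usage: juicefs_bucket_url <bucket-name> <region>" >&2
    return 2
  fi

  # A slash means a prefix, sub-directory, or s3:// URI was passed instead of a bucket name.
  case "$bucket" in
    */*)
      echo "error: expected a bare bucket name, got a path: $bucket" >&2
      return 1
      ;;
  esac

  printf 'https://%s.s3.%s.amazonaws.com\n' "$bucket" "$region"
}
```

The result can be passed directly to the submit command, for example `--env BUCKET="$(juicefs_bucket_url <work-bucket> <region>)"`.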

---

### Cancelling a Running Workflow to Resume Later

There are many circumstances where you might need to cancel a running Nextflow workflow and resume it later using Nextflow's `-resume` feature, such as changes to the configuration or resource specifications.

* To cancel a workflow in the JuiceFlow setup, cancel the head node job. This causes the head node host to generate a `<JOB_ID>.meta.json.gz` file in the S3 work bucket, where `JOB_ID` is the job ID of the head node.

* You can find the `<JOB_ID>.meta.json.gz` and the status of metadata dump in the `job.events` log in `Attachments` of the head node job

* This `<JOB_ID>.meta.json.gz` file is critical for restoring the work directory in subsequent attempts via the environment variable `PREVIOUS_JOB_ID`.

---

### Resuming Workflows with the Job-Submit Script

JuiceFlow supports the `-resume` option with Nextflow. Each workflow execution generates a `<JOB_ID>.meta.json.gz` file in the S3 work bucket. To restore the work directory in a later attempt, pass the job ID of the earlier run in the environment variable `PREVIOUS_JOB_ID`.

> Example command
```
float submit \
--hostInit transient_JFS_AWS.sh \
--hostTerminate hostTerminate_AWS.sh \
-i docker.io/memverge/juiceflow \
--vmPolicy '[onDemand=true]' \
--migratePolicy '[disable=true]' \
--dataVolume '[size=60]:/mnt/jfs_cache' \
--dirMap /mnt/jfs:/mnt/jfs \
-c 2 -m 4 \
-n <job-name> \
--securityGroup <security-group> \
--env PREVIOUS_JOB_ID=<PREVIOUS_JOB_ID> \
--env BUCKET=https://<work-bucket>.s3.<region>.amazonaws.com \
-j job_submit_AWS.sh
```
 
---


## Monitoring on OpCenter

### Workflow Execution Log

To monitor workflow execution and get a detailed view of each process in the Nextflow workflow:

* Click on the `Workflows` dashboard in the OpCenter:

![](https://baywatch-api.memverge.com/media/upload/202406/240626214812100945.png)

* Click on the workflow name, and you can monitor the jobs running in this workflow in this consolidated view:

![](https://baywatch-api.memverge.com/media/upload/202406/240626214854210539.png)

* Once the head-node job starts `Executing`, you can monitor the Nextflow stdout by clicking on the job -> `Attachments` -> `stdout.autosave`:

![](https://baywatch-api.memverge.com/media/upload/202406/240626214616545791.png)

### Individual Job Logs

To view the execution log of any particular job:

* Click on the Job-ID.
* Navigate to the `Attachments` tab.
* Click `view` (eye-icon) on the `stdout.autosave` file.

![](https://baywatch-api.memverge.com/media/upload/202406/240626215440890739.png)

![](https://baywatch-api.memverge.com/media/upload/202406/240626215520824360.png)

---

## Create Job Templates to Launch via MMCloud GUI

Job templates make it easy to customize and launch runs that follow a similar format, without having to set up a command manually every time. You must submit one job first before you can save it as a template.

* From the `Jobs` dashboard, select the head node job previously submitted above and click on `More Actions` -> `Save as Template`:

![](https://baywatch-api.memverge.com/media/upload/202406/240626215758509556.png)

* Provide a name and tag for the private template:

![](https://baywatch-api.memverge.com/media/upload/202406/240626215842168590.png)

* Navigate to the `Job Templates` dashboard and click on `Private` templates:

![](https://baywatch-api.memverge.com/media/upload/202406/240626220028284894.png)

* You can click on any job template, edit/change samplesheet, variables etc., and submit new jobs from the GUI.

---

> * Users can also submit jobs from templates via CLI
>```bash!
>float submit --template private::<template-name>:<template-tag> \
>-e BUCKET=s3://<aws-jfs-bucket>
>```
> Additionally, keep in mind the settings that must be updated for each run if they deviate from the default values in the private template. These mainly include:
> * S3 Bucket URL
> * Updating of the nextflow run command in the job script

---

# Persistent Nextflow JuiceFS

## Introduction to JuiceFS

[JuiceFS](https://juicefs.com/docs/community/introduction/) is an open-source, high-performance distributed file system designed specifically for cloud environments. It offers unique features, such as:

- **Separation of Data and Metadata**: JuiceFS stores files in chunks within object storage like Amazon S3, while metadata can be stored in various databases, including Redis.
- **Performance**: Achieves millisecond-level latency and nearly unlimited throughput, depending on the object storage scale.
- **Easy Integration with MMCloud**: MMCloud provides pre-configured nextflow head node templates with JuiceFS setup, simplifying deployment.
- **Comparison with S3FS**: For a detailed comparison between JuiceFS and S3FS, see [JuiceFS vs. S3FS](https://juicefs.com/docs/community/comparison/juicefs_vs_s3fs/). JuiceFS typically offers better performance and scalability.

---

## Pre-requisites for Using JuiceFS with Nextflow

Before you begin, ensure you meet the following pre-requisites:

- A VPC Security Group with inbound-rule for port `6868`.
 - **Navigation:** `AWS EC2 console -> Network & Security -> Security Groups`

![](https://baywatch-api.memverge.com/media/upload/202406/240626212749399541.png)

> **Inbound rules should include:**
 - Custom TCP rule on port `6868`, used by the Redis server in this juiceflow setup.

- **Create a New S3 Bucket**:
 - Cloud storage is required for JuiceFS's backend storage.
 - Create a new S3 bucket and note its region, as this will be needed later.
 - Path: *AWS S3 console -> Create Bucket*
 - ![S3 Bucket Creation](https://baywatch-api.memverge.com/media/upload/202312/231208005848879261.png)

---

# Deployment Steps for Individual Users on MMCloud

## Float Login

Ensure you are using the latest version of `float` CLI:

```bash
sudo float release sync
```

Login to your MMCloud opcenter:

```bash
float login -a <opcenter-ip-address> -u <user>
```

> After entering your password, verify that you see `Login succeeded!`

## Float Secret

Set your AWS credentials as secrets in `float`:

```bash
float secret set AWS_BUCKET_ACCESS_KEY <BUCKET_ACCESS_KEY>
float secret set AWS_BUCKET_SECRET_KEY <BUCKET_SECRET_KEY>
```

To verify the secrets:

```bash
float secret ls
```

> **Expected output**:
>
> ```bash
> +-----------------------+
> | NAME |
> +-----------------------+
> | AWS_BUCKET_ACCESS_KEY |
> | AWS_BUCKET_SECRET_KEY |
> +-----------------------+
> ```

## Deploy Nextflow Head Node

Deploy the Nextflow head node using the `nextflow:jfs` template:

```bash
float submit -n <head-node-name> \
--template nextflow:jfs \
--securityGroup <security-group> \
-e BUCKET=https://<bucket-name>.s3.<bucket-region>.amazonaws.com \
--env BUCKET_ACCESS_KEY={secret:AWS_BUCKET_ACCESS_KEY} \
--env BUCKET_SECRET_KEY={secret:AWS_BUCKET_SECRET_KEY}
```

> **Note**: Replace `<head-node-name>`, `<security-group>`, `<bucket-name>`, and `<bucket-region>` with your specific details. The `nextflow:jfs` template comes pre-configured with JFS setup.

---

### Overriding Template Defaults

#### Customizing CPU and Memory
To override default CPU and memory settings:

```bash
--overwriteTemplate "*" -c <number-of-cpus> -m <memory-in-gb>
```

> **Example**: To set 8 CPUs and 32GB memory
> ```bash
> --overwriteTemplate "*" -c 8 -m 32
> ```

#### Specifying a Subnet
For deploying in a specific AWS [subnet](https://docs.aws.amazon.com/vpc/latest/userguide/configure-subnets.html):

```bash
--overwriteTemplate "*" --subnet <SUBNET-ID>
```
---

### Mounting S3 buckets as Data Volumes

* You can mount the input data bucket as a data volume (using S3FS) on the Nextflow head node and worker nodes as follows:

```
--dataVolume [mode=r,accesskey=xxx,secret=xxx,endpoint=s3.REGION.amazonaws.com]s3://BUCKET_NAME:/staged-files
```

* Read more on [Juiceflow Performance Optimization](https://www.mmcloud.io/resources/docs/juiceflow-performance-optimization-aws)

---

#### Incremental Snapshot Feature (From v2.4)
Incremental snapshots enable faster checkpointing but require more storage:

```bash
--overwriteTemplate "*" --dumpMode incremental
```

> **Advantages**:
> - Supports larger workloads
> - Lower impact on job running time
> - No need for periodic snapshot interval configuration
>
> **Disadvantages**:
> - Requires larger storage for delta saves
> - Final snapshot is necessary for restore

#### Checking Head Node Deployment Status

```bash
float list
```

> **Example Output**:
> ```bash
> +-----------------------+----------------+------------------------------------+---------+-----------+----------------------+------------+-------------+
> | ID                    | NAME           | WORKING HOST                       | USER    | STATUS    | SUBMIT TIME          | DURATION   | COST        |
> | NlygkM1dA0qIucncPwjgD | jfs-head-1     | 54.211.194.151 (2Core4GB/OnDemand) | sateesh | Executing | 2023-11-02T17:40:06Z | 6m54s      | 0.0049 USD  |
> +-----------------------+----------------+------------------------------------+---------+-----------+----------------------+------------+-------------+
> ```
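
The head node's public IP address appears in the `WORKING HOST` column of this output. As a rough sketch, the table can be parsed in Python to pull out that address for a given job name. The helper names `parse_float_list` and `head_node_ip` are hypothetical, and the sketch assumes the output keeps the pipe-delimited layout of the example above.

```python
SAMPLE = """\
| ID | NAME | WORKING HOST | USER | STATUS | SUBMIT TIME | DURATION | COST |
| NlygkM1dA0qIucncPwjgD | jfs-head-1 | 54.211.194.151 (2Core4GB/OnDemand) | sateesh | Executing | 2023-11-02T17:40:06Z | 6m54s | 0.0049 USD |
"""


def parse_float_list(text):
    """Turn the pipe-delimited table into a list of dicts keyed by column name."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")  # skip the +-----+ border lines
    ]
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def head_node_ip(text, name):
    """Return the IP address from the WORKING HOST column for the named job."""
    for job in parse_float_list(text):
        if job.get("NAME") == name:
            host = job.get("WORKING HOST", "").split()
            return host[0] if host else None
    return None


print(head_node_ip(SAMPLE, "jfs-head-1"))
```

A job that has not been assigned a host yet yields `None`, which is a convenient signal to wait and query again.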

#### SSH into Head Node

- Locate the public IP address of the head node in the `Working Host` column.
- Retrieve the SSH key from Float's secret manager:

```bash
float secret get <job-id>_SSHKEY > <head-node-name>-ssh.key
```

> **Note**: If you encounter a `Resource not found` error, wait a few more minutes for the head node and SSH key to initialize.

- Set the appropriate permissions for the SSH key:

```bash
chmod 600 <head-node-name>-ssh.key
```

---

## SSH to Nextflow JFS Head Node

SSH into the Nextflow head node using the provided SSH key, username, and the head node's public IP address:

```bash
ssh -i <head-node-name>-ssh.key <user>@<head-node-public-ip-address>
```

> **Note**: Use the username `nextflow` to log in as admin.

## MMC nf-float configuration

### Editing the configuration file

1. Copy the template and edit the configuration file:

```bash
cp mmcloud.config.template mmc-jfs.config
vi mmc-jfs.config
```

> **Note**: If you're new to using `vi`, check out this [Beginner's Guide to Vi](https://www.howtogeek.com/102468/a-beginners-guide-to-editing-text-files-with-vi/) for basic instructions.

2. The `mmc-jfs.config` file copied from the template will be pre-filled with the OpCenter IP address, and the PRIVATE IP address of the Nextflow head node. **You only need to provide your OpCenter username, password and AWS access and secret keys in the config**.


```groovy
plugins {
    id 'nf-float'
}

workDir = '/mnt/jfs/nextflow'

process {
    executor = 'float'
    errorStrategy = 'retry'
    extra = ' --dataVolume [opts=" --cache-dir /mnt/jfs_cache "]jfs://172.31.41.331:6868/1:/mnt/jfs --dataVolume [size=120]:/mnt/jfs_cache'
    /*
    For tasks such as Qualimap that issue many very small I/O requests, adding -o writeback_cache can improve performance. For example:
    withName: "QUALIMAP_RNASEQ" {
        extra = ' --dataVolume [opts=" --cache-dir /mnt/jfs_cache -o writeback_cache"]jfs://172.31.41.331:6868/1:/mnt/jfs --dataVolume [size=120]:/mnt/jfs_cache'
    }
    */
}

podman.registry = 'quay.io'

float {
    address = '172.31.42.11:443'
    username = '<your_user_name>'
    password = '<your_password>'
}

// AWS access info if needed
aws {
    client {
        maxConnections = 20
        connectionTimeout = 300000
    }
    accessKey = '<bucket_access_key>'
    secretKey = '<bucket_secret_key>'
}
```

### Using Tmux

Start a tmux session named `nextflow`:

```bash
tmux new -s nextflow
```

> To attach to an existing tmux session:
>
> ```bash
> tmux attach -t nextflow
> ```

> **Tip**: If you're new to `tmux`, here's a handy [Tmux Cheat Sheet](https://tmuxcheatsheet.com/).

### Nextflow Version Check

Check the Nextflow version and update if necessary:

```bash
nextflow -v
```

> **Example Output**:
>
> ```bash
> nextflow version 23.10.0.5889
> ```

---

## Launch Nextflow

Launch a Nextflow or `nf-core/<pipeline>` pipeline by providing the MMC config file:

```bash
nextflow run nf-core/<pipeline> \
-profile test_full \
-c mmc-jfs.config \
--outdir s3://nextflow-work-dir/<pipeline>
```

---

## Head Node Management

- In this persistent head node setup, the user is responsible for disposing of the head node. To cancel the head node job, click the cancel button for the job as shown below.

![](https://baywatch-api.memverge.com/media/upload/202406/240626221259289772.png)

![](https://baywatch-api.memverge.com/media/upload/202406/240626221324725332.png)


---

## FAQ

**Q**: Why does `nf-core/rnaseq` fail at the QUALIMAP step?

**A**: The QUALIMAP process in `nf-core/rnaseq` often fails due to its high-frequency, small-size write requests, leading to timeouts. Enabling `-o writeback_cache` consolidates these requests and improves performance significantly. However, it turns sequential writes into random writes, affecting sequential write performance. Use this setting only in scenarios with intensive random writes.

Add the following in the `process {}` scope of your config:

```groovy
withName: "QUALIMAP_RNASEQ" {
    extra = ' --dataVolume [opts=" --cache-dir /mnt/jfs_cache -o writeback_cache"]jfs://<head-node-private-ip>:6868/1:/mnt/jfs --dataVolume [size=120]:/mnt/jfs_cache'
}
```
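
Because the `extra` string is long and easy to mistype, it can help to generate it. The sketch below is only an illustration in Python: the function name `jfs_extra` is hypothetical, the port `6868`, the paths, and the default cache size of 120 are copied from the examples in this guide, and the IP check is an added assumption meant to catch typos such as an octet above 255.

```python
import ipaddress


def jfs_extra(head_node_private_ip, writeback_cache=False, cache_size_gb=120):
    """Build the value of `extra` for a process that mounts the JFS volume."""
    # Raises ValueError if the address is not a valid IP.
    ipaddress.ip_address(head_node_private_ip)
    opts = " --cache-dir /mnt/jfs_cache"
    # Mirror the two templates above: writeback option, or a trailing space.
    opts += " -o writeback_cache" if writeback_cache else " "
    return (
        f' --dataVolume [opts="{opts}"]'
        f"jfs://{head_node_private_ip}:6868/1:/mnt/jfs"
        f" --dataVolume [size={cache_size_gb}]:/mnt/jfs_cache"
    )


# 10.0.0.5 is a placeholder private IP for illustration.
print(jfs_extra("10.0.0.5", writeback_cache=True))
```

The returned string can be pasted between the single quotes of the `extra` setting for the `QUALIMAP_RNASEQ` selector.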

---

## Additional Reading

### Data Volumes

For jobs that generate file system I/O, specifying data volumes is essential. The OpCenter supports a variety of data volume types. Learn more about configuring data volumes in MMCloud in the [MMCloud Data Volumes Guide](https://docs.memverge.com/MMCloud/latest/User%20Guide/Working%20with%20OpCenter/datavolumes/).

### Allow / Deny Instance Types

Configure which instance types are allowed or denied in your setup. 

![Allow/Deny Instance Types](https://baywatch-api.memverge.com/media/upload/202312/231208013813787510.png)

See detailed docs on [Controlling Virtual Machine Types](https://docs.memverge.com/MMCloud/latest/User%20Guide/Working%20with%20OpCenter/allowlistforvm/)

---

# Instance Setup

1. Launch an EC2 instance. From the EC2 console, click on Launch Instance.
![](https://baywatch-api.memverge.com/media/upload/202408/240802211531904303.png)

2. For the AMI, search for `amzn2-ami-kernel-5.10-hvm-2.0.20230628.0-x86_64-gp2`.
![](https://baywatch-api.memverge.com/media/upload/202408/240802211541058090.png)

3. For the instance type, choose `t2.medium` and provide a key pair.
![](https://baywatch-api.memverge.com/media/upload/202408/240802211549358643.png)

4. Under Network Settings, select the existing default security group.
![](https://baywatch-api.memverge.com/media/upload/202408/240802211556479325.png)

5. Under Configure Storage, a root volume of 8 GB is sufficient. Click on Launch Instance to proceed.
![](https://baywatch-api.memverge.com/media/upload/202408/240802211604440390.png)



# MMC CLI Setup
Assuming your OpCenter is set up, install the CLI:
```
wget https://<op_center_ip_address>/float --no-check-certificate
sudo mv float /usr/local/bin/
sudo chmod +x /usr/local/bin/float
```

Connect to your OpCenter by logging in:
```
float login -a <op_center_ip_address> -u <username> -p <password>
```

# Cromwell Setup
### Install Java

```
curl -s "https://get.sdkman.io" | bash
source "/home/ec2-user/.sdkman/bin/sdkman-init.sh"
sdk install java 17.0.6-tem
java -version
```
### Install Cromwell
```
wget https://github.com/broadinstitute/cromwell/releases/download/84/cromwell-84.jar

# Check version
java -jar cromwell-84.jar --version
```

# Config file
Name your file `cromwell-float.conf`. Make sure you update the address to your OpCenter address
```
# This is an example of how you can use Cromwell to interact with float.

backend {
  default = float

  providers {
    float {
      actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
      config {
        runtime-attributes="""
        String f_cpu = "2"
        String f_memory = "4"
        String f_docker = ""
        String f_extra = ""
        """

        # If an 'exit-code-timeout-seconds' value is specified:
        # - check-alive will be run at this interval for every job
        # - if a job is found to be not alive, and no RC file appears after this interval
        # - Then it will be marked as Failed.
        # Warning: If set, Cromwell will run 'check-alive' for every job at this interval
        exit-code-timeout-seconds = 30

        submit = """
        mkdir -p ${cwd}/execution
        echo "set -e" > ${cwd}/execution/float-script.sh
        echo "cd ${cwd}/execution" >> ${cwd}/execution/float-script.sh
        tail -n +22 ${script} > ${cwd}/execution/no-header.sh
        head -n $(($(wc -l < ${cwd}/execution/no-header.sh) - 14)) ${cwd}/execution/no-header.sh >> ${cwd}/execution/float-script.sh

        float submit -i ${f_docker} -j ${cwd}/execution/float-script.sh --cpu ${f_cpu} --mem ${f_memory} ${f_extra} > ${cwd}/execution/sbatch.out 2>&1
        cat ${cwd}/execution/sbatch.out | sed -n 's/id: \(.*\)/\1/p' > ${cwd}/execution/job_id.txt
        echo "receive float job id: "
        cat ${cwd}/execution/job_id.txt

        JOB_SCRIPT_DIR=float-jobs/$(cat ${cwd}/execution/job_id.txt)
        mkdir -p $JOB_SCRIPT_DIR
        cd $JOB_SCRIPT_DIR
 
# create the check alive script
cat <<EOF > float-check-alive.sh
SCRIPT_DIR=$(pwd)
cd ${cwd}/execution
float show -j \$1 --runningOnly > job-status.yaml
if [[ -s job-status.yaml ]]; then
 cat job-status.yaml
else
 float show -j \$1 | grep rc: | tr -cd '[:digit:]' > rc
 float log cat -j \$1 stdout.autosave > stdout
 float log cat -j \$1 stderr.autosave > stderr
fi
cd $SCRIPT_DIR
EOF

# create the kill script
cat <<EOF > float-kill.sh
SCRIPT_DIR=$(pwd)
cd ${cwd}/execution
float scancel -f -j \$1
cd $SCRIPT_DIR
EOF

        cat ${cwd}/execution/sbatch.out
        """

        kill = """
        source float-jobs/${job_id}/float-kill.sh ${job_id}
        """

        check-alive = """
        source float-jobs/${job_id}/float-check-alive.sh ${job_id}
        """

        job-id-regex = "id: (\\w+)\\n"
      }
    }
  }
}

```
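
The `job-id-regex` setting tells Cromwell how to find the job ID in the text printed by the submit script. The Python sketch below shows what that pattern captures; after HOCON unescaping, `"id: (\\w+)\\n"` is the regular expression `id: (\w+)\n`. The sample submit output is an assumption for illustration (only the `id:` line is implied by the config above), and `extract_job_id` is a hypothetical helper.

```python
import re

# Same pattern as job-id-regex, written as a Python raw string.
JOB_ID_REGEX = re.compile(r"id: (\w+)\n")


def extract_job_id(submit_output):
    """Return the first job ID found in the submit output, or None."""
    match = JOB_ID_REGEX.search(submit_output)
    return match.group(1) if match else None


# Hypothetical output; only the "id: ..." line matters to the pattern.
sample_output = "id: NlygkM1dA0qIucncPwjgD\nname: float-script\n"
print(extract_job_id(sample_output))
```

If the pattern does not match, Cromwell cannot track the job, so checking the regex against real `sbatch.out` content is a quick way to debug submissions that never leave the queued state.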
# S3 Bucket Setup
Follow the directions in this section: [Setup Nextflow host on AWS - HackMD](https://hackmd.io/@speri/rkmPkmP52#3-Create-Nextflow-work-directory-S3-bucket)

Once you have created your bucket and access keys, install s3fs:
```
sudo yum install automake fuse fuse-devel gcc-c++ git libcurl-devel libxml2-devel make openssl-devel
git clone https://github.com/s3fs-fuse/s3fs-fuse.git
cd s3fs-fuse
./autogen.sh
./configure --prefix=/usr --with-openssl
make
sudo make install
```
Create a password file at `~/.passwd-s3fs` containing your access key and secret key in the form:
```
access_key:secret_key
```
Restrict the file's permissions to 600:
```
chmod 600 ~/.passwd-s3fs
```
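
If you prefer to script this step, the password file can be written with the restrictive mode applied from the start, so the keys are never readable by other users. This is a minimal Python sketch; the function name `write_s3fs_passwd` is hypothetical, and the target path is whatever you pass as `passwd_file` to `s3fs`.

```python
import os


def write_s3fs_passwd(path, access_key, secret_key):
    """Write `access_key:secret_key` to path, readable only by the owner."""
    # Create the file with mode 600 so it is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{access_key}:{secret_key}\n")
    # Also tighten permissions if the file already existed with a looser mode.
    os.chmod(path, 0o600)
```

For example, `write_s3fs_passwd(os.path.expanduser("~/.passwd-s3fs"), "<access_key>", "<secret_key>")` produces the file expected by the mount command.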
Mount your bucket to your designated mountpoint. If you are mounting to a directory that requires root privileges to access, you will need to use `sudo` to mount
```
s3fs BUCKET /MOUNTPOINT -o rw,allow_other -o multipart_size=52 -o parallel_count=30 -o passwd_file=~/.passwd-s3fs
```
If you plan on using an s3 bucket with your workflow, please update the `f_extra` line in the config:
```
String f_extra = "--dataVolume [accesskey=XXX,secret=XXX,mode=rw]s3://BUCKET:/MOUNTPOINT"
```

# Hello World (Read from Bucket)
### Create `hello.wdl`
```
workflow helloWorld {
  String name
  call sayHello { input: name=name }
}

task sayHello {
  String name

  command {
    printf "[cromwell-say-hello] hello to ${name} on $(date)\n"
    sleep 30
  }
  output {
    String out = read_string(stdout())
  }
  runtime {
    f_docker: "cactus"
  }
}
```
### Create `hello.json` in your bucket
```
{
 "helloWorld.name": "Developer"
}
```
### Run command (edit for your corresponding mountpoint)
```
java -Dconfig.file=cromwell-float.conf -jar \
 cromwell-84.jar run hello.wdl \
 --inputs /MOUNTPOINT/hello.json
```
A successful workflow ends with output similar to the snippet below. You may ignore any text that appears afterwards; the most important part is the `Succeeded` status.
```
[INFO] [11/27/2023 18:15:23.336] [cromwell-system-akka.dispatchers.engine-dispatcher-30] [akka://cromwell-system/user/SingleWorkflowRunnerActor] SingleWorkflowRunnerActor workflow finished with status 'Succeeded'.
{
 "outputs": {
 "helloWorld.sayHello.out": "[cromwell-say-hello] hello to Developer on Mon Nov 27 18:12:55 UTC 2023"
 },
 "id": "a1f0606e-367f-43a5-9381-e4ebe09ffbcf"
}
[2023-11-27 18:15:24,91] [info] Workflow polling stopped
```

# Sequence Workflow (Write to Bucket)
### Create `seq.wdl` (edit for your corresponding mountpoint)
```
workflow myWorkflow {
  call sayHello
  call writeReadFile { input: s=sayHello.out }
}

task sayHello {
  command {
    printf "[cromwell-say-hello] hello from $(whoami) on $(date)"
    sleep 30
  }
  output {
    String out = read_string(stdout())
  }
  runtime {
    f_docker: "cactus"
    f_cpu: "2"
    f_memory: "4"
  }
}

task writeReadFile {
  String s

  command {
    printf "[cromwell-write-read-file] write input to a file: ${s}\n" > /MOUNTPOINT/my_file.txt
    cat /MOUNTPOINT/my_file.txt
  }
  output {
    String out = read_string(stdout())
  }
  runtime {
    f_docker: "cactus"
  }
}
```
### Command
```
java -Dconfig.file=cromwell-float.conf -jar cromwell-84.jar run seq.wdl
```
You should see `my_file.txt` created and populated in your bucket.

# Deploying MMCloud OpCenter in AWS Non-default VPC

By Hui Chen, Yuchen Liu, Jing Gong

When EC2 resources are provisioned for the first time in an AWS account, a default VPC is created automatically. In the default VPC, an Internet gateway and public subnets with corresponding routing tables are pre-configured, public IP addresses are assigned to VMs by default, and access to the Internet is enabled automatically. As such, deploying MMCloud OpCenter in the AWS default VPC is straightforward if the security posture of the default VPC is acceptable. However, in enterprise environments, it is often advantageous or necessary to run MMCloud workloads in a dedicated non-default VPC. Such a setup allows granular security enforcement and ensures that MMCloud does not affect production environments.

Deploying MMCloud OpCenter in a non-default VPC requires careful planning so as to ensure proper communication between the various MMCloud components. In a non-default VPC, customers may have both public IP subnets and private IP subnets. Public IP assignment normally is disabled for VMs, and most likely access to the internet will be disabled in the private subnets. For this kind of environment some extra cloud services need to be configured for MMCloud to work properly.

This document presents the procedure for deploying MMCloud OpCenter in AWS using a dedicated non-default VPC with typical enterprise security restrictions.

## Step 1: Create VPC for MMCloud

From the VPC dashboard, click on VPCs, then select Create VPC. Configure the VPC settings as follows:

- 1.1 Resources to create: select VPC and more.
- 1.2 Name tag auto-generation: keep Auto-generate enabled (the default) and enter a custom name tag.
- 1.3 IPv4 CIDR block: set the CIDR block for the VPC.
- 1.4 IPv6 CIDR block: keep the default selection, No IPv6 CIDR block.
- 1.5 Tenancy: keep Default.
- 1.6 Number of Availability Zones (AZs): select 2 for high availability.
- 1.7 Number of public subnets: select 2.
- 1.8 Number of private subnets: select 2 or 4 according to your needs.
- 1.9 NAT gateways ($): select according to your needs (None by default).
- 1.10 VPC endpoints: select S3 Gateway.
- 1.11 DNS options: keep the default settings for Enable DNS hostnames and Enable DNS resolution.
- 1.12 Click on Create VPC to create the new non-default VPC.
 
![](https://baywatch-api.memverge.com/media/upload/202402/240227005041509145.png)
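
When choosing the IPv4 CIDR block in 1.3, it is worth checking that it can hold the public and private subnets requested in 1.7 and 1.8. The following Python sketch uses the standard `ipaddress` module for that planning step. The `10.0.0.0/16` block and the `/20` subnet size are assumptions for illustration, `plan_subnets` is a hypothetical helper, and the ranges the AWS console actually proposes may differ.

```python
import ipaddress


def plan_subnets(vpc_cidr, public=2, private=2, new_prefix=20):
    """Split a VPC CIDR block into equally sized public and private subnets."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = list(vpc.subnets(new_prefix=new_prefix))
    needed = public + private
    if len(blocks) < needed:
        raise ValueError(
            f"{vpc_cidr} cannot hold {needed} /{new_prefix} subnets"
        )
    return {
        "public": [str(block) for block in blocks[:public]],
        "private": [str(block) for block in blocks[public:needed]],
    }


# Example: 2 public and 2 private /20 subnets inside an assumed /16 VPC.
print(plan_subnets("10.0.0.0/16"))
```

Raising the `private` count to 4 models the larger layout mentioned in 1.8 without changing the VPC block.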
 
## Step 2: Deploy MMCloud OpCenter in a public subnet of the newly created VPC

Follow the Quick Start guide for deploying MMCloud OpCenter on AWS to set up the OpCenter in a public subnet of the VPC created in Step 1.

This step allows users to access the OpCenter from the public Internet and allows the OpCenter to communicate with the MMCloud License Server over the Internet.

## Step 3: Create EC2 endpoints for private subnets

This step allows the OpCenter as well as the MMCloud worker nodes to access AWS services.

From the VPC dashboard, click on Endpoints, then select Create endpoint. Configure the endpoint settings as follows:

- 3.1 Name tag: enter a custom name tag.
- 3.2 Service category: click on AWS services, search for the EC2 service name, and select it as the endpoint service.
- 3.3 VPC: select the VPC created in Step 1 as the VPC in which to create your endpoint.
- 3.4 Subnets: for every Availability Zone in the VPC, select the private subnet IDs.
- 3.5 IP address type: keep the default IPv4 setting.
- 3.6 Security groups: select the default group and the two security groups created for your MMCloud OpCenter and WorkerNodes.
- 3.7 Policy: keep the default Full access selection.
- 3.8 Tags: keep the default setting.
- 3.9 Click on Create endpoint to create the new EC2 endpoint. Wait until the endpoint is fully ready.
- 3.10 Open the OpCenter GUI. (It may take several minutes for the OpCenter to be ready.)
 
![](https://baywatch-api.memverge.com/media/upload/202402/240227005055368761.png)
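
For repeatable deployments, the console settings in Step 3 can be captured as a request for the EC2 `CreateVpcEndpoint` API. The Python sketch below only builds the request dictionary and does not call AWS; `ec2_endpoint_request` is a hypothetical helper and all IDs are placeholders. It assumes an interface endpoint for the regional EC2 service (`com.amazonaws.<region>.ec2`), which is what the console creates for the EC2 service.

```python
def ec2_endpoint_request(region, vpc_id, private_subnet_ids, security_group_ids, name):
    """Build keyword arguments mirroring settings 3.1 through 3.6."""
    if not private_subnet_ids:
        raise ValueError("at least one private subnet ID is required")
    return {
        "VpcEndpointType": "Interface",
        "ServiceName": f"com.amazonaws.{region}.ec2",   # 3.2
        "VpcId": vpc_id,                                 # 3.3
        "SubnetIds": list(private_subnet_ids),           # 3.4
        "IpAddressType": "ipv4",                         # 3.5
        "SecurityGroupIds": list(security_group_ids),    # 3.6
        "TagSpecifications": [                           # 3.1
            {
                "ResourceType": "vpc-endpoint",
                "Tags": [{"Key": "Name", "Value": name}],
            }
        ],
    }


# Placeholder IDs for illustration only.
request = ec2_endpoint_request(
    "us-west-2",
    "vpc-0123456789abcdef0",
    ["subnet-aaaa1111", "subnet-bbbb2222"],
    ["sg-default0000", "sg-opcenter000", "sg-worker00000"],
    "mmcloud-ec2-endpoint",
)
print(request["ServiceName"])
```

With `boto3` installed and credentials configured, the dictionary could be passed as `boto3.client("ec2", region_name="us-west-2").create_vpc_endpoint(**request)`; omitting the policy document leaves the default full-access policy in place, matching 3.7.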

If a worker node needs to access Internet resources, you may need to create a NAT gateway. This is not necessary if a worker node only needs to access AWS services.

Below are the IAM roles and policies created when deploying Memory Machine Cloud through AWS Marketplace, via the CloudFormation script. 

There are two types of objects, OpCenter and WorkerNode, and Memory Machine Cloud will create different IAM roles for them.

### For OpCenter
- `license-manager:*` (to query the license in License Manager)
- `pricing:*` (to query cloud resource prices)
- IAM roles (limited to the local AWS account):
  - `iam:GetRole`
  - `iam:AttachRolePolicy`
  - `iam:CreateRole`
  - `iam:PutRolePolicy`
  - `iam:PassRole`
  - `iam:CreateServiceLinkedRole`
- EC2 roles (to allow Memory Machine Cloud to create and control EC2 instances; delete-related actions are limited to resources created by Memory Machine Cloud):
  - `ec2:Get*`
  - `ec2:Describe*`
  - `ec2:RunInstances`
  - `ec2:StartInstances`
  - `ec2:StopInstances`
  - `ec2:Create*`
  - `ec2:Modify*`
  - `ec2:AttachVolume`
  - `ec2:DetachVolume`
  - `ec2:DeleteVolume`
  - `ec2:DeleteSecurityGroup`
  - `ec2:TerminateInstances`
  - `ec2:DeleteSnapshot`
- S3 roles (to allow Memory Machine Cloud to access customer data in the local AWS account's S3 buckets or in public buckets):
  - `s3:DeleteObject`
  - `s3:Put*`
  - `s3:Replicate*`
  - `s3:Restore*`
  - `s3:CreateBucket`
  - `s3:DeleteBucket`
  - `s3:Update*`
  - `s3:List*`
  - `s3:Get*`
  - `s3:Describe*`
- ECR roles (to allow Memory Machine Cloud to access the customer's ECR repositories):
  - `ecr:GetDownloadUrlForLayer`
  - `ecr:BatchGetImage`
  - `ecr:DescribeImages`
  - `ecr:ListImages`
  - `ecr:GetAuthorizationToken`
  - `ecr:BatchCheckLayerAvailability`
- Marketplace roles (to allow Memory Machine Cloud to report metering usage):
  - `aws-marketplace:MeterUsage`
### For WorkerNode

The WorkerNode role is a subset of the OpCenter role.

- IAM roles:
  - `iam:PassRole`
- EC2 roles:
  - `ec2:Get*`
  - `ec2:Describe*`
  - `ec2:StopInstances`
  - `ec2:Create*`
  - `ec2:Modify*`
  - `ec2:AttachVolume`
  - `ec2:DetachVolume`
  - `ec2:DeleteVolume`
  - `ec2:TerminateInstances`
  - `ec2:DeleteSnapshot`
- S3 roles (to allow Memory Machine Cloud to access customer data in the local AWS account's S3 buckets or in public buckets):
  - `s3:DeleteObject`
  - `s3:Put*`
  - `s3:Replicate*`
  - `s3:Restore*`
  - `s3:Update*`
  - `s3:List*`
  - `s3:Get*`
  - `s3:Describe*`
- ECR roles (to allow Memory Machine Cloud to access the customer's ECR repositories):
  - `ecr:GetDownloadUrlForLayer`
  - `ecr:BatchGetImage`
  - `ecr:DescribeImages`
  - `ecr:ListImages`
  - `ecr:GetAuthorizationToken`
  - `ecr:BatchCheckLayerAvailability`
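
To review or audit these permissions, the WorkerNode action list can be expressed as an IAM policy document. The Python sketch below is illustrative only and is not the policy generated by the CloudFormation script: `policy_document` is a hypothetical helper, and `"Resource": "*"` is a simplifying assumption, whereas the real deployment restricts delete-related actions to resources created by Memory Machine Cloud.

```python
import json

# WorkerNode actions as listed above, grouped by service prefix.
WORKER_NODE_ACTIONS = {
    "iam": ["PassRole"],
    "ec2": [
        "Get*", "Describe*", "StopInstances", "Create*", "Modify*",
        "AttachVolume", "DetachVolume", "DeleteVolume",
        "TerminateInstances", "DeleteSnapshot",
    ],
    "s3": [
        "DeleteObject", "Put*", "Replicate*", "Restore*",
        "Update*", "List*", "Get*", "Describe*",
    ],
    "ecr": [
        "GetDownloadUrlForLayer", "BatchGetImage", "DescribeImages",
        "ListImages", "GetAuthorizationToken", "BatchCheckLayerAvailability",
    ],
}


def policy_document(actions_by_service):
    """Build one Allow statement per service from the action lists."""
    statements = [
        {
            "Sid": f"{service.capitalize()}Access",
            "Effect": "Allow",
            "Action": [f"{service}:{action}" for action in actions],
            "Resource": "*",  # assumption: the real policy is scoped more tightly
        }
        for service, actions in actions_by_service.items()
    ]
    return {"Version": "2012-10-17", "Statement": statements}


print(json.dumps(policy_document(WORKER_NODE_ACTIONS), indent=2))
```

Comparing this output with the policy attached to the deployed role is a quick way to spot drift after an upgrade.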

## Overview

This guide will help you deploy a CloudFormation stack using a template that sets up IAM roles, EC2 instances, JuiceFS, and an AWS Batch Compute Environment. The stack has been updated to use JuiceFS with Amazon S3 as the backend for checkpoint storage and as a scratch directory. 

The updated stack also supports multiple job queues and compute environments, catering to diverse customer workloads. Customers can reach out to us for tailored multi-queue setups. This guide provides details on deploying the stack in the us-west-2 region.

If you're looking for an overview of the MM Batch Engine for AWS, [visit this link](https://www.mmcloud.io/resources/docs/mmc-batch-engine-for-aws-overview-2).


## Prerequisites:
Before launching the stack, ensure that you have the following resources created in the us-west-2 region:

1. #### VPC (Virtual Private Cloud)

2. #### Subnet (in the VPC)

3. #### Security Group (SG):

 - Inbound Rules: Allow required traffic for batch processing. Port 6379 for JuiceFS is required. 

 - Outbound Rules: Allow necessary internet access if required.

4. #### EC2 Key Pair:

 - Create an EC2 key pair to SSH into instances.

5. #### GitHub Token:

 - Note: Do not modify the path to the github_token.txt file. This path is predefined and managed automatically by the stack.

6. #### AMI ID:

 - The default AMI ID is set to ami-0d3bb50d3c35f67d4 (us-west-2), but you can use another AMI if required. The accepted AMI IDs provided by AWS for each region are:
    - `us-east-1` (N. Virginia): `ami-09ef698301ad80887`
    - `us-east-2` (Ohio): `ami-0e9a7e80656bb9530`
    - `us-west-1` (N. California): `ami-0156527dd7a8280a3`
    - `us-west-2` (Oregon): `ami-0b1bd1ab8a168b55d`

## Steps to Deploy the Stack:

1. #### Log in to AWS Management Console:
 - Ensure you are in the us-west-2 region.

2. #### Create the Required VPC, Subnet, and Security Group:
 - VPC: If you don’t already have one, create a new VPC, or use the default VPC.

 - Subnet: Ensure the subnet is associated with your VPC and has internet access (if needed).

 - Security Group: Create a security group allowing inbound access on port 6379 for JuiceFS. You can also use the default security group if preferred.

3. #### Create an EC2 Key Pair:
 - Go to the EC2 service, navigate to Key Pairs, and create a new key pair for SSH access to the EC2 instances.

4. #### Gather Your Parameters:
 - VPC ID: You can find this in the VPC dashboard.

 - Subnet ID: In the Subnet dashboard, copy the ID of the subnet associated with your VPC.

 - Security Group ID: Find your security group ID in the EC2 console.

 - EC2 Key Pair Name: The name of the key pair you created.

5. #### Deploy the Stack:
 - Go to the CloudFormation Console and click Create Stack.

 - Choose With new resources (standard).
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022151246516689.png)


 - Under Specify template, choose Amazon S3 URL and paste the provided template URL, as shown below:

 ![](https://baywatch-api.memverge.com/media/upload/202410/241022151311336094.png)


 - Click Next.

6. #### Enter the Parameters:
 - VPC ID: Paste your VPC ID.
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022151555855652.png)


 - Subnet ID: Paste the Subnet ID. (This is only for Compute Resources)
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022151627290980.png)


 - Security Group ID: Paste the Security Group ID.

 ![](https://baywatch-api.memverge.com/media/upload/202410/241022151655003278.png)


 - EC2 Key Pair: Enter the name of your EC2 key pair.
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022151918181017.png)

 - Unique Prefix: A unique prefix for naming resources (e.g., project name or a random string).
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022152505001180.png)


 - AMI ID: Leave as default, or enter a different AMI ID if needed.
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022151955729183.png)


 - Root Volume Size: Optionally adjust the root volume size (default is 30 GiB).
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022152022218096.png)

 - Default Instance Types: Modify or leave the default instance types.
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022152127446604.png)


 - vCPU Settings: You can modify the minimum, maximum, and desired vCPU values to meet your requirements.
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022152155017813.png)

 - Subnet IDs: Provide a list of Subnet IDs for MemoryDB, ensuring they cover at least two Availability Zones. (Multi-select Subnet ID for JuiceFS Redis MemoryDB)
![](https://baywatch-api.memverge.com/media/upload/202501/250113184412461325.png)

 
 - Redis Node Type: Specify the node type for MemoryDB (default is db.t4g.small).
![](https://baywatch-api.memverge.com/media/upload/202501/250113184403593783.png)

 - Enable Multi-Queue: Set to true to enable multiple job queues and compute environments, or false for a default single-queue setup.
![](https://baywatch-api.memverge.com/media/upload/202501/250113184345971393.png)

 - Multi-Queue Instance Types: Provide instance types for each compute environment in the multi-queue setup:
    - Jq1Ce1 Instance Types: Default is `c5.large,c5.xlarge`.
    - Jq1Ce2 Instance Types: Default is `c5.2xlarge`.
    - Jq2Ce1 Instance Types: Default is `m5.large,m5.xlarge,m5.2xlarge`.
    - Jq2Ce2 Instance Types: Default is `m5.4xlarge`.
    - Jq3Ce1 Instance Types: Default is `m5.8xlarge`.
    - Jq3Ce2 Instance Types: Default is `m5.12xlarge`.
![](https://baywatch-api.memverge.com/media/upload/202501/250113184147652483.png)


7. #### Configure Stack Options:
 - When moving to the Next page after entering parameters, you have the option to add tags to the resources. It is recommended to add a tag with:

    - Key: {Tag Key Name}
    - Value: The same value as the Unique Prefix parameter.
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022152604381607.png)


8. #### Review and Deploy:
 - Review your parameters and click Create Stack. CloudFormation will now begin deploying the resources.
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022153137492863.png)
![](https://baywatch-api.memverge.com/media/upload/202410/241022153208072377.png)


9. #### Monitor the Deployment:
 - Check the CloudFormation Events tab to monitor the deployment process. It may take several minutes to complete.

10. #### Access the Resources:
 - Once the stack is complete, navigate to the Batch console to view the resources created by the CloudFormation stack.
 ![](https://baywatch-api.memverge.com/media/upload/202410/241022153339089239.png)
 - Create a job definition and submit it to test the setup.
11. #### Clean up:
 - Before deleting the CloudFormation stack, make sure to empty the S3 buckets created by the stack: `mm-engine-juice-fs-scratch-{UniquePrefix}` and `mm-engine-juice-fs-checkpoint-{UniquePrefix}`.
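
The two bucket names follow directly from the Unique Prefix chosen at deployment, so the clean-up commands can be generated rather than typed. This is a small Python sketch with hypothetical helper names; it only prints `aws s3 rm ... --recursive` commands for review and does not delete anything itself. If versioning is enabled on a bucket, old object versions need to be removed separately.

```python
def stack_bucket_names(unique_prefix):
    """Return the scratch and checkpoint bucket names for a stack prefix."""
    return [
        f"mm-engine-juice-fs-{kind}-{unique_prefix}"
        for kind in ("scratch", "checkpoint")
    ]


def empty_bucket_commands(unique_prefix):
    """Return the AWS CLI commands that empty both buckets."""
    return [
        f"aws s3 rm s3://{name} --recursive"
        for name in stack_bucket_names(unique_prefix)
    ]


# "demo" is a placeholder prefix for illustration.
for command in empty_bucket_commands("demo"):
    print(command)
```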
 
## Running Nextflow Batch Jobs with Your Stack:

After deploying the CloudFormation stack, you can use the following guide to run Nextflow jobs on AWS Batch.

1. Set Up the Nextflow Configuration:
 - To submit Nextflow jobs to AWS Batch, use the provided Nextflow configuration file (aws.config).

 - Replace the queue parameter with your Batch Job Queue created by the CloudFormation stack

 - You'll need to create AWS IAM Access and Secret Keys for an IAM user in the IAM Console.
 Copy and paste these keys in the appropriate place in the config file.

 - You can also update the AWS region if needed (the default is us-west-2).

`aws.config`

    plugins {
        id 'nf-amazon'
    }
    process {
        executor = 'awsbatch'
        queue = 'jq-mm-batch-<UniquePrefix>'
        maxRetries = 5
        memory = '20G'
    }

    process.containerOptions = '--env MMC_CHECKPOINT_DIAGNOSIS=true --env MMC_CHECKPOINT_IMAGE_SUBPATH=nextflow --env MMC_CHECKPOINT_INTERVAL=5m --env MMC_CHECKPOINT_MODE=true --env MMC_CHECKPOINT_IMAGE_PATH=/mmc-checkpoint'

    aws {
        accessKey = '<ACCESS KEY>'
        secretKey = '<SECRET KEY>'
        region = 'us-west-2'
        client {
            maxConnections = 20
            connectionTimeout = 10000
            uploadStorageClass = 'INTELLIGENT_TIERING'
            storageEncryption = 'AES256'
        }
        batch {
            cliPath = '/nextflow_awscli/bin/aws'
            maxTransferAttempts = 3
            delayBetweenAttempts = '5 sec'
        }
    }

2. Explanation of the Environment Variables:
The following --env variables are used to configure checkpointing and image paths for Nextflow:

- `MMC_CHECKPOINT_DIAGNOSIS = true`: Enables checkpoint diagnostics.
- `MMC_CHECKPOINT_INTERVAL = 5m`: Sets the interval for creating checkpoints (every 5 minutes).
- `MMC_CHECKPOINT_MODE = true`: Enables checkpoint mode.
- `MMC_CHECKPOINT_IMAGE_PATH = /mmc-checkpoint`: Defines the path for storing the checkpoint image on EFS.

3. Run Your Pipeline:
You can now run your Nextflow pipeline using the following command:

        nextflow run nf-core/<PIPELINE> -profile test \
            -work-dir 's3://<WORKDIR_BUCKET>' \
            --outdir 's3://<OUTDIR_BUCKET>' \
            -c aws.config

- Replace `<PIPELINE>` with the name of your Nextflow pipeline (e.g., rnaseq).
- Replace `<WORKDIR_BUCKET>` and `<OUTDIR_BUCKET>` with your S3 bucket paths for work and output directories.

By following these steps, you can submit and manage Nextflow jobs efficiently using AWS Batch with checkpointing enabled.
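
The `process.containerOptions` value in `aws.config` is a sequence of `--env NAME=value` pairs, one per checkpoint variable. When the interval or paths need to change per project, generating the string avoids quoting mistakes. The Python sketch below is an illustration only: `container_options` is a hypothetical helper, and the values are the ones used in the configuration above.

```python
# Checkpoint settings copied from the aws.config example above.
CHECKPOINT_ENV = {
    "MMC_CHECKPOINT_DIAGNOSIS": "true",
    "MMC_CHECKPOINT_IMAGE_SUBPATH": "nextflow",
    "MMC_CHECKPOINT_INTERVAL": "5m",
    "MMC_CHECKPOINT_MODE": "true",
    "MMC_CHECKPOINT_IMAGE_PATH": "/mmc-checkpoint",
}


def container_options(env):
    """Join the variables into the string used for process.containerOptions."""
    return " ".join(f"--env {name}={value}" for name, value in env.items())


# Example: keep every setting but checkpoint every 10 minutes instead of 5.
custom = dict(CHECKPOINT_ENV, MMC_CHECKPOINT_INTERVAL="10m")
print(container_options(custom))
```

The printed string goes between the single quotes of `process.containerOptions`.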

## Overview
This document guides you through deploying the updated CloudFormation stack with an enhanced reporting feature. This stack deploys IAM roles, EC2 instances, EFS, and an AWS Batch Compute Environment. While you can still use your default AWS VPC, subnet, and security group, we recommend creating dedicated resources for better isolation and security. For foundational setup instructions, refer to our [Deployment Guide](https://www.mmcloud.io/resources/docs/quick-start-guide-for-deploying-the-cloudformation), which covers the prerequisites and steps for deploying this stack.

## Quick Recap from the Initial Guide:

1.	**Prepare AWS Resources**: Set up the essential AWS resources, including a VPC, subnet, security group, EC2 key pair, and GitHub token for repository access. The guide recommends creating dedicated resources for added security.

2.	**Deploy the CloudFormation Stack**: Log into AWS, input required parameters (e.g., VPC ID, subnet ID, key pair name), and deploy the stack. Using a unique prefix helps to organize resources effectively.

3.	**Monitor and Access Resources**: Track the deployment in the CloudFormation console. Once complete, you can access the AWS Batch environment, configure job definitions, and verify the setup.

4.	**Run Nextflow Jobs (Optional)**: After deployment, Nextflow users can submit jobs to AWS Batch, supporting streamlined batch job management.

The new reporting feature is seamlessly integrated into this deployment process, so no additional setup steps are required.

## **Real-Time Reporting for Job Metrics**

**About the Reporting Feature**

This release introduces a real-time reporting dashboard that offers insights into your batch job metrics. Accessible via a web interface, the dashboard provides a quick overview of:

- **Total Jobs Executed**: The total count of all processed jobs.

- **Spot Reclaim Protection Events**: Tracks successful recovery during spot instance interruptions, minimizing downtime and potential data loss.

- **Total Runtime (in seconds)**: Shows the cumulative execution time of all jobs.

- **Total Time Saved**: Indicates time saved through efficient instance recovery, giving a clear view of performance gains.

## Accessing the Reporting Dashboard

After deploying the stack as shown in the initial guide, follow these steps to access the reporting feature:

1.	Find the **AnalyticsEC2Stack**: Once deployed, the stack will automatically create a nested stack named AnalyticsEC2Stack.

![](https://baywatch-api.memverge.com/media/upload/202411/241108023556786041.png)

2.	Locate the EC2 Instance:

    - In the AnalyticsEC2Stack under the CloudFormation console, navigate to the **Resources** tab.
![](https://baywatch-api.memverge.com/media/upload/202411/241108023637296408.png)

    - Find the EC2 instance ID and click on it to open the EC2 console.

3.	Access the mmspotviewer Instance:

    - Locate the mmspotviewer instance.

    - In the instance details, find the **Public IP address**.

![](https://baywatch-api.memverge.com/media/upload/202411/241108023711531240.png)

4.	Access the Reporting Dashboard:

    •	Open a browser and enter the IP address using HTTP (not HTTPS), followed by :5000 to access the main dashboard.

![](https://baywatch-api.memverge.com/media/upload/202411/241108023801046948.png)
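
As a small illustration of the URL format described in step 4, the sketch below builds the dashboard address from a public IP. The function name `dashboard_url` and the sample address are illustrative assumptions; only the `http://` scheme and port `5000` come from the steps above.

```bash
#!/usr/bin/env bash
# Build the dashboard URL from the mmspotviewer instance's public IP.
# Plain HTTP and port 5000 come from the step above; the function name is hypothetical.
dashboard_url() {
  local public_ip="$1"
  printf 'http://%s:5000\n' "$public_ip"
}

# Example with a documentation-only address (203.0.113.0/24 is reserved for examples).
dashboard_url "203.0.113.10"
# prints: http://203.0.113.10:5000

# Optional reachability check, assuming curl is installed and port 5000 is open to you:
# curl --silent --head --max-time 5 "$(dashboard_url 203.0.113.10)"
```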

With these enhancements, your AWS Batch deployment now includes a streamlined tool for monitoring and optimizing job execution, ensuring efficient workflow performance and resource management.

## Introduction to AWS S3
AWS S3 is a highly scalable, durable, and secure object storage service, making it an ideal choice for managing large-scale workflows like those in Nextflow. Nextflow includes built-in support for AWS S3, allowing seamless integration of S3 buckets into pipeline scripts. Files stored in an S3 bucket can be accessed transparently in your pipeline script, just like any other file in the local file system, enabling efficient data management across cloud and on-premises environments.

---

## Pre-requisites for Using S3 with Nextflow

Before you begin, ensure you meet the following prerequisites:

### 1. A VPC Security Group with an Inbound Rule for Port 22

Ensure that your virtual private cloud (VPC) is properly configured to allow SSH access to the instances running your Nextflow pipelines. This requires a security group with an inbound rule to allow connections on port 22, which is used for SSH.
Navigation: AWS EC2 console -> Network & Security -> Security Groups

![](https://baywatch-api.memverge.com/media/upload/202410/241011235918570380.png)


---

### 2. Create a New S3 Bucket for Storing Nextflow Output and Workdir Files

To store the outputs of your Nextflow pipelines and the intermediate files created in the Nextflow workdir, you will need a dedicated S3 bucket. This S3 bucket will serve as both the working directory and the final storage location, ensuring that all files generated during the pipeline's execution are accessible for future reference or further processing.

To create a new S3 bucket, follow these steps:

1. Navigate to the **AWS S3 Console**.
2. Click on **Create Bucket** to begin the setup process.

![](https://baywatch-api.memverge.com/media/upload/202410/241011000121370564.png)

3. After the bucket is created, you need to create two folders within the bucket: one for the output files and one for the workdir.

![](https://baywatch-api.memverge.com/media/upload/202410/241011011132284977.png)

4. You can then select each folder by checking the box next to it, and click on **Copy S3 URL** to obtain the folder's URL. This URL will be required for configuration in the upcoming `Editing the Configuration File` section.

![](https://baywatch-api.memverge.com/media/upload/202410/241011011024204729.png)
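
If you prefer the command line to the console, the same bucket layout can probably be created with the AWS CLI. The sketch below only prints the commands so you can review them first. The bucket name, region, and the folder names `output/` and `workdir/` are placeholders chosen for illustration, and the helper `bucket_setup_commands` is hypothetical.

```bash
#!/usr/bin/env bash
# Print (do not run) the AWS CLI commands that create a bucket with two "folders".
# S3 has no real directories; a zero-byte object whose key ends in "/" is what
# the console displays as a folder.
bucket_setup_commands() {
  local bucket="$1" region="$2"
  echo "aws s3 mb s3://${bucket} --region ${region}"
  echo "aws s3api put-object --bucket ${bucket} --key output/"
  echo "aws s3api put-object --bucket ${bucket} --key workdir/"
}

# Review the commands, then paste them into a shell with AWS credentials configured.
bucket_setup_commands "my-nextflow-bucket" "us-east-1"
```

The resulting folder URLs would then take the form `s3://my-nextflow-bucket/output/` and `s3://my-nextflow-bucket/workdir/`.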


---

### 3. Prepare All Required Scripts

**1. s3flow_transient_hostInit.sh**
```bash
curl -O https://mmce-data.s3.amazonaws.com/s3flow/v1/s3flow_transient_hostInit.sh
```

**2. s3flow_transient_hostTerminate.sh**
```bash
curl -O https://mmce-data.s3.amazonaws.com/s3flow/v1/s3flow_transient_hostTerminate.sh
```

**3. s3flow_transient_nextflow_submit.sh**
```bash
curl -O https://mmce-data.s3.amazonaws.com/s3flow/v1/s3flow_transient_nextflow_submit.sh
```
It is recommended to keep these three files in the same folder. Ensure that the EC2 Instance or local machine where these files reside has `float` installed.
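
Before moving on, it can help to confirm that all three scripts actually landed in one folder and that `float` is on the `PATH`. The following is a minimal sketch; the helper name `check_s3flow_scripts` is an assumption made for this example and is not part of the downloaded scripts.

```bash
#!/usr/bin/env bash
# Return 0 only if all three transient S3Flow scripts exist in the given directory.
check_s3flow_scripts() {
  local dir="${1:-.}" missing=0 f
  for f in s3flow_transient_hostInit.sh \
           s3flow_transient_hostTerminate.sh \
           s3flow_transient_nextflow_submit.sh; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2   # report each missing script on stderr
      missing=1
    fi
  done
  return "$missing"
}

# Check the current directory and warn (without aborting) if something is absent.
check_s3flow_scripts . || echo "download the missing scripts before continuing" >&2
command -v float >/dev/null 2>&1 || echo "float CLI not found on PATH" >&2
```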

---

## Deployment Steps for Individual Users on MMCloud

### Float Login
Ensure you are using the latest version of the `float` CLI:

```bash
sudo float release sync
```

Login to your MMCloud OpCenter:

```bash
float login -a <opcenter-ip-address> -u <user>
```

After entering your password, verify that you see **Login succeeded!**

---

### Float Secret
Set your AWS credentials as secrets in `float`:

```bash
float secret set AWS_BUCKET_ACCESS_KEY <BUCKET_ACCESS_KEY>
float secret set AWS_BUCKET_SECRET_KEY <BUCKET_SECRET_KEY>
```

To verify the secrets:

```bash
float secret ls
```

---

##### Expected Output:
```bash
+-------------------------+
| NAME |
+-------------------------+
| AWS_BUCKET_ACCESS_KEY |
| AWS_BUCKET_SECRET_KEY |
+-------------------------+
```
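
The visual check above can also be scripted. The sketch below reads the output of `float secret ls` on standard input and reports which of the two expected secret names is absent. It assumes the table layout shown in the expected output; the helper names `has_secret` and `missing_secrets` are hypothetical.

```bash
#!/usr/bin/env bash
# Succeed if the secret NAME appears as a whole word in the listing on stdin.
has_secret() {
  grep -qw -- "$1"
}

# Print every expected secret that is absent; return 1 if any is missing.
missing_secrets() {
  local listing name rc=0
  listing=$(cat)
  for name in AWS_BUCKET_ACCESS_KEY AWS_BUCKET_SECRET_KEY; do
    if ! printf '%s\n' "$listing" | has_secret "$name"; then
      echo "$name"
      rc=1
    fi
  done
  return "$rc"
}

# Demo on rows copied from the expected output above.
printf '%s\n' '| AWS_BUCKET_ACCESS_KEY |' '| AWS_BUCKET_SECRET_KEY |' | missing_secrets \
  && echo "both secrets are set"
# Real use (not run here): float secret ls | missing_secrets
```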

---


## Configure Two Scripts


## 1. Modifying `s3flow_transient_hostTerminate.sh`

```bash
# ------------------------------------------------------------------
# ---- vvv Set root dir of nextflow cache dir location here vvv ----
# ------------------------------------------------------------------
# Should match the dir in the s3flow_transient_nextflow_submit.sh script
nextflow_dir="s3://your-bucket-here/subdir"
```

## 2. Modifying `mmc.config` in `s3flow_transient_nextflow_submit.sh`


###### Configuration File Content:

```bash
# This S3 URI is the parent dir location of where you want to save
# your .nextflow.log and .nextflow/ folder
# Should match the location in the s3flow_transient_hostTerminate.sh script
nextflow_dir="s3://your-bucket-here/subdir"



cat > mmc.config << EOF
plugins {
 id 'nf-float'
}

workDir = 's3://<your_s3_bucket_name>/<your_workDir_folder_name>/'

process {
 executor = 'float'
 errorStrategy = 'retry'

 /*
 If users would like to enable float storage function, specify like this
 extra = '--storage <S3_bucket_name>'
 */
 
 /*
 If extra disk space needed, specify like this
 disk = '200 GB'
 */

 extra ='$FLOAT_VMPOLICY_OPT'
 /*
 For some special tasks like Qualimap which generates very small IO request, using this -o writeback_cache can help with performance. Here's an example.
 withName: "QUALIMAP_RNASEQ" {
 extra ='$FLOAT_VMPOLICY_OPT'
 }
 */
}

podman.registry = 'quay.io'

float {
 address = '<your_opcenter_ip>'
 username = '<your_user_name>'
 password = '<your_password>'
}

// AWS access info if needed
aws {
 client {
 endpoint = 'https://s3.<your_bucket_region>.amazonaws.com'
 maxConnections = 20
 connectionTimeout = 300000
 }
 accessKey = '$(get_secret AWS_BUCKET_ACCESS_KEY)'
 secretKey = '$(get_secret AWS_BUCKET_SECRET_KEY)'
}
EOF
```

- **Note**:
 1. Replace the value of `workDir` with the S3 URL of the workDir you copied earlier.
 2. Remember to add `endpoint = 'https://s3.<your_bucket_region>.amazonaws.com'` under `client`.
 3. If you are providing a bucket in `us-east-1`, update the endpoint in your config file like so:

```bash
aws {
 client {
 endpoint = 'https://s3.us-east-1.amazonaws.com'
 }
}
```
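
Both scripts carry a comment saying their `nextflow_dir` values should match. A quick way to confirm this before submitting is sketched below. It assumes each script sets `nextflow_dir` on a single line of the form `nextflow_dir="..."`, as in the excerpts above; the helper names are hypothetical.

```bash
#!/usr/bin/env bash
# Print the value of the first nextflow_dir= assignment found in a script.
get_nextflow_dir() {
  local line
  line=$(grep -m1 -E '^[[:space:]]*nextflow_dir=' "$1") || return 1
  line=${line#*=}     # drop everything up to and including the first '='
  line=${line%\"}     # drop a trailing double quote, if present
  line=${line#\"}     # drop a leading double quote, if present
  printf '%s\n' "$line"
}

# Succeed only if both scripts define the same nextflow_dir.
same_nextflow_dir() {
  local a b
  a=$(get_nextflow_dir "$1") || return 1
  b=$(get_nextflow_dir "$2") || return 1
  [ "$a" = "$b" ]
}

# Usage (run from the folder that holds the scripts):
# same_nextflow_dir s3flow_transient_hostTerminate.sh s3flow_transient_nextflow_submit.sh \
#   || echo "nextflow_dir differs between the two scripts" >&2
```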

---

## 3. Copy the Sample Sheet (as Needed) and Set the Nextflow Command in `s3flow_transient_nextflow_submit.sh`

1. **Data Preparation:**
 In this section, you will copy essential files (such as your sample sheet or parameters) from your S3 bucket to the working directory on the transient node as needed.

 **Instructions**:
 - Uncomment and modify the following lines based on the location of your input files:
 ```bash
 aws s3 cp s3://your-s3-bucket/samplesheet.csv .
 aws s3 cp s3://your-s3-bucket/scripts/params.yml .
 ```
 - Replace `your-s3-bucket` with the name of your actual S3 bucket.
 - Ensure that the paths match the files you are using for your pipeline run.

2. **Nextflow Command Setup:**
 The `nextflow_command` variable contains the command that will execute your Nextflow pipeline. You need to modify this command to fit your specific pipeline and output settings.

 **Instructions**:
 - Replace `s3://your-s3-bucket/outputDir/` with the S3 URL of the output folder you created for the pipeline's output.
 ```bash
 nextflow_command="nextflow run nf-core/<pipeline> -profile test -c mmc.config --outdir s3://your-s3-bucket/outputDir/"
 ```
 - You can also modify the `-profile` and other pipeline-specific parameters depending on the pipeline you are running.

---


## Deploy Nextflow Head Node

Deploy the Nextflow head node using the **juiceflow:v2** template:

```bash
float submit \
-c 2 -m 4 \
-i juiceflow:v2 \
--storage <S3_bucket_name_of_user_input_data> \
--vmPolicy '[onDemand=true]' \
--migratePolicy '[disable=true]' \
--securityGroup sg-00XXXXXXXXXXX \
-j s3flow_transient_nextflow_submit.sh \
--dirMap /transient_s3flow/nextflow:/transient_s3flow/nextflow \
--hostTerminate s3flow_transient_hostTerminate.sh \
--hostInit s3flow_transient_hostInit.sh
```

**Note**:
1. Replace the `--securityGroup` value (`sg-00XXXXXXXXXXX`) with the ID of your own security group.
2. The **juiceflow:v2** template comes pre-configured with S3 setup.
3. `-c 2 -m 4` specifies the head node’s CPU and memory configuration. Below, you will also see an example of modifying the **nextflow:jfs** template using the `--overwriteTemplate "*" -c 8 -m 32` option to change the head node’s CPU and memory settings.
4. Using `--storage` is not mandatory, but it can be beneficial for certain users. For more details, please refer to the FAQ section of this tutorial.
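
A common slip is submitting with the placeholder security group still in place. The sketch below checks that a value looks like a real security group ID before you run `float submit`. It assumes the usual AWS format of `sg-` followed by 8 or 17 lowercase hexadecimal characters; the function name `is_security_group_id` is hypothetical.

```bash
#!/usr/bin/env bash
# Succeed if the argument looks like an AWS security group ID
# (assumed format: "sg-" plus 8 or 17 lowercase hex characters).
is_security_group_id() {
  [[ "$1" =~ ^sg-([0-9a-f]{8}|[0-9a-f]{17})$ ]]
}

# The placeholder from the command above is rejected; a well-formed ID passes.
for sg in "sg-00XXXXXXXXXXX" "sg-0123456789abcdef0"; do
  if is_security_group_id "$sg"; then
    echo "$sg looks valid"
  else
    echo "$sg is not a valid security group ID"
  fi
done
```

This only checks the shape of the ID, not whether the group exists or allows inbound traffic on port 22.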


---



### Checking Head Node Deployment Status

```bash
float list -f 'status=executing'
```

#### Example Output:
```bash
+-----------------------+--------------------------+----------------------------------+-------+-----------+----------+----------------------+------------+
| ID | NAME | WORKING HOST | USER | STATUS | DURATION | SUBMIT TIME | COST |
+-----------------------+--------------------------+----------------------------------+-------+-----------+----------+----------------------+------------+
| 1cc0j755719d8xdewyidt | juiceflow-t3.medium | 54.82.58.176 (2Core4GB/OnDemand) | admin | Executing | 41m52s | 2024-10-14T20:05:34Z | 0.0296 USD |
+-----------------------+--------------------------+----------------------------------+-------+-----------+----------+----------------------+------------+
```

---



## FAQ

Q: **What does the `--storage` option do?**

A: In the past, if `--dataVolume` or `--storage` wasn't used, and a user provided input files (e.g., FASTQ) using S3 URLs, Nextflow would first have to copy these files into the `workDir` so that the Nextflow process could access them. This step, called **staging**, had a significant downside: even if only a small part of a file in the S3 bucket changed, Nextflow would still need to download the entire file again into the `workDir` for each process.

With the introduction of `--dataVolume` and `--storage`, this staging process is no longer necessary. These options allow Nextflow to directly access files in the S3 file system, eliminating redundant file transfers. This approach uses the open-source `s3fs` solution, which enables seamless interaction with S3 as if it were a file system. You can find more details here: [s3fs GitHub](https://github.com/s3fs-fuse/s3fs-fuse).

Additionally, the `--storage` option was introduced to simplify the use of `--dataVolume`. Previously, you had to manually provide AWS credentials when using `--dataVolume`, but with `--storage`, that’s no longer required.

---

## Introduction to AWS S3
AWS S3 is a highly scalable, durable, and secure object storage service, making it an ideal choice for managing large-scale workflows like those in Nextflow. Nextflow includes built-in support for AWS S3, allowing seamless integration of S3 buckets into pipeline scripts. Files stored in an S3 bucket can be accessed transparently in your pipeline script, just like any other file in the local file system, enabling efficient data management across cloud and on-premises environments.

---

## Pre-requisites for Using S3 with Nextflow

Before you begin, ensure you meet the following prerequisites:

### 1. A VPC Security Group with an Inbound Rule for Port 22

Ensure that your virtual private cloud (VPC) is properly configured to allow SSH access to the instances running your Nextflow pipelines. This requires a security group with an inbound rule to allow connections on port 22, which is used for SSH.
Navigation: AWS EC2 console -> Network & Security -> Security Groups

![](https://baywatch-api.memverge.com/media/upload/202410/241011235918570380.png)


---

### 2. Create a New S3 Bucket for Storing Nextflow Output and Workdir Files

To store the outputs of your Nextflow pipelines and the intermediate files created in the Nextflow workdir, you will need a dedicated S3 bucket. This S3 bucket will serve as both the working directory and the final storage location, ensuring that all files generated during the pipeline's execution are accessible for future reference or further processing.

To create a new S3 bucket, follow these steps:

1. Navigate to the **AWS S3 Console**.
2. Click on **Create Bucket** to begin the setup process.

![](https://baywatch-api.memverge.com/media/upload/202410/241011000121370564.png)

3. After the bucket is created, you need to create two folders within the bucket: one for the output files and one for the workdir.

![](https://baywatch-api.memverge.com/media/upload/202410/241011011132284977.png)

4. You can then select each folder by checking the box next to it, and click on **Copy S3 URL** to obtain the folder's URL. This URL will be required for configuration in the upcoming `Editing the Configuration File` section.

![](https://baywatch-api.memverge.com/media/upload/202410/241011011024204729.png)



---

### 3. Prepare All Required Scripts

1. **s3flow_persistent_hostinit.sh**
```bash
curl -O https://mmce-data.s3.amazonaws.com/s3flow/v1/s3flow_persistent_hostinit.sh
```
2. **keep_alive.sh**
The content of `keep_alive.sh` is as follows:
```bash
#!/bin/bash

# Keep the script running indefinitely
while true; do
 sleep infinity # This ensures the script never stops
done
```
It is recommended to keep these two files in the same folder. Ensure that the EC2 Instance or local machine where these files reside has `float` installed.

---

## Deployment Steps for Individual Users on MMCloud

### Float Login
Ensure you are using the latest version of the `float` CLI:

```bash
sudo float release sync
```

Login to your MMCloud OpCenter:

```bash
float login -a <opcenter-ip-address> -u <user>
```

After entering your password, verify that you see **Login succeeded!**

---

### Float Secret
Set your AWS credentials as secrets in `float`:

```bash
float secret set AWS_BUCKET_ACCESS_KEY <BUCKET_ACCESS_KEY>
float secret set AWS_BUCKET_SECRET_KEY <BUCKET_SECRET_KEY>
```

To verify the secrets:

```bash
float secret ls
```

---

##### Expected Output:
```bash
+-------------------------+
| NAME |
+-------------------------+
| AWS_BUCKET_ACCESS_KEY |
| AWS_BUCKET_SECRET_KEY |
+-------------------------+
```

---

### Deploy Nextflow Head Node

Deploy the Nextflow head node using the **nextflow:jfs** template:

```bash
float submit -i nextflow:jfs \
--hostInit s3flow_persistent_hostinit.sh \
--storage <S3_bucket_name_of_user_input_data> \
--vmPolicy '[onDemand=true]' \
--migratePolicy '[disable=true]' \
--securityGroup sg-XXXXXXXX \
-c 2 -m 4 \
-n <head-node-name> \
-j keep_alive.sh
```

**Note**:
1. Replace `<head-node-name>` and the `--securityGroup` value (`sg-XXXXXXXX`) with your specific details. We recommend starting `<head-node-name>` with `S3FLOW_PERSISTENT_HEAD_XXX` so that the job is easy to identify.
2. The **nextflow:jfs** template comes pre-configured with S3 setup.
3. `-c 2 -m 4` specifies the head node’s CPU and memory configuration. Below, you will also see an example of modifying the **nextflow:jfs** template using the `--overwriteTemplate "*" -c 8 -m 32` option to change the head node’s CPU and memory settings.
4. Using `--storage` is not mandatory, but it can be beneficial for certain users. For more details, please refer to the FAQ section of this tutorial.


---

### Overriding Template Defaults if Needed

#### Customizing CPU and Memory
To override default CPU and memory settings:

```bash
--overwriteTemplate "*" -c <number-of-cpus> -m <memory-in-gb>
```

**Example**: To set 8 CPUs and 32GB memory:

```bash
--overwriteTemplate "*" -c 8 -m 32
```

---

#### Specifying a Subnet
For deploying in a specific AWS **subnet**:

```bash
--overwriteTemplate "*" --subnet <SUBNET-ID>
```

---

#### Mounting S3 Buckets as Data Volumes
You can mount the input data bucket using S3FS as a data volume on the Nextflow head node and worker nodes as follows:

```bash
--dataVolume [mode=r,accesskey=xxx,secret=xxx,endpoint=s3.REGION.amazonaws.com]s3://BUCKET_NAME:/staged-files
```
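
To avoid hand-editing the bracketed option string, the argument can be assembled from variables. The sketch below follows the format shown above and reads the keys from the environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. Using those variable names is an assumption of this example, and the function `data_volume_arg` is hypothetical.

```bash
#!/usr/bin/env bash
# Assemble a read-only --dataVolume value for an S3 bucket.
# Keys are read from the environment; the call aborts if either is unset.
data_volume_arg() {
  local bucket="$1" region="$2" mount="$3"
  printf '[mode=r,accesskey=%s,secret=%s,endpoint=s3.%s.amazonaws.com]s3://%s:%s\n' \
    "${AWS_ACCESS_KEY_ID:?set AWS_ACCESS_KEY_ID first}" \
    "${AWS_SECRET_ACCESS_KEY:?set AWS_SECRET_ACCESS_KEY first}" \
    "$region" "$bucket" "$mount"
}

# Demo with obviously fake credentials.
AWS_ACCESS_KEY_ID="AKIAEXAMPLE" AWS_SECRET_ACCESS_KEY="example-secret" \
  data_volume_arg "my-input-bucket" "us-east-1" "/staged-files"
```

The printed string can then be passed as `--dataVolume "$(data_volume_arg ...)"`. Keep in mind that keys passed on a command line may be visible in shell history and process listings.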

---

#### Incremental Snapshot Feature (From v2.4)
Enables faster checkpointing but requires more storage:

```bash
--overwriteTemplate "*" --dumpMode incremental
```

---

### Checking Head Node Deployment Status

```bash
float list -f 'status=executing'
```

#### Example Output:
```bash
+-----------------------+--------------------------+-------------------------------+-------+-----------+----------+----------------------+------------+
| ID | NAME | WORKING HOST | USER | STATUS | DURATION | SUBMIT TIME | COST |
+-----------------------+--------------------------+-------------------------------+-------+-----------+----------+----------------------+------------+
| n0ez2czrqmw2kp67tstmk | S3FLOW_PERSISTENT_HEAD_1 | 3.80.52.8 (2Core4GB/OnDemand) | admin | Executing | 29m51s | 2024-10-10T22:31:21Z | 0.0210 USD |
+-----------------------+--------------------------+-------------------------------+-------+-----------+----------+----------------------+------------+
```

---

### Get SSH key via float command

1. **Locate** the public IP address of the head node in the **Working Host** column.
2. **Retrieve** the SSH key from Float's secret manager:

```bash
float secret get <job-id>_SSHKEY > <head-node-name>-ssh.key
```
 See the screenshot below as an example:
![](https://baywatch-api.memverge.com/media/upload/202410/241011001730894074.png)


- **Note**: If you encounter a `Resource not found` error, wait a few more minutes for the head node and SSH key to initialize.

3. **Set the appropriate permissions** for the SSH key:

```bash
chmod 600 <head-node-name>-ssh.key
```
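
The lookup in step 1 can be scripted if you fetch keys often. The sketch below extracts the IP from the **Working Host** column for a given job name. It assumes the table layout from the example output above (the job name in the second column, and the IP as the first word of the third); the function name `head_node_ip` is hypothetical.

```bash
#!/usr/bin/env bash
# Print the public IP of a head node, given its job name, from `float list` output on stdin.
head_node_ip() {
  local name="$1"
  awk -F'|' -v name="$name" '
    {
      job = $3;  gsub(/^[ \t]+|[ \t]+$/, "", job)    # trim the NAME column
      host = $4; gsub(/^[ \t]+|[ \t]+$/, "", host)   # trim the WORKING HOST column
      if (job == name) { split(host, parts, " "); print parts[1]; exit }
    }'
}

# Demo on a shortened row copied from the example output above.
head_node_ip "S3FLOW_PERSISTENT_HEAD_1" <<'EOF'
| ID | NAME | WORKING HOST | USER | STATUS |
| n0ez2czrqmw2kp67tstmk | S3FLOW_PERSISTENT_HEAD_1 | 3.80.52.8 (2Core4GB/OnDemand) | admin | Executing |
EOF
# prints: 3.80.52.8

# Real use (not run here): float list -f 'status=executing' | head_node_ip "<head-node-name>"
```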

---

### SSH to S3Flow Persistent Head Node

SSH into the head node using the provided SSH key, username, and the head node's public IP address:

```bash
ssh -i <head-node-name>-ssh.key nextflow@<head-node-public-ip-address>
```
 See the screenshot below as an example:
![](https://baywatch-api.memverge.com/media/upload/202410/241011002206712049.png)

- **Note**: Use the username `nextflow` to log in as an admin.

---






### MMC NF-Float Configuration

#### Editing the Configuration File

1. **Copy the template and edit the configuration file**:

```bash
cp mmcloud.config.template mmc-s3flow.config
vi mmc-s3flow.config
```

- **Note**: If you are new to using `vi`, check out this [Beginner's Guide to Vi](https://www.howtogeek.com/102468/a-beginners-guide-to-editing-text-files-with-vi/) for basic instructions.


**If you're not comfortable using vi to modify config files, you can follow the "How to Use the SSH Client in VSCode" section at the end of this tutorial to SSH into the head node and edit the file there.**


2. The **mmc-s3flow.config** file copied from `mmcloud.config.template` is pre-filled with the OpCenter IP address and the **PRIVATE** IP address of the Nextflow head node. You only need to provide your OpCenter username and password, plus your AWS access key, secret key, and region.



###### Configuration File Content:

```bash
plugins {
 id 'nf-float'
}

workDir = 's3://<your_s3_bucket_name>/<your_workDir_folder_name>/'

process {
 executor = 'float'
 errorStrategy = 'retry'
 
 /*
 If users would like to enable float storage function, specify like this
 extra = '--storage <S3_bucket_name>'
 */


 /*
 If extra disk space needed, specify like this
 disk = '200 GB'
 */

 extra = ''
 /*
 For some special tasks like Qualimap, which generates very small IO requests, using this -o writeback_cache can help with performance. Here's an example:
 withName: "QUALIMAP_RNASEQ" {
 extra = ''
 }
 */
}

podman.registry = 'quay.io'

float {
 address = '<your_opcenter_ip>'
 username = '<your_user_name>'
 password = '<your_password>'
}

// AWS access info if needed
aws {
 client {
 endpoint = 'https://s3.<your_bucket_region>.amazonaws.com'
 maxConnections = 20
 connectionTimeout = 300000
 }
 accessKey = '<bucket_access_key>'
 secretKey = '<bucket_secret_key>'
 region = '<bucket_region>'
}
```


**Note**:
1. Replace the value of `workDir` with the S3 URL of the workDir you copied earlier.
2. Remember to add `endpoint = 'https://s3.<your_bucket_region>.amazonaws.com'` under `client`.
3. If you are providing a bucket in `us-east-1`, update the endpoint in your config file like so:
 ```bash
 aws {
 client {
 endpoint = 'https://s3.us-east-1.amazonaws.com'
 }
 }
 ```
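
Because the endpoint and the `region` value must agree, one option is to generate that part of the config from a single region variable. The sketch below prints the fragment for pasting into `mmc-s3flow.config`; the function name `aws_client_block` is hypothetical, and the fragment deliberately omits the keys and connection settings shown earlier.

```bash
#!/usr/bin/env bash
# Print the endpoint and region settings of the aws { } block for one region.
aws_client_block() {
  local region="$1"
  cat <<EOF
aws {
  client {
    endpoint = 'https://s3.${region}.amazonaws.com'
  }
  region = '${region}'
}
EOF
}

# Example: settings for a bucket in us-east-1.
aws_client_block "us-east-1"
```
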
---



## Using Tmux

Start a tmux session named `nextflow`:

```bash
tmux new -s nextflow
```

To attach to an existing tmux session:

```bash
tmux attach -t nextflow
```

- **Tip**: If you are new to `tmux`, here is a handy [Tmux Cheat Sheet](https://tmuxcheatsheet.com/).

---

## Nextflow Version Check

Check the Nextflow version and update if necessary:

```bash
nextflow -v
```

### Example Output:
```bash
nextflow version 24.04.4.5917
```

---

## Launch Nextflow

Launch a Nextflow or `nf-core/<pipeline>` by providing the MMC config file:

```bash
nextflow run nf-core/<pipeline> \
 -profile test \
 -c mmc-s3flow.config \
 --outdir s3://nextflow-work-dir/<pipeline>
```
- **Note**: Replace the value of `--outdir` with the S3 URL of the output folder you copied earlier.

---

## Head Node Management

In this persistent head node setup, the user is responsible for disposing of the head node. To cancel the head node job, click the cancel button for the job as shown below.
![](https://baywatch-api.memverge.com/media/upload/202410/241011004353236205.png)

![](https://baywatch-api.memverge.com/media/upload/202410/241011004434965460.png)



---

## How to Use the SSH Client in VSCode

### 1 Install the Remote-SSH Plugin

To use the SSH client in VSCode, you first need to install the "Remote-SSH" plugin. Click the **Extensions** icon on the left sidebar of the VSCode interface, type **Remote-SSH** in the search bar, and press Enter. The plugin will appear in the search results. Click **Install** and wait for the installation to complete.

**Figure 1:** Search for the Remote-SSH plugin in the Extensions marketplace 
![](https://baywatch-api.memverge.com/media/upload/202410/241015173639072822.png)



### 2 Add a Remote Host

Once the plugin is installed, click the **Remote Explorer** icon on the left side of VSCode. In the **SSH** section, click the **+** icon to add a new host. A host refers to the SSH server you want to connect to.

**Figure 2:** Click the "+" to add a new host 
![](https://baywatch-api.memverge.com/media/upload/202410/241015173918463415.png)



In the input field that appears, enter the host details in the format `ssh user@IP_address -A`. For example, if the SSH server's IP address is `90.80.52.8` and the username is `nextflow`, you would enter:

```bash
ssh nextflow@90.80.52.8 -A
```

**Figure 3:** Enter the host details 
![](https://baywatch-api.memverge.com/media/upload/202410/241011213256276935.png)


Next, you'll be prompted to choose a path to save the SSH configuration. If the specified file doesn't exist, VSCode will create a new one. If the file already exists, the new host information will be added to the beginning of the file. You can select any location for the config file, but ensure you have read and write permissions for that path. Typically, the file path would be something like `C:\Users\username\.ssh\config`.

**Figure 4:** Select the appropriate config file 
![](https://baywatch-api.memverge.com/media/upload/202410/241011213335363568.png)


After selecting the config file, you'll see a notification at the bottom right that says "Host added." Click **Open Config** to review the configuration file.

**Figure 5:** Review the config file 
![](https://baywatch-api.memverge.com/media/upload/202410/241011213345579938.png)


In the config file, you will see the following structure:

```bash
Host <host_name>
 HostName <host_ip>
 User <username>
 ForwardAgent yes
 IdentityFile "/Users/speri/memverge/s3flow/cgtpk5fhwpd3n1jva2xka_SSHKEY"
```

**Note**: The `IdentityFile` is the `<head-node-name>-ssh.key` file you generated earlier with `float secret get <job-id>_SSHKEY > <head-node-name>-ssh.key`.

Ensure the details are correct. Once confirmed, you can close the file. At this point, the host has been successfully added.

**Figure 6:** Verify the host details 
![](https://baywatch-api.memverge.com/media/upload/202410/241011213734902248.png)
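
If you would rather write the entry yourself than go through the dialog, the same structure can be generated from the command line. This is a sketch only: the function name `ssh_config_entry`, the host alias, and the paths are illustrative assumptions, while the `nextflow` user and the `ForwardAgent yes` setting mirror the values used in this tutorial.

```bash
#!/usr/bin/env bash
# Print an SSH config entry for the head node.
ssh_config_entry() {
  local host_alias="$1" ip="$2" key="$3"
  cat <<EOF
Host ${host_alias}
  HostName ${ip}
  User nextflow
  ForwardAgent yes
  IdentityFile "${key}"
EOF
}

# Print an entry for review; append it to your config only once it looks right, e.g.
#   ssh_config_entry s3flow-head <head-node-public-ip> ~/<head-node-name>-ssh.key >> ~/.ssh/config
ssh_config_entry "s3flow-head" "203.0.113.10" "$HOME/s3flow-head-ssh.key"
```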



### 3 Connect to the Remote Host

To connect to the remote host, click the green **><** icon located at the bottom left of the VSCode window. This will open the remote connection window.

**Figure 7:** Open the remote connection window 
![](https://baywatch-api.memverge.com/media/upload/202410/241011213932439422.png)
After clicking, you will see the screen below:
![](https://baywatch-api.memverge.com/media/upload/202410/241011213939403800.png)




### 4 Open and Edit Files Remotely

Once connected, you can browse the remote server's file system in VSCode. Open files and edit them directly as if they were on your local machine.

Select the host you just added from the list, and VSCode will establish an SSH connection to the remote server. 
![](https://baywatch-api.memverge.com/media/upload/202410/241015174147291457.png)


Now you can easily edit your files and templates using VSCode's remote editing features.
![](https://baywatch-api.memverge.com/media/upload/202410/241011214005331358.png)

---



## **Creating Job Templates to Launch via MMCloud GUI**

Job Templates allow you to streamline and customize runs that follow a similar format, without the need to manually configure a command for each execution. To create a job template, you must first submit a job, which you will later use to save as a template.

### **Steps to Create a Job Template**:

1. **Navigate to the Jobs Dashboard**:
 - After submitting a job (such as the head node job in this case), go to the **Jobs** section from the MMCloud GUI dashboard.

2. **Select the Head Node Job**:
 - In the Jobs dashboard, locate and select the head node job that you previously submitted.

3. **Save as Template**:
 - Click on **More Actions** (located in the top-right corner), and then choose **Save as Template** from the dropdown menu.
![](https://baywatch-api.memverge.com/media/upload/202410/241015184619518105.png)
 

4. **Provide Template Information**:
 - In the **Save Job as Template** dialog box, provide a name for your template under **Template Name**.
 - Assign an appropriate **Tag** to easily identify the template later on (e.g., "S3Flow_persistent").
 - You can optionally check the box to overwrite an existing template with the same name.
 - Once done, click **Save**.

![](https://baywatch-api.memverge.com/media/upload/202410/241015185316947854.png)

5. **Accessing the Saved Template**:
 - Navigate to the **Job Templates** section from the left-side menu.
 - Switch to the **Private** templates tab to view your saved template.
 - You should now see the newly saved template listed, along with its status and other details.
 
![](https://baywatch-api.memverge.com/media/upload/202410/241015185536791787.png)

6. **Using the Job Template**:
 - Once the template is saved, you can use it to launch new jobs with the same configuration, saving time by avoiding repetitive setups. Simply select the template from the Job Templates dashboard and initiate the job run.

![](https://baywatch-api.memverge.com/media/upload/202410/241015190532253902.png)


---












## FAQ

Q: **What does the `--storage` option do?**

A: In the past, if `--dataVolume` or `--storage` wasn't used, and a user provided input files (e.g., FASTQ) using S3 URLs, Nextflow would first have to copy these files into the `workDir` so that the Nextflow process could access them. This step, called **staging**, had a significant downside: even if only a small part of a file in the S3 bucket changed, Nextflow would still need to download the entire file again into the `workDir` for each process.

With the introduction of `--dataVolume` and `--storage`, this staging process is no longer necessary. These options allow Nextflow to directly access files in the S3 file system, eliminating redundant file transfers. This approach uses the open-source `s3fs` solution, which enables seamless interaction with S3 as if it were a file system. You can find more details here: [s3fs GitHub](https://github.com/s3fs-fuse/s3fs-fuse).

Additionally, the `--storage` option was introduced to simplify the use of `--dataVolume`. Previously, you had to manually provide AWS credentials when using `--dataVolume`, but with `--storage`, that’s no longer required.

---

* MMCloud's Juiceflow solution provides a high-performance method for running Nextflow pipelines on the cloud by leveraging JuiceFS to optimize cloud storage for the work directory.

> Read more about Juiceflow in a blog [here](https://www.mmcloud.io/blog/juiceflow-a-next-generation-solution-for-nextflow)

* [JuiceFS](https://juicefs.com/docs/community/introduction/) is a general-purpose, distributed file system compatible with any application. In the current MMCloud release, JuiceFS can only be used with a Nextflow host deployed using the OpCenter's built-in Nextflow job template.

* While cloud storage formatted with JuiceFS offers high performance for work directories, input data often needs to be staged in the work directory. 

* Nextflow natively supports staging or even streaming inputs from S3 to the executors of the respective process steps. By defining the input S3 location as a Channel, the Nextflow framework ensures the data is available for the task.

* However, with a large number of files, staging can become a bottleneck as the files must first be downloaded to the stage subdirectory in the work directory and then made available to the worker nodes for execution.

* To avoid this staging bottleneck, input data in cloud storage can be mounted as data volumes, making the files available locally and bypassing the staging process.

* Essentially, you can register storage and mount the storage bucket using S3FS as a data volume on the Nextflow head node and worker nodes as `--storage <storage-name>`

---

## Steps:

> See the official [Juiceflow Quick Guide](https://www.mmcloud.io/resources/docs/juiceflow-aws) for an introduction to running Nextflow pipelines on MMCloud

### Register Storage

> Available from float `v3.0.0-69ce0c9-Imperia` onwards

* In the OpCenter left navigation bar, click on `Storage`

![](https://baywatch-api.memverge.com/media/upload/202408/240812050212519294.png)

* Click on `Register Storage` button

![](https://baywatch-api.memverge.com/media/upload/202408/240812050106954736.png)


* Select `Storage Type`

| Storage Type |
| - |
| NFS |
| Lustre |
| S3 |
| OSS |
| GS |

![](https://baywatch-api.memverge.com/media/upload/202408/240812050418968825.png)

* Provide a `name` for the storage, `S3 URI`, `endpoint`, `access key`, `secret key`, `mount-point` of the bucket and choose `Access-Mode` - `Read` only or `Read Write`

![](https://baywatch-api.memverge.com/media/upload/202408/240812051142317705.png)

### Using storage arguments in JuiceFlow

1. In your job script, add the input bucket as `--storage <storage-name>` in the `process.extra` section of `mmc.config`:

```
process {
 executor = 'float'
 errorStrategy = 'retry'
 extra = '--dataVolume [opts=" --cache-dir /mnt/jfs_cache "]jfs://${jfs_private_ip}:6868/1:/mnt/jfs --dataVolume [size=120]:/mnt/jfs_cache --storage <storage-name-1> --storage <storage-name-2>'
}
```

2. Modify your sample sheet to read from the mount point you defined when registering the storage (shown here as `/storage-mount-point`):

```
group_id,subject_id,sample_id,sample_type,sequence_type,filetype,filepath
COLO829_Full,COLO829,COLO829T,tumor,dna,bam,/storage-mount-point/COLO829v003T.bam
COLO829_Full,COLO829,COLO829T,tumor,dna,bai,/storage-mount-point/COLO829v003T.bam.bai
COLO829_Full,COLO829,COLO829R,normal,dna,bam,/storage-mount-point/COLO829v003R.bam
COLO829_Full,COLO829,COLO829R,normal,dna,bai,/storage-mount-point/COLO829v003R.bam.bai
```

3. For the head node, add the `--storage <storage-name>` option to the `float submit` command:

```
float submit \
--hostInit transient_JFS_AWS.sh \
--hostTerminate hostTerminate_AWS.sh \
-i docker.io/memverge/juiceflow \
--vmPolicy '[onDemand=true]' \
--migratePolicy '[disable=true]' \
--dataVolume '[size=60]:/mnt/jfs_cache' \
--storage <storage-name-1> \
--storage <storage-name-2> \
--dirMap /mnt/jfs:/mnt/jfs \
-c 2 -m 4 \
-n <job-name> \
--securityGroup <security-group> \
--env BUCKET=https://<work-bucket>.s3.<region>.amazonaws.com \
-j job_submit_AWS.sh
```

This setup ensures efficient data handling: input data is read from the input buckets through S3FS as a staged data volume, while the work buckets use JuiceFS for high performance.
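
If an existing sample sheet still references inputs by S3 URI, step 2 amounts to a prefix substitution. The sketch below is illustrative only: the bucket name `s3://my-input-bucket` is hypothetical, and it assumes the bucket root is what appears under the mount point chosen when registering storage.

```shell
# Hypothetical values: replace with your bucket URI and registered mount point.
S3_PREFIX="s3://my-input-bucket"
MOUNT_POINT="/storage-mount-point"

# Rewrite every occurrence of the S3 prefix to the mount point.
# '|' is the sed delimiter because both values contain '/'.
rewrite_samplesheet() {
  sed "s|${S3_PREFIX}/|${MOUNT_POINT}/|g" "$1"
}

# Example input: a one-row sheet that still points at the bucket.
printf '%s\n' \
  'group_id,subject_id,sample_id,sample_type,sequence_type,filetype,filepath' \
  'COLO829_Full,COLO829,COLO829T,tumor,dna,bam,s3://my-input-bucket/COLO829v003T.bam' \
  > samplesheet.s3.csv

rewrite_samplesheet samplesheet.s3.csv > samplesheet.csv
cat samplesheet.csv
```

Writing to a new file keeps the original sheet available for comparison before the pipeline is launched.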

---

## Using JuiceFS SnapLocation for better efficiency of checkpoint/restore

* MMC can use either local EBS volumes or JuiceFS-formatted S3 buckets for storing snapshot data

* From the left navigation bar of the OpCenter, click on `System Settings`

![](https://baywatch-api.memverge.com/media/upload/202408/240812052246117983.png)

* In System Settings, click on `Cloud` settings and go to the `Snapshot Location` field


![](https://baywatch-api.memverge.com/media/upload/202408/240812052423317064.png)

* Provide the S3 bucket URL in the following format, making sure to include `accesskey`, `secret`, and `mode=rw`

```
[accesskey=<access>,secret=<secret>,mode=rw]s3://preview-opcenter-jfs-snaplocation
```

> **Note**: The S3 bucket provided as the snapshot location is used to store the snapshot data of all jobs from all users in the OpCenter. The data is cleared automatically after a job finishes.
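
The value for the `Snapshot Location` field can be assembled from its three parts to avoid typos in the bracketed options. This is a minimal sketch: the helper `snap_location` is hypothetical, and it assumes the credentials are exported under the standard AWS environment variable names.

```shell
# Build the Snapshot Location value: [accesskey=...,secret=...,mode=rw]s3://<bucket>
snap_location() {
  printf '[accesskey=%s,secret=%s,mode=rw]s3://%s\n' "$1" "$2" "$3"
}

# Assumption: credentials come from the standard AWS variables;
# the <access>/<secret> fallbacks are placeholders, not working values.
snap_location "${AWS_ACCESS_KEY_ID:-<access>}" "${AWS_SECRET_ACCESS_KEY:-<secret>}" "preview-opcenter-jfs-snaplocation"
```

Paste the printed line into the `Snapshot Location` field.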

This guide explains how to install customer-specified packages using a Dockerfile and a Conda environment file.

## 1. Create Dockerfile

Create a file named `Dockerfile` with the following content:

```
FROM --platform=linux/amd64 continuumio/miniconda3

# Set the working directory
WORKDIR /usr/src/app

# Copy the environment.yml file to the working directory
COPY environment.yml ./

# Create a Conda environment using the environment.yml file
RUN conda env create -f environment.yml

# Activate the myenv environment automatically in interactive bash shells
RUN echo "source activate myenv" >> ~/.bashrc

# Expose the port for JupyterLab
EXPOSE 8888

# Entry command to start JupyterLab without token authentication
ENTRYPOINT ["bash", "-c", "eval \"$(conda shell.bash hook)\" && conda activate myenv && jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root --NotebookApp.token='' --NotebookApp.password=''"]
```

## 2. Create environment.yml

If you already have an `environment.yml` file, skip this step and step 3. Otherwise, create a file named `environment.yml` with the following content:

```
name: myenv                     # Specify the name of the Conda environment
channels:
  - conda-forge                 # A community-driven channel with a wide variety of packages
  - bioconda                    # A channel with packages specific to bioinformatics
dependencies:
  - python                      # Install Python
  - numpy                       # Install NumPy for numerical computations
  - jupyterlab                  # Install JupyterLab, an interactive development environment
```

> Note: The above is just an example. You can replace the content in `channels:` and `dependencies:` based on your needs.

## 3. Export environment.yml from Existing Conda Environment

If you do not have an environment.yml file but have a Conda environment, follow these steps to generate one. Skip this step if you have completed step 2.

* Activate your existing Conda environment:
```
conda activate your_existing_env
```

* Export the environment to a file:
```
conda env export > environment.yml
```

The generated `environment.yml` file will be located in the current directory.

* Review and edit the `environment.yml` file if necessary. Open it in a text editor and make any adjustments, such as removing unnecessary packages or changing package versions. Make sure the `name:` field matches the environment name activated in the Dockerfile (`myenv` in the example above).
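
Two edits are commonly needed on an exported file: `conda env export` appends a machine-specific `prefix:` line, and the exported `name:` is that of the source environment rather than the `myenv` name the Dockerfile activates. The sketch below shows one way to handle both; the helper `normalize_env` and the sample file contents are hypothetical.

```shell
# Drop the "prefix:" line and set the environment name to the one given.
normalize_env() {
  sed -e '/^prefix:/d' -e "s/^name:.*/name: $2/" "$1"
}

# Stand-in for an exported file (illustrative contents only).
cat > environment.exported.yml <<'EOF'
name: your_existing_env
channels:
  - conda-forge
dependencies:
  - python=3.11
prefix: /home/user/miniconda3/envs/your_existing_env
EOF

normalize_env environment.exported.yml myenv > environment.yml
cat environment.yml
```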

## 4. Build the Docker Image

Run the following command to build the Docker image. Replace `YOUR_DOCKERHUB_USERNAME/YOUR_IMAGE_NAME` and `VERSION_TAG` with the desired name and version tag for your Docker image:

```
docker build -t YOUR_DOCKERHUB_USERNAME/YOUR_IMAGE_NAME:VERSION_TAG -f Dockerfile .
```

Example:
```
docker build -t cizhiwu/jupyterlab-env:v1 -f Dockerfile .
```


## 5. Tag the Docker Image

Run the following command to tag the Docker image. Replace YOUR_DOCKERHUB_USERNAME/YOUR_IMAGE_NAME and VERSION_TAG with the appropriate values:

```
docker tag YOUR_DOCKERHUB_USERNAME/YOUR_IMAGE_NAME:VERSION_TAG YOUR_DOCKERHUB_USERNAME/YOUR_IMAGE_NAME:latest
```

Example:
```
docker tag cizhiwu/jupyterlab-env:v1 cizhiwu/jupyterlab-env:latest
```


## 6. Push the Docker Image

Run the following commands to push both the versioned and latest tags to Docker Hub:

```
docker push YOUR_DOCKERHUB_USERNAME/YOUR_IMAGE_NAME:VERSION_TAG
docker push YOUR_DOCKERHUB_USERNAME/YOUR_IMAGE_NAME:latest
```

Example:
```
docker push cizhiwu/jupyterlab-env:v1
docker push cizhiwu/jupyterlab-env:latest
```
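
Steps 4 to 6 can be chained in a small script. This is a sketch rather than a prescribed workflow: the `run` and `publish_image` helpers and the `DRY_RUN` switch are hypothetical. It defaults to printing the commands so they can be reviewed before anything is built or pushed.

```shell
# Hypothetical defaults; override via the environment.
IMAGE="${IMAGE:-YOUR_DOCKERHUB_USERNAME/YOUR_IMAGE_NAME}"
TAG="${TAG:-v1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Build once, add the latest tag, then push both tags; stop at the first failure.
publish_image() {
  run docker build -t "$IMAGE:$TAG" -f Dockerfile . &&
  run docker tag "$IMAGE:$TAG" "$IMAGE:latest" &&
  run docker push "$IMAGE:$TAG" &&
  run docker push "$IMAGE:latest"
}

publish_image
```

Setting `DRY_RUN=0` executes the same four commands against the local Docker daemon and Docker Hub.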


## 7. Launch JupyterLab on OpCenter

Log in to your OpCenter instance and run the following command to submit the job.

The `-c` flag specifies the number of CPUs, and the `-m` flag specifies the memory size in GB:

```
float submit -i docker.io/YOUR_DOCKERHUB_USERNAME/YOUR_IMAGE_NAME:latest -c 4 -m 16 --publish 8888:8888 --imageVolSize 17 --vmPolicy '[onDemand=true]' --migratePolicy '[disable=true]' --withRoot=true --securityGroup YOUR_SECURITY_GROUP
```

Example:
```
float submit -i docker.io/cizhiwu/jupyterlab-env:latest -c 4 -m 16 --publish 8888:8888 --imageVolSize 17 --vmPolicy '[onDemand=true]' --migratePolicy '[disable=true]' --withRoot=true --securityGroup sg-09e5fc379012da5b7
```

> Here, `cizhiwu/jupyterlab-env:latest` is the image you pushed to Docker Hub, and `sg-09e5fc379012da5b7` is your OpCenter instance's security group.


After the JupyterLab server launches from the Docker image, find the Public IPv4 address of the instance running the job. For example, if the address is 100.27.35.229, open your browser and visit:

```
http://100.27.35.229:8888
```

This will open JupyterLab.
