Megahit
Description
Megahit is an ultra-fast assembly tool for metagenomics data.
License
Free to use and open source under GNU GPLv3.
Available
Version on CSC's Servers
- Puhti: 1.2.8
Usage
In Puhti, Megahit is activated by loading the biokit environment.
module load biokit
megahit -h
Sample Megahit batch job:
#!/bin/bash
#SBATCH --job-name=Megahit
#SBATCH --account=<project>
#SBATCH --time=12:00:00
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --output=megahit_out_8
#SBATCH --error=megahit_err_8
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --partition=small
module load biokit
srun megahit -1 reads_1.fastq -2 reads_2.fastq -t $SLURM_CPUS_PER_TASK -m 32000000000 -o result_directory
In the example above, replace <project> with the name of your project. You can run csc-workspaces to check your Puhti projects. The maximum running time is set to 12 hours (--time=12:00:00). As Megahit uses thread-based parallelization, the process is considered one task that should be executed within one node (--ntasks=1, --nodes=1). The job reserves eight cores (--cpus-per-task=8) that can use up to 32 GB of memory in total (--mem=32G). Note that the number of cores must also be defined in the actual Megahit command, which is done with the Megahit option -t. In this case we use the $SLURM_CPUS_PER_TASK variable, which contains the --cpus-per-task value (we could just as well use -t 8, but then we would have to remember to change the value if the number of reserved CPUs is changed).
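The hard-coded memory value can drift out of sync with --mem in the same way as a hard-coded thread count. As a minimal sketch, assuming Slurm exports SLURM_MEM_PER_NODE in megabytes when --mem is set, the byte value for Megahit's -m option could be derived inside the batch script. The helper name mb_to_bytes and the fallback values are illustrative only and are not part of Megahit or Slurm.

```shell
#!/bin/bash
# Hypothetical helper: convert a size in megabytes to bytes.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Assumption: inside a job, Slurm sets these variables from --mem and
# --cpus-per-task. The fallbacks let the sketch run outside a job.
mem_mb=${SLURM_MEM_PER_NODE:-32768}
threads=${SLURM_CPUS_PER_TASK:-8}

# Print the command instead of running it, so the sketch is safe to try anywhere.
echo "megahit -1 reads_1.fastq -2 reads_2.fastq -t $threads -m $(mb_to_bytes "$mem_mb") -o result_directory"
```

Note that the sample job passes 32000000000, which is slightly below the full 32 GiB allocation. If you want similar headroom, subtract a margin from the computed value.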
The job is submitted to the batch job system with the sbatch command. For example, if the batch job file is named megahit_job.sh, the submission command is:
sbatch megahit_job.sh
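To follow the job after submission, the job ID that sbatch reports can be captured and passed to squeue. The function below is a sketch rather than a CSC-provided tool: the name submit_and_report is hypothetical, and it assumes the standard Slurm commands sbatch (with its --parsable option) and squeue are available on the login node.

```shell
#!/bin/bash
# Hypothetical wrapper: submit a batch script and show its queue status.
submit_and_report() {
    local script=$1
    if ! command -v sbatch >/dev/null 2>&1; then
        echo "sbatch not found: run this on a Slurm login node" >&2
        return 1
    fi
    local jobid
    # --parsable makes sbatch print only the job ID, optionally followed
    # by ";cluster", which is stripped on the next line.
    jobid=$(sbatch --parsable "$script") || return 1
    jobid=${jobid%%;*}
    echo "Submitted job $jobid"
    # List the job's current state in the queue.
    squeue -j "$jobid"
}
```

After sourcing the function, it would be called as submit_and_report megahit_job.sh.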