Velvet
Velvet is a sequence assembler for very short reads.
License
Free to use and open source under GNU GPLv2.
Available
- Puhti: 1.2.10
- Chipster graphical user interface
Usage
On Puhti, the Velvet commands are initialized with the command:
module load biokit
velveth (and the corresponding colorspace version velveth_de) helps you construct the dataset for the velvetg program. It takes in a number of sequence files, produces a hash table, and writes two files, Sequences and Roadmaps, to an output directory. velvetg requires both files. The syntax is as follows:
For example:
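As a minimal sketch, the general form given in the Velvet manual and one concrete call can be written as below. The call reuses the arguments of the sample batch job file on this page. The variable names are illustrative, temp.fastq is a placeholder assumed to contain interleaved read pairs, and the command is only printed, not executed.

```shell
# General form, as given in the Velvet manual:
#   velveth output_directory hash_length [[-file_format][-read_type] filename]

velveth_outdir=assembly_folder   # directory where velveth writes Sequences and Roadmaps
velveth_k=45                     # k-mer (hash) length
velveth_reads=temp.fastq         # placeholder input file (assumed interleaved pairs)

# Build the command line as a string so it can be inspected before running.
velveth_cmd="velveth $velveth_outdir $velveth_k -shortPaired -fastq $velveth_reads"
echo "$velveth_cmd"
```

To run the command for real, load the biokit module first and execute the printed line.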
velvetg (and the corresponding colorspace version velvetg_de) is the core of Velvet, where the de Bruijn graph is built and then manipulated. The syntax of velvetg is:
A velvetg command could look like:
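A hedged sketch of a velvetg call, again reusing the arguments of the sample batch job file on this page. The variable names are illustrative, the insert length of 400 is the value from that batch job file rather than a recommendation, and the command is only printed, not executed.

```shell
# General form: velvetg directory [options]

velvetg_dir=assembly_folder   # must be the directory written by velveth
velvetg_ins_length=400        # expected insert length of the paired-end library

# Build the command line as a string so it can be inspected before running.
velvetg_cmd="velvetg $velvetg_dir -ins_length $velvetg_ins_length"
echo "$velvetg_cmd"
```

velvetg writes its results, including the assembled contigs in contigs.fa, into the same directory.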
The maximum allowed k-mer length was fixed when Velvet was compiled on Puhti. The longer the maximum k-mer length, the more memory Velvet needs, regardless of the k-mer length actually used. For this reason, several versions of Velvet are provided, listed in the table below. The default versions accept k-mer lengths of up to 100 bases in the hash table. However, you should use the version with the smallest maximum k-mer length that still covers the k-mer length you need. For example, for k-mer length 40, use velveth_maxk50 and velvetg_maxk50.
Velvet programs available on Puhti
Program | Max. k-mer length | Type |
---|---|---|
velveth | 100 | normal |
velvetg | 100 | normal |
velveth_maxk75 | 75 | normal |
velvetg_maxk75 | 75 | normal |
velveth_maxk50 | 50 | normal |
velvetg_maxk50 | 50 | normal |
velveth_maxk35 | 35 | normal |
velvetg_maxk35 | 35 | normal |
velveth_de | 100 | colorspace |
velvetg_de | 100 | colorspace |
velveth_de_maxk75 | 75 | colorspace |
velvetg_de_maxk75 | 75 | colorspace |
velveth_de_maxk50 | 50 | colorspace |
velvetg_de_maxk50 | 50 | colorspace |
velveth_de_maxk35 | 35 | colorspace |
velvetg_de_maxk35 | 35 | colorspace |
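The rule of picking the version with the smallest sufficient maximum k-mer length can be sketched as a small shell function. The function name velvet_suffix_for_k is hypothetical, and the sketch assumes that each maximum is inclusive (for example, that k = 50 is accepted by the maxk50 versions).

```shell
# Hypothetical helper: print the name suffix of the smallest Velvet version
# on Puhti whose maximum k-mer length is at least the requested k.
velvet_suffix_for_k() {
    k=$1
    if [ "$k" -le 35 ]; then
        echo "_maxk35"
    elif [ "$k" -le 50 ]; then
        echo "_maxk50"
    elif [ "$k" -le 75 ]; then
        echo "_maxk75"
    elif [ "$k" -le 100 ]; then
        echo ""    # default versions (velveth, velvetg), maximum 100
    else
        echo "k-mer length $k exceeds the largest available maximum (100)" >&2
        return 1
    fi
}

suffix=$(velvet_suffix_for_k 40)
echo "velveth${suffix} velvetg${suffix}"   # prints: velveth_maxk50 velvetg_maxk50
```

For colorspace data the same suffix follows the _de part of the name, for example velveth_de${suffix}.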
On Puhti, Velvet jobs should be executed through the batch job system. Below is a sample batch job file for Velvet:
#!/bin/bash
#SBATCH --job-name=velvet
#SBATCH --output=output_velvet2.txt
#SBATCH --error=errors_velvet2.txt
#SBATCH --account=<project>
#SBATCH --time=4-00:00:00
#SBATCH --ntasks=1
#SBATCH --partition=longrun
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=64G
module load biokit
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export OMP_THREAD_LIMIT=$SLURM_CPUS_PER_TASK
velveth_maxk50 assembly_folder 45 -shortPaired -fastq temp.fastq
velvetg_maxk50 assembly_folder -ins_length 400
In the batch job file above, the job reserves 4 computing cores (--cpus-per-task=4) and 64 GB of memory for four days (--time=4-00:00:00). Velvet can utilize thread-based parallel computing. After the setup command module load biokit, the number of cores to be used in the Velvet run is defined with the environment variables $OMP_NUM_THREADS and $OMP_THREAD_LIMIT. In this script, these variables are set from the environment variable $SLURM_CPUS_PER_TASK, which contains the value defined with --cpus-per-task (4 in this example).
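If the same commands are run where $SLURM_CPUS_PER_TASK is not set, for example outside a batch job, the two OpenMP variables would be exported as empty strings. A minimal sketch of a more defensive setup follows. The function name set_velvet_threads and the fallback of one thread are assumptions, not part of the Puhti setup.

```shell
# Hypothetical helper: set the OpenMP variables from SLURM_CPUS_PER_TASK,
# falling back to 1 when that variable is unset or empty.
set_velvet_threads() {
    OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
    OMP_THREAD_LIMIT=$OMP_NUM_THREADS
    export OMP_NUM_THREADS OMP_THREAD_LIMIT
}

set_velvet_threads
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS OMP_THREAD_LIMIT=$OMP_THREAD_LIMIT"
```

Inside a batch job that sets --cpus-per-task, the behaviour is the same as with the two export lines in the sample batch job file.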
The batch job can be launched with the sbatch command, giving the name of the batch job file as its argument.
More information about running batch jobs can be found in the batch job section of the Puhti user guide.
More information
More information about Velvet can be found from: