Seqtk
Description
Seqtk is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format. It seamlessly parses both FASTA and FASTQ files which can also be optionally compressed by gzip.
License
Free to use and open source under MIT License.
Available
- Puhti: 1.3-r106
Usage
Seqtk in included in the biokit module:
module load biokit
Seqtk command syntax is:
seqtk <command> <arguments>
Command | Function |
---|---|
seq | common transformation of FASTA/Q |
comp | get the nucleotide composition of FASTA/Q |
sample | subsample sequences |
subseq | extract subsequences from FASTA/Q |
fqchk | fastq QC (base/quality summary) |
mergepe | interleave two PE FASTA/Q files |
trimfq | trim FASTQ using the Phred algorithm |
hety | regional heterozygosity |
gc | identify high- or low-GC regions |
mutfa | point mutate FASTA at specified positions |
mergefa | merge two FASTA/Q files |
famask | apply a X-coded FASTA to a source FASTA |
dropse | drop unpaired from interleaved PE FASTA/Q |
rename | rename sequence names |
randbase | choose a random base from hets |
cutN | cut sequence at long N |
listhet | extract the position of each het |
Examples:
Convert FASTQ to FASTA:
seqtk seq -a in.fq.gz > out.fa
seqtk subseq in.fq name.lst > out.fq
seqtk subseq in.fa reg.bed > out.fa