User-friendly preprocessor of metagenome sequences

What is PyroTrimmer?

PyroTrimmer is a program that trims barcodes, linkers, primers, and low-quality sequence regions from 454 sequence reads. It has the following features:

  • more sensitive detection of primer sequences using Levenshtein distance and global alignment
  • the first stand-alone software for this purpose
  • performing faster than other existing programs
PyroTrimmer is implemented in Java and does not call any other executable program, so it can run on any operating system with a Java runtime, without requiring other external programs to be installed.
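
To make the Levenshtein-based primer detection in the feature list more concrete, here is a minimal Python sketch. It is an illustration only, not PyroTrimmer's Java implementation: the function names `levenshtein` and `trim_primer` are hypothetical, the primer is assumed to sit at the 5' end of the read, and the global-alignment step mentioned above is not reproduced.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character of a
                            curr[j - 1] + 1,      # insert a character of b
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def trim_primer(read: str, primer: str, max_mismatches: int = 3) -> Optional[str]:
    """Return the read without its 5' primer, or None when no read prefix
    lies within max_mismatches edits of the primer (hypothetical helper)."""
    best_end, best_dist = None, max_mismatches + 1
    # Indels can make the matching prefix shorter or longer than the primer.
    shortest = max(0, len(primer) - max_mismatches)
    longest = min(len(read), len(primer) + max_mismatches)
    for end in range(shortest, longest + 1):
        dist = levenshtein(read[:end], primer)
        if dist < best_dist:
            best_end, best_dist = end, dist
    return None if best_end is None else read[best_end:]


exact = trim_primer("ACGTACGTTTTT", "ACGTACGT")      # primer matches exactly
one_error = trim_primer("ACGAACGTGGGG", "ACGTACGT")  # one substitution
no_primer = trim_primer("TTTTTTTTTTTT", "ACGTACGT")  # nothing close enough
```

Unlike an exact string match, the edit-distance test still recognizes a primer that carries a sequencing error, which is the kind of sensitivity the feature list refers to.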

User's Guide

PyroTrimmer takes a tag file, a FASTA file, and a quality file as input. The tag file includes the project, the names of genes and samples, the upper or lower length cutoffs of reads, and the sequences of barcodes-linkers-primers. The elements of the tag file must be separated by tabs. Five options are available to fine-tune the trimming. The FASTA and quality files are trimmed according to the tag file in seven steps. PyroTrimmer produces the following output files: a trimmed result file, a length count file, a summary statistics file, and an undefined file.
The options that control PyroTrimmer and a command-line example are listed below:
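
As an illustration of how the FASTA and quality inputs pair up, the following Python sketch reads both and matches records by read ID. It assumes the usual 454 layout, in which the quality file repeats each '>' header and lists one space-separated integer score per base. The function names are hypothetical and this is not PyroTrimmer's own parser; the tag file is left out because its column order is not specified here.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def _records(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (read_id, body_lines) for each '>'-headed record."""
    read_id, body = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if read_id is not None:
                yield read_id, body
            # Assumption: the read ID is the first word after '>'.
            read_id, body = line[1:].split()[0], []
        elif read_id is not None:
            body.append(line)
    if read_id is not None:
        yield read_id, body


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each read ID to its sequence, joining wrapped lines."""
    return {rid: "".join(body) for rid, body in _records(lines)}


def read_qual(lines: Iterable[str]) -> Dict[str, List[int]]:
    """Map each read ID to its list of integer quality scores."""
    return {rid: [int(x) for line in body for x in line.split()]
            for rid, body in _records(lines)}


def pair_reads(seqs: Dict[str, str], quals: Dict[str, List[int]]):
    """Yield (read_id, sequence, scores) for reads found in both inputs
    that have exactly one score per base."""
    for rid, seq in seqs.items():
        scores = quals.get(rid)
        if scores is not None and len(scores) == len(seq):
            yield rid, seq, scores


# Tiny in-memory example instead of real files.
fasta_lines = [">r1 length=8", "ACGT", "ACGT", ">r2", "GG"]
qual_lines = [">r1 length=8", "30 30 30 30", "20 20 20 20", ">r2", "10 12"]
reads = list(pair_reads(read_fasta(fasta_lines), read_qual(qual_lines)))
```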
arguments
-t file path of tag information
-i file path of the input fasta file
-q file path of the quality file
-o file path for output data
options
-a average quality value cutoff for 3' end trimming: 10 to 30 (default = 20)
-l average quality value cutoff for full length sequence: 20 to 30 (default = 25)
-m # of mismatches for trimming primer sequences: 1 to 4 (default = 3)
-p removing sequences with ambiguous base(N) > arg: 0 to 5 (default = 0)
-w window size for 3' end trimming: 3 to 20 (default = 5)
example
java -jar PyroTrimmer-1.0.jar -t /home/whoami/tagfile -i /home/whoami/test.fasta -q /home/whoami/test.quality -o /home/whoami/out -a 20 -l 25 -m 3 -p 0 -w 5
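
To show how the quality options could interact, here is a hedged Python sketch of window-based 3' end trimming (-a and -w) followed by whole-read filtering (-l and -p). The exact rules PyroTrimmer applies are not documented here, so the logic below is an assumption: windows are scanned from the 5' end and the read is cut at the first window whose mean quality falls below the cutoff. The function names are hypothetical.

```python
from typing import List, Tuple


def trim_3prime(seq: str, quals: List[int],
                window: int = 5, cutoff: float = 20.0) -> Tuple[str, List[int]]:
    """Cut the read at the start of the first window whose mean quality
    is below cutoff (assumed behavior for -w and -a)."""
    for start in range(len(seq) - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < cutoff:
            return seq[:start], quals[:start]
    return seq, quals


def passes_filters(seq: str, quals: List[int],
                   min_avg_quality: float = 25.0, max_ambiguous: int = 0) -> bool:
    """Keep a read only if it has at most max_ambiguous N bases (-p)
    and its mean quality reaches min_avg_quality (-l)."""
    if not seq:
        return False
    if seq.upper().count("N") > max_ambiguous:
        return False
    return sum(quals) / len(quals) >= min_avg_quality


seq = "ACGTACGTAC"
quals = [30, 30, 30, 30, 30, 30, 10, 10, 10, 10]
trimmed_seq, trimmed_quals = trim_3prime(seq, quals, window=5, cutoff=20)
keep = passes_filters(trimmed_seq, trimmed_quals, min_avg_quality=25, max_ambiguous=0)
```

With this cut rule, a larger window smooths over isolated low-quality bases but can also remove a few good bases just before the low-quality tail, as happens in the example read.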

Test data set

This data set is from SRR189062, a bacterial community associated with the marine sponge Raspailia ramosa sampled in Irish waters.
Download fasta, quality and tag file

Contact Information

Kyung Mo Kim (E-mail : kmkim@kribb.re.kr)
Kyuin Hwang (E-mail : rbdls77@kopri.re.kr)

How to cite PyroTrimmer?

Oh J, Kim BK, Cho WS, Hong SG, Kim KM. 2012. PyroTrimmer: a user-friendly software for pre-processing multiplex pyrosequencing data. J Microbiol. 50(5):766-9

PyroTrimmer 1.1 Released

This release adds the following changes compared to version 1.0:

  • new function (number of homopolymeric nucleotides): homopolymers longer than the user-specified number of nucleotides are removed
  • bug fix
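
The homopolymer check described above can be sketched in Python as follows. This is an assumption-laden illustration, not the 1.1 implementation: the function names are hypothetical, and the sketch only flags reads whose longest single-base run exceeds the limit, since how the run or the read is then removed is not specified here.

```python
from itertools import groupby


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base (0 for an empty read)."""
    return max((sum(1 for _ in run) for _, run in groupby(seq.upper())), default=0)


def exceeds_homopolymer_limit(seq: str, max_len: int) -> bool:
    """True when the read contains a homopolymer longer than max_len."""
    return longest_homopolymer(seq) > max_len


run_length = longest_homopolymer("ACGGGGTTA")              # longest run is GGGG
flagged = exceeds_homopolymer_limit("A" * 9 + "C", 8)      # 9 A's exceed a limit of 8
```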