MaskPrimers
Removes primers and annotates sequences with primer and barcode identifiers
usage: MaskPrimers [-h] [--version] ...
- -h, --help: show this help message and exit
- --version: show program’s version number and exit
- output files:
  - mask-pass: processed reads with successful primer matches.
  - mask-fail: raw reads failing primer identification.
- output annotation fields:
  - SEQORIENT: the orientation of the output sequence. Either F (input) or RC (reverse complement of input).
  - PRIMER: name of the best primer match.
  - BARCODE: the sequence preceding the primer match. Only output when the --barcode flag is specified.
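As an illustration of how these annotation fields might be read downstream, the sketch below parses an annotated sequence header in Python. It assumes the three delimiters described under --delim are "|", "=", and "," (annotation blocks, field name versus value, and values within a field). Those delimiter values, the function name parse_header, and the example header are assumptions made for this sketch, not details stated on this page.

```python
def parse_header(header, delim=("|", "=", ",")):
    """Split an annotated header into its sequence ID and a dict of fields.

    The three delimiters are assumed values for: annotation blocks,
    field name vs. value, and multiple values within one field.
    """
    block_sep, field_sep, value_sep = delim
    blocks = header.split(block_sep)
    seq_id = blocks[0]
    fields = {}
    for block in blocks[1:]:
        name, _, value = block.partition(field_sep)
        values = value.split(value_sep)
        # Keep single values as strings and multiple values as lists
        fields[name] = values[0] if len(values) == 1 else values
    return seq_id, fields


# Hypothetical header of the kind a mask-pass file might contain
example = "READ1|SEQORIENT=F|PRIMER=IGHC-1|BARCODE=ACGTACGTAC"
seq_id, fields = parse_header(example)
print(seq_id, fields)
```

If different delimiters were passed to --delim when the file was produced, the same three values would need to be passed to the delim argument here.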
MaskPrimers align
Find primer matches using pairwise local alignment.
usage: MaskPrimers [-h] [--version] ...
- -h, --help: show this help message and exit
- -s <seq_files>: A list of FASTA/FASTQ files containing sequences to process.
- --fasta: Specify to force output as FASTA rather than FASTQ.
- --failed: If specified, create files containing records that fail processing.
- --log <log_file>: Specify to write verbose logging to a file. May not be specified with multiple input files.
- --delim <delimiter>: A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.
- --nproc <nproc>: The number of simultaneous computational processes to execute (CPU cores to utilize).
- --outdir <out_dir>: Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>: Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- -p <primer_file>: A FASTA or REGEX file containing primer sequences.
- --mode {cut,mask,trim,tag}: Specifies the action to take with the primer sequence. The “cut” mode will remove both the primer region and the preceding sequence. The “mask” mode will replace the primer region with Ns and remove the preceding sequence. The “trim” mode will remove the region preceding the primer, but leave the primer region intact. The “tag” mode will leave the input sequence unmodified.
- --maxerror <max_error>: Maximum allowable error rate.
- --revpr: Specify to match the tail-end of the sequence against the reverse complement of the primers. This also reverses the behavior of the --maxlen argument, such that the search window begins at the tail-end of the sequence.
- --barcode: Specify to encode sequences with barcode sequences (unique molecular identifiers) found preceding the primer region.
- --maxlen <max_len>: Length of the sequence window to scan for primers.
- --skiprc: Specify to prevent checking of sample reverse complement sequences.
- --gap <gap_penalty>: A list of two positive values defining the gap open and gap extension penalties for aligning the primers. Note: the error rate is calculated as the percentage of mismatches from the primer sequence, with gap penalties reducing the match count accordingly. This may lead to error rates that differ from the strict mismatch percentage when gaps are present in the alignment.
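The four --mode behaviours can be summarised in a short Python sketch. It assumes a primer match is already known as a half-open interval [start, end) on the read and that the read is in the forward layout (no --revpr). The function apply_mode, its arguments, and the example read are hypothetical names and data for illustration, not part of the tool.

```python
def apply_mode(seq, start, end, mode):
    """Return the read after applying one of the four primer actions.

    seq   : read sequence as a string
    start : index of the first primer base (assumed known from a match)
    end   : index one past the last primer base
    mode  : one of "cut", "mask", "trim", "tag"
    """
    if mode == "cut":
        # Remove the primer and everything before it
        return seq[end:]
    if mode == "mask":
        # Replace the primer with Ns and drop the preceding sequence
        return "N" * (end - start) + seq[end:]
    if mode == "trim":
        # Drop the preceding sequence but keep the primer
        return seq[start:]
    if mode == "tag":
        # Leave the read untouched; only annotations would be added
        return seq
    raise ValueError(f"unknown mode: {mode}")


# Made-up read: 6 bases before the primer, a 6-base primer, then the insert
read = "ACGTAC" + "GGATCC" + "TTTTAAAA"
for mode in ("cut", "mask", "trim", "tag"):
    print(mode, apply_mode(read, 6, 12, mode))
```

Under the same assumptions, the sequence reported by --barcode would correspond to seq[:start], the bases preceding the primer match.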
MaskPrimers score
Find primer matches by scoring primers at a fixed position.
usage: MaskPrimers [-h] [--version] ...
- -h, --help: show this help message and exit
- -s <seq_files>: A list of FASTA/FASTQ files containing sequences to process.
- --fasta: Specify to force output as FASTA rather than FASTQ.
- --failed: If specified, create files containing records that fail processing.
- --log <log_file>: Specify to write verbose logging to a file. May not be specified with multiple input files.
- --delim <delimiter>: A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.
- --nproc <nproc>: The number of simultaneous computational processes to execute (CPU cores to utilize).
- --outdir <out_dir>: Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>: Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- -p <primer_file>: A FASTA or REGEX file containing primer sequences.
- --mode {cut,mask,trim,tag}: Specifies the action to take with the primer sequence. The “cut” mode will remove both the primer region and the preceding sequence. The “mask” mode will replace the primer region with Ns and remove the preceding sequence. The “trim” mode will remove the region preceding the primer, but leave the primer region intact. The “tag” mode will leave the input sequence unmodified.
- --maxerror <max_error>: Maximum allowable error rate.
- --revpr: Specify to match the tail-end of the sequence against the reverse complement of the primers. This also reverses the behavior of the --maxlen argument, such that the search window begins at the tail-end of the sequence.
- --barcode: Specify to encode sequences with barcode sequences (unique molecular identifiers) found preceding the primer region.
- --start <start>: The starting position of the primer
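To make the fixed-position idea concrete, here is a minimal Python sketch that compares each primer to the read at a given --start offset and keeps the one with the lowest mismatch fraction, subject to a --maxerror threshold. It is a simplification under stated assumptions: plain character equality (no handling of ambiguous nucleotide codes), error defined as mismatches divided by primer length, an arbitrary default threshold of 0.2, and hypothetical names (score_primers, the primers dictionary). The tool's actual scoring may differ.

```python
def score_primers(seq, primers, start=0, max_error=0.2):
    """Pick the primer with the lowest mismatch fraction at a fixed offset.

    Returns (name, error) for the best primer, or (None, None) when no
    primer fits within the read or meets the max_error threshold.
    """
    best_name, best_error = None, None
    for name, primer in primers.items():
        window = seq[start:start + len(primer)]
        if len(window) < len(primer):
            continue  # read too short for this primer at this offset
        # Count positions where the read and the primer disagree
        mismatches = sum(a != b for a, b in zip(window, primer))
        error = mismatches / len(primer)
        if best_error is None or error < best_error:
            best_name, best_error = name, error
    if best_error is None or best_error > max_error:
        return None, None
    return best_name, best_error


primers = {"P1": "GGATCC", "P2": "GGTTCC"}  # made-up primer set
print(score_primers("ACGTGGATCCTTTT", primers, start=4))
print(score_primers("ACGTGGATCCTTTT", primers, start=0))
```

In this example the primer sits four bases into the read, so scoring at offset 4 finds an exact match for P1, while scoring at offset 0 exceeds the threshold for both primers and reports no match.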