AssemblePairs
Assembles paired-end reads into a single sequence
usage: AssemblePairs [-h] [--version] ...
- -h, --help: show this help message and exit
- --version: show program’s version number and exit
- output files:
  - assemble-pass: successfully assembled reads.
  - assemble-fail: raw reads failing paired-end assembly.
- output annotation fields:
  - <user defined>: annotation fields specified by the --1f or --2f arguments.
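The copying of annotation fields into the assembled record can be pictured with a small Python sketch. It is not pRESTO's implementation. The delimiter triple ("|", "=", ","), the field names (PRIMER, BARCODE, UMI), the helper names, and the rule that a field copied from both reads keeps both values are all assumptions made for illustration.

```python
def parse_header(header, delim=("|", "=", ",")):
    """Split a header into its sequence ID and a dict of annotation fields."""
    block_delim, field_delim, _ = delim
    seq_id, *blocks = header.split(block_delim)
    fields = dict(block.split(field_delim, 1) for block in blocks)
    return seq_id, fields


def copy_fields(head_header, tail_header, head_fields=(), tail_fields=(),
                delim=("|", "=", ",")):
    """Build an assembled-record header from selected head and tail fields.

    head_fields and tail_fields play the role of the --1f and --2f arguments.
    Assumption: a field requested from both reads keeps both values, joined
    by the value delimiter.
    """
    block_delim, field_delim, value_delim = delim
    seq_id, head_ann = parse_header(head_header, delim)
    _, tail_ann = parse_header(tail_header, delim)

    merged = {name: head_ann[name] for name in head_fields if name in head_ann}
    for name in tail_fields:
        if name not in tail_ann:
            continue
        if name in merged:
            merged[name] = merged[name] + value_delim + tail_ann[name]
        else:
            merged[name] = tail_ann[name]

    blocks = [f"{name}{field_delim}{value}" for name, value in merged.items()]
    return block_delim.join([seq_id] + blocks)


head = "SEQ1|PRIMER=V1|BARCODE=AAAA"
tail = "SEQ1|PRIMER=C1|UMI=GGGG"
print(copy_fields(head, tail, head_fields=["BARCODE"], tail_fields=["PRIMER"]))
# SEQ1|BARCODE=AAAA|PRIMER=C1
```

Fields not named in either list (UMI in this example) are dropped from the assembled header, which is the behavior the sketch assumes.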
AssemblePairs align
Assemble pairs by aligning ends
usage: AssemblePairs [-h] [--version] ...
- -h, --help: show this help message and exit
- -1 <seq_files_1>: An ordered list of FASTA/FASTQ files containing head/primary sequences.
- -2 <seq_files_2>: An ordered list of FASTA/FASTQ files containing tail/secondary sequences.
- --fasta: Specify to force output as FASTA rather than FASTQ.
- --failed: If specified, create files containing records that fail processing.
- --log <log_file>: Specify to write verbose logging to a file. May not be specified with multiple input files.
- --delim <delimiter>: A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.
- --nproc <nproc>: The number of simultaneous computational processes to execute (CPU cores to utilize).
- --outdir <out_dir>: Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>: Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- --coord {illumina,solexa,sra,454,presto}: The format of the sequence identifier, which defines shared coordinate information across paired ends.
- --rc {head,tail,both}: Specify to reverse complement sequences before stitching.
- --1f <head_fields>: Specify annotation fields to copy from head records into the assembled record.
- --2f <tail_fields>: Specify annotation fields to copy from tail records into the assembled record.
- --alpha <alpha>: Significance threshold for sequence assembly.
- --maxerror <max_error>: Maximum allowable error rate.
- --minlen <min_len>: Minimum sequence length to scan for overlap.
- --maxlen <max_len>: Maximum sequence length to scan for overlap.
- --scanrev: If specified, scan past the end of the tail sequence to allow the head sequence to overhang the end of the tail sequence.
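How the overlap-length and error-rate limits interact can be illustrated with a deliberately simplified Python sketch. It is not the algorithm AssemblePairs uses: it ignores the significance test behind --alpha, quality scores, and the --scanrev behavior. It only scans head-suffix/tail-prefix overlaps between a minimum and maximum length and keeps the one with the lowest mismatch rate. The function names and default values are placeholders.

```python
def best_overlap(head, tail, min_len=1, max_len=None, max_error=0.3):
    """Return (overlap_length, error_rate) for the best overlap, or None.

    An overlap of length n aligns the last n bases of head with the first
    n bases of tail. Ties in error rate are broken by the longer overlap.
    """
    limit = min(len(head), len(tail))
    if max_len is not None:
        limit = min(limit, max_len)

    best = None
    for n in range(min_len, limit + 1):
        mismatches = sum(a != b for a, b in zip(head[-n:], tail[:n]))
        error = mismatches / n
        if error > max_error:
            continue
        if best is None or error < best[1] or (error == best[1] and n > best[0]):
            best = (n, error)
    return best


def assemble(head, tail, **kwargs):
    """Stitch head and tail at the best overlap; None if no overlap passes."""
    hit = best_overlap(head, tail, **kwargs)
    if hit is None:
        return None
    overlap_len, _ = hit
    # Keep the head bases in the overlap and append the rest of the tail.
    return head + tail[overlap_len:]


print(assemble("AAAACCCCGGGG", "CCGGGGTTTT", min_len=4, max_error=0.0))
# AAAACCCCGGGGTTTT
```

Raising the minimum length in this sketch guards against short chance matches, while raising the maximum error rate admits overlaps that contain sequencing errors.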
AssemblePairs join
Assemble pairs by concatenating ends
usage: AssemblePairs [-h] [--version] ...
- -h, --help: show this help message and exit
- -1 <seq_files_1>: An ordered list of FASTA/FASTQ files containing head/primary sequences.
- -2 <seq_files_2>: An ordered list of FASTA/FASTQ files containing tail/secondary sequences.
- --fasta: Specify to force output as FASTA rather than FASTQ.
- --failed: If specified, create files containing records that fail processing.
- --log <log_file>: Specify to write verbose logging to a file. May not be specified with multiple input files.
- --delim <delimiter>: A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.
- --nproc <nproc>: The number of simultaneous computational processes to execute (CPU cores to utilize).
- --outdir <out_dir>: Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>: Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- --coord {illumina,solexa,sra,454,presto}: The format of the sequence identifier, which defines shared coordinate information across paired ends.
- --rc {head,tail,both}: Specify to reverse complement sequences before stitching.
- --1f <head_fields>: Specify annotation fields to copy from head records into the assembled record.
- --2f <tail_fields>: Specify annotation fields to copy from tail records into the assembled record.
- --gap <gap>: Number of N characters to place between ends.
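Joining by concatenation is simple enough to sketch in a few lines of Python. The example below is only an illustration of the idea described by the --rc and --gap options, not pRESTO's code. The function names are hypothetical, and the complement table covers only A, C, G, T and N.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse complement a nucleotide string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def join_pair(head, tail, gap=0, rc=None):
    """Concatenate head and tail with `gap` N characters between them.

    rc mirrors the choices of the --rc option: "head", "tail", "both",
    or None to leave both reads as they are.
    """
    if rc in ("head", "both"):
        head = reverse_complement(head)
    if rc in ("tail", "both"):
        tail = reverse_complement(tail)
    return head + "N" * gap + tail


print(join_pair("ACGT", "AAAC", gap=3, rc="tail"))
# ACGTNNNGTTT
```

Because no alignment is attempted, the result always has length len(head) + gap + len(tail), whether or not the two reads actually overlap.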
AssemblePairs reference
Assemble pairs by aligning reads against a reference database
usage: AssemblePairs [-h] [--version] ...
- -h, --help: show this help message and exit
- -1 <seq_files_1>: An ordered list of FASTA/FASTQ files containing head/primary sequences.
- -2 <seq_files_2>: An ordered list of FASTA/FASTQ files containing tail/secondary sequences.
- --fasta: Specify to force output as FASTA rather than FASTQ.
- --failed: If specified, create files containing records that fail processing.
- --log <log_file>: Specify to write verbose logging to a file. May not be specified with multiple input files.
- --delim <delimiter>: A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.
- --nproc <nproc>: The number of simultaneous computational processes to execute (CPU cores to utilize).
- --outdir <out_dir>: Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>: Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- --coord {illumina,solexa,sra,454,presto}: The format of the sequence identifier, which defines shared coordinate information across paired ends.
- --rc {head,tail,both}: Specify to reverse complement sequences before stitching.
- --1f <head_fields>: Specify annotation fields to copy from head records into the assembled record.
- --2f <tail_fields>: Specify annotation fields to copy from tail records into the assembled record.
- -r <ref_file>: A FASTA file containing the reference sequence database.
- --minident <min_ident>: Minimum identity of the assembled sequence required to call a valid assembly (between 0 and 1).
- --evalue <evalue>: E-value threshold for the ublast reference alignment of both the head and tail sequences.
- --maxhits <max_hits>: Maximum number of hits from ublast to check for matching head and tail sequence reference alignments.
- --fill: Specify to change the behavior of inserted characters when the head and tail sequences do not overlap. If specified, the V region reference sequence is inserted in the non-overlapping region instead of a sequence of Ns. Warning: this option can produce chimeric sequences.
- --exec <usearch_exec>: The path to the usearch executable file.
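The effect of --fill on non-overlapping reads can be shown with a toy Python sketch. It assumes the reference positions of both reads are already known (in AssemblePairs they come from the ublast alignments) and that the reads match the reference without indels. The function name and its parameters are hypothetical and do not reflect the tool's internals.

```python
def assemble_on_reference(head, tail, ref, head_end, tail_start, fill=False):
    """Stitch two non-overlapping reads using their reference positions.

    head_end   -- reference index just past the last aligned head base
    tail_start -- reference index of the first aligned tail base
    fill       -- if True, copy the reference into the gap instead of Ns
    """
    gap_len = tail_start - head_end
    if gap_len < 0:
        raise ValueError("reads overlap; this sketch covers only the gap case")
    gap = ref[head_end:tail_start] if fill else "N" * gap_len
    return head + gap + tail


ref = "AAACCCGGGTTT"
print(assemble_on_reference("AAAC", "GTTT", ref, head_end=4, tail_start=8))
# AAACNNNNGTTT
print(assemble_on_reference("AAAC", "GTTT", ref, head_end=4, tail_start=8, fill=True))
# AAACCCGGGTTT
```

The filled bases come from the reference rather than from the sequenced molecule, which is why the filled result can be chimeric when the true sequence differs from the reference in the gap.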