pRESTO - The REpertoire Sequencing TOolkit
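pRESTO is a toolkit for processing raw reads from high-throughput sequencing of lymphocyte repertoires.
Dramatic improvements in high-throughput sequencing now enable large-scale characterization of lymphocyte repertoires, defined as the collection of trans-membrane antigen-receptor proteins located on the surface of T and B lymphocytes. The REpertoire Sequencing TOolkit (pRESTO) is composed of a suite of utilities that handle all stages of sequence processing prior to alignment against germline reference sequences. pRESTO can process both single-end and paired-end reads. It includes the following features: quality control, primer masking, annotation of reads with sequence-embedded barcodes, generation of unique molecular identifier (UMI) consensus sequences, assembly of paired-end reads, and identification of duplicate sequences. Numerous options for sequence sorting, sampling, and conversion operations are also included.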
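To make the idea of a UMI consensus sequence concrete, the following is a minimal sketch in plain Python. It is not pRESTO's implementation: the function name `umi_consensus`, the input format of `(umi, sequence)` pairs, and the simple per-position majority vote are all assumptions chosen for illustration. It also assumes that reads sharing a UMI are already aligned to the same start position.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Return a majority-vote consensus sequence for each UMI.

    `reads` is an iterable of (umi, sequence) pairs. This input format
    is hypothetical and chosen only for illustration.
    """
    # Collect all sequences that carry the same UMI.
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        # Assumption: reads in a group start at the same position, so
        # column i of every read refers to the same base.
        length = max(len(s) for s in seqs)
        bases = []
        for i in range(length):
            # Count the bases observed at position i, skipping reads
            # that are too short to cover it.
            column = Counter(s[i] for s in seqs if i < len(s))
            bases.append(column.most_common(1)[0][0])
        consensus[umi] = "".join(bases)
    return consensus


reads = [
    ("AACG", "ACGTAC"),
    ("AACG", "ACGTAC"),
    ("AACG", "ACCTAC"),  # one sequencing error at position 2
    ("TTGA", "GGATCC"),
]
result = umi_consensus(reads)
print(result)
```

In this toy input, the single mismatching base in the third read is outvoted by the other two reads with the same UMI. Real data would also call for quality-aware voting and handling of ties, which this sketch leaves out.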
Getting Started
Examples
- Workflows
- Fixing UMI Problems
- Miscellaneous Tasks
- Importing data from SRA, ENA or GenBank into pRESTO
- Reducing file size for submission to IMGT/HighV-QUEST
- Subsetting sequence files by annotation
- Random sampling from sequence files
- Cleaning or removing poor quality sequences
- Assembling paired-end reads that do not overlap
- Assigning isotype annotations from the constant region sequence
Usage Documentation
About