Paper intensive reading (32): SAMSA2: a standalone metatranscriptome analysis pipeline

Title: SAMSA2: a standalone metatranscriptome analysis pipeline

Authors: Samuel T. Westreich, Michelle L. Treiber, David A. Mills, Ian Korf and Danielle G. Lemay

Published: 21 May 2018

University of California

Abstract

Background
Complex microbial communities are an area of growing interest in biology. Metatranscriptomics allows researchers to quantify microbial gene expression in an environmental sample via high-throughput sequencing. Metatranscriptomic experiments are computationally intensive because the experiments generate a large volume of sequence data and each sequence must be compared with reference sequences from thousands of organisms.
Results
SAMSA2 is an upgrade to the original Simple Annotation of Metatranscriptomes by Sequence Analysis (SAMSA) pipeline that has been redesigned for standalone use on a supercomputing cluster. SAMSA2 is faster due to the use of the DIAMOND aligner, and more flexible and reproducible because it uses local databases. SAMSA2 is available with detailed documentation, and example input and output files along with examples of master scripts for full pipeline execution.
Conclusions
SAMSA2 is a rapid and efficient metatranscriptome pipeline for analyzing large RNA-seq datasets in a supercomputing cluster environment. SAMSA2 provides simplified output that can be examined directly or used for further analyses, and its reference databases may be upgraded, altered or customized to fit the needs of any experiment.
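To make the idea of "simplified output" concrete, here is a minimal sketch of how aligner hits could be collapsed into a per-reference count table. It is not SAMSA2's own aggregation code. It assumes the aligner results have been exported in BLAST-style tabular form (tab-separated, read ID in the first column, reference ID in the second) and that the first hit listed for a read is its best hit. The function names `count_best_hits` and `format_summary` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def count_best_hits(tabular_lines: Iterable[str]) -> Counter:
    """Count reads per reference ID, keeping only the first hit listed per read.

    Assumption: input is BLAST-style tabular text, where column 1 is the
    read (query) ID and column 2 is the reference (subject) ID.
    """
    seen_reads = set()
    counts: Counter = Counter()
    for line in tabular_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip malformed rows
        read_id, ref_id = fields[0], fields[1]
        if read_id in seen_reads:
            continue  # a hit for this read was already counted
        seen_reads.add(read_id)
        counts[ref_id] += 1
    return counts


def format_summary(counts: Counter) -> List[str]:
    """Return 'percent<TAB>count<TAB>reference' rows, most abundant first."""
    total = sum(counts.values())
    return [
        f"{100 * n / total:.2f}\t{n}\t{ref_id}"
        for ref_id, n in counts.most_common()
    ]


if __name__ == "__main__":
    # Hypothetical toy input: three reads, one of which has two hits.
    example = [
        "read1\trefA\t99.0",
        "read1\trefB\t90.0",
        "read2\trefA\t95.0",
        "read3\trefC\t88.0",
    ]
    for row in format_summary(count_best_hits(example)):
        print(row)
```

A table of this shape (percentage, count, reference name) can be read directly or passed to a statistics package for differential-expression analysis.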

Outline

1. Background

2. Implementation

2.1 Recommended sequencing parameters

2.2 Tool dependencies and version control

2.3 Preprocessing

2.4 Annotation

2.5 Aggregation and downstream processing

3. Results

3.1 Improved speed and accuracy for metatranscriptome analysis

3.2 Ability to customize index database

3.3 Functional annotations and organism annotations for each input read

3.4 Sorting of functions into hierarchical categories using SEED Subsystems

3.5 Subdividing metatranscriptome data to obtain functional activity by specific organism

4. Discussion

5. Conclusion

Excerpts from the main text

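1. Background
    Introduces metagenomics and metatranscriptomics, and points out that few tools are currently available for metatranscriptome analysis.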
    researchers are adopting more comprehensive sequencing methods such as metagenomics and metatranscriptomics. Metagenomics—sequencing of all DNA from a diverse sample—reveals which microbes are present. Metatranscriptomics—sequencing of all RNA from a diverse sample—captures all gene expression, giving a view of which microbes are active and what they are doing.

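4. Discussion
    Shortcomings of current methods for studying complex microbial environments, and the advantages of the metatranscriptomic approach.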
    The study of complex microbial environments, which may contain many different interacting organisms, requires large amounts of data to fully understand. Current approaches, such as 16S rRNA sequencing, can provide a broad overview of which groups of microbes are present in an environment, but they fail to offer enough resolution to differentiate between closely related genera or species, and they provide no information about potential activities being performed by members of the microbiome. Although metatranscriptomics generally requires higher initial sample quality and has higher costs in sequencing and processing, it offers detailed insights into both the organisms present and their current transcriptional activity.
