Literature reading: Cultivation and sequencing of rumen microbiome members from the Hungate1000 Collection

Journal

Nature Biotechnology (68.164/Q1)

Abstract

Productivity of ruminant livestock depends on the rumen microbiota, which ferment indigestible plant polysaccharides into nutrients used for growth. Understanding the functions carried out by the rumen microbiota is important for reducing greenhouse gas production by ruminants and for developing biofuels from lignocellulose. We present 410 cultured bacteria and archaea, together with their reference genomes, representing every cultivated rumen-associated archaeal and bacterial family. We evaluate polysaccharide degradation, short-chain fatty acid production and methanogenesis pathways, and assign specific taxa to functions. A total of 336 organisms were present in available rumen metagenomic data sets, and 134 were present in human gut microbiome data sets. Comparison with the human microbiome revealed rumen-specific enrichment for genes encoding de novo synthesis of vitamin B12, ongoing evolution by gene loss and potential vertical inheritance of the rumen microbiome based on underrepresentation of markers of environmental stress. We estimate that our Hungate genome resource represents ~75% of the genus-level bacterial and archaeal taxa present in the rumen.

反刍家畜的生产力取决于瘤胃微生物群，它们将不可消化的植物多糖发酵成用于生长的营养物质。了解瘤胃微生物群所执行的功能对于减少反刍动物的温室气体生产和开发木质纤维素的生物燃料非常重要。我们提出了410个培养的细菌和古细菌，以及它们的参考基因组，代表了每一个培养的瘤胃相关的古细菌和细菌家族。我们评估了多糖降解、短链脂肪酸生产和甲烷生成途径，并将特定的分类群分配到功能上。共有336种生物存在于现有的瘤胃宏基因组数据集中，134种生物存在于人类肠道微生物组数据集中。与人类微生物组的比较显示，编码维生素B12重新合成的基因在瘤胃特异性富集，基因丢失导致的持续进化，以及基于环境胁迫标记物表达不足的瘤胃微生物组的潜在垂直遗传。我们估计，我们的Hungate基因组资源代表瘤胃中约75%的属级细菌和古菌类群。

Introduction

Climate change and feeding a growing global population are the two biggest challenges facing agriculture 1. Ruminant livestock have an important role in food security 2; they convert low-value lignocellulosic plant material into high-value animal proteins that include milk, meat and fiber products. Microorganisms present in the rumen 3,4 ferment polysaccharides to yield short-chain fatty acids (SCFAs; acetate, butyrate and propionate) that are absorbed across the rumen epithelium and used by the ruminant for maintenance and growth. The rumen represents one of the most rapid and efficient lignocellulose depolymerization and utilization systems known, and is a promising source of enzymes for application in lignocellulose-based biofuel production 5. Enteric fermentation in ruminants is also the single largest anthropogenic source of methane (CH4) 6, and each year these animals release ~125 million tonnes of CH4 into the atmosphere. Targets to reduce agricultural carbon emissions have been proposed 7, with >100 countries pledging to reduce agricultural greenhouse gas emissions in the 2015 Paris Agreement of the United Nations Framework Convention on Climate Change. Consequently, improved knowledge of the flow of carbon through the rumen by lignocellulose degradation and fermentation to SCFAs and CH4 is relevant to food security, sustainability and greenhouse gas emissions.

气候变化和养活不断增长的全球人口是农业面临的两个最大挑战 1。反刍牲畜在粮食安全方面具有重要作用 2；它们将低价值的木质纤维素植物材料转化为高价值的动物蛋白，包括牛奶、肉类和纤维产品。存在于瘤胃中的微生物发酵多糖，产生短链脂肪酸（SCFAs；乙酸盐、丁酸盐和丙酸盐），通过瘤胃上皮吸收，被反刍动物用于维持和生长 3,4。瘤胃是已知的最快速和最有效的木质纤维素解聚和利用系统之一，是应用于以木质纤维素为基础的生物燃料生产的酶的有希望的来源 5。反刍动物的肠道发酵也是甲烷（CH4）的最大人为来源 6，每年这些动物向大气中释放约1.25亿吨的CH4。已经提出了减少农业碳排放的目标 7，超过100个国家在《联合国气候变化框架公约》的2015年巴黎协议中承诺减少农业温室气体排放。因此，提高对木质纤维素降解和发酵成SCFAs和CH4的碳在瘤胃中流动的认识，与粮食安全、可持续性和温室气体排放相关。

Understanding the functions of the rumen microbiome is crucial to the development of technologies and practices that support efficient global food production from ruminants while minimizing greenhouse gas emissions. The Rumen Microbial Genomics Network (http://www.rmgnetwork.org/) was launched under the auspices of the Livestock Research Group of the Global Research Alliance (http://globalresearchalliance.org/research/livestock/) to further this understanding, with the generation of a reference microbial genome catalog—the Hungate1000 project—as a primary collaborative objective. Although the microbial ecology of the rumen has long been the focus of research8,9, at the beginning of the project reference genomes were available for only 14 bacteria and one methanogen, so that genomic diversity was largely unexplored.

了解瘤胃微生物组的功能对开发技术和实践至关重要，这些技术和实践支持全球反刍动物的高效粮食生产，同时尽量减少温室气体排放。瘤胃微生物基因组学网络（http://www.rmgnetwork.org/）是在全球研究联盟（http://globalresearchalliance.org/research/livestock/）家畜研究组的支持下发起的，以进一步了解这一问题，其主要合作目标是生成一个参考微生物基因组目录--Hungate1000项目。尽管瘤胃的微生物生态学长期以来一直是研究的重点8,9，但在项目开始时，只有14种细菌和一种甲烷菌的参考基因组可用，因此基因组的多样性在很大程度上没有得到探索。

The Hungate1000 project was initiated as a community resource in 2012, and the collection assembled includes virtually all the bacterial and archaeal species that have been cultivated from the rumens of a diverse group of animals10. We surveyed members of the Rumen Microbial Genomics Network and requested they provide cultures of interest. We supplemented these with additional cultures purchased from culture collections to generate the most comprehensive collection possible. These cultures are available to researchers, and we envisage that additional organisms will have their genome sequences included as more rumen microbes are able to be cultivated.

Hungate1000项目是在2012年作为社区资源启动的，收集的资料包括几乎所有从不同动物的瘤胃中培养出来的细菌和古细菌物种10。我们调查了瘤胃微生物基因组学网络的成员，要求他们提供感兴趣的培养物。我们用从培养物中购买的额外培养物来补充这些培养物，以产生最全面的收集。这些培养物可供研究人员使用，我们设想，随着更多的瘤胃微生物能够被培养，更多的生物体将被纳入其基因组序列。

Large-scale reference genome catalogs, including the Human Microbiome Project (HMP)11 and the Genomic Encyclopedia of Bacteria and Archaea (GEBA)12 have helped to improve our understanding of microbiome functions, diversity and interactions with the host. The success of these efforts has resulted in calls for continued development of high-quality reference genome catalogs13,14, and led to a resurgence in efforts to cultivate microorganisms15–17. This high-quality reference genome catalog for rumen bacteria and archaea increases our understanding of rumen functions by revealing degradative and physiological capabilities, and identifying potential rumen-specific adaptations.

大规模的参考基因组目录，包括人类微生物组计划（HMP）11和细菌和古细菌基因组百科全书（GEBA）12，有助于提高我们对微生物组功能、多样性和与宿主相互作用的理解。这些努力的成功使得人们呼吁继续开发高质量的参考基因组目录13,14，并导致了培养微生物的努力的重新兴起15-17。这个高质量的瘤胃细菌和古细菌参考基因组目录通过揭示降解和生理能力，以及识别潜在的瘤胃特异性适应，增加了我们对瘤胃功能的理解。

Methods

Cultures used in this study.

本研究中使用的培养物。

The full list of cultures used in the project and their provenance is shown in Supplementary Table 1 with additional information available in Supplementary Note 1. New Zealand bacterial cultures from the Hungate Collection are available from the AgResearch culture collection while other cultures should be obtained from the relevant culture collections or requested from the sources shown in Supplementary Table 1.

本项目使用的全部培养物清单及其出处见补充表1，其他信息见补充说明1。来自Hungate藏品的新西兰细菌培养物可从AgResearch的培养物库中获得，而其他培养物应从相关的培养物库中获得，或从补充表1中所示的来源索取。

Genomic DNA isolation.

基因组DNA的分离。

Genomic DNA was extracted using the Qiagen Genomic-tip kit following the manufacturer's instructions for the 500/G size extraction. Purified DNA was subject to partial 16S rRNA gene sequencing to confirm strain identity, before being shipped to the DOE Joint Genome Institute (JGI), USA for sequencing.

使用Qiagen Genomic-tip试剂盒按照制造商的说明提取500/G大小的基因组DNA。纯化的DNA进行部分16S rRNA基因测序以确认菌株身份，然后运往美国能源部联合基因组研究所（JGI）进行测序。

Sequence, assembly and annotation.

序列、组装和注释。

All Hungate genomes were sequenced at the DOE Joint Genome Institute (JGI) using Illumina technology 56 or Pacific Biosciences (PacBio) RS technology 57. For all genomes, we either constructed and sequenced an Illumina short-insert paired-end library with an average insert size of 270 bp, or a PacBio SMRTbell library. Genomes were assembled using Velvet 58, ALLPATHS 59 or Hierarchical Genome Assembly Process (HGAP) 60 assembly methods (specifics provided in Supplementary Table 2). Genomes were annotated by the DOE–JGI genome annotation pipeline 61,62. Briefly, protein-coding genes (CDSs) were identified using Prodigal 63 followed by a round of automated and manual curation using the JGI GenePrimp pipeline 64. Functional annotation and additional analyses were performed within the Integrated Microbial Genomes (IMG-ER) platform 32. All data as well as detailed sequencing and assembly reports can be downloaded from https://genome.jgi.doe.gov/portal/pages/dynamicOrganismDownload.jsf?organism=HungateCollection.

所有Hungate基因组都是在DOE联合基因组研究所（JGI）使用Illumina技术56或Pacific Biosciences（PacBio）RS技术57进行测序。对于所有的基因组，我们要么构建并测序了平均插入大小为270bp的Illumina短插入配对端文库，要么构建了Pacbio SMRTbell文库。基因组用Velvet 58、ALLPATHS 59或Hierarchical Genome Assembly Process (HGAP) 60的组装方法进行组装（具体情况见补充表2）。基因组由DOE-JGI基因组注释管道61,62进行注释。简而言之，使用Prodigal 63识别蛋白质编码基因（CDS），然后使用JGI GenePrimp管道64进行一轮自动和手动整理。功能注释和额外的分析是在综合微生物基因组（IMG-ER）平台上进行的32。所有的数据以及详细的测序和组装报告都可以从https://genome.jgi.doe.gov/portal/pages/dynamicOrganismDownload.jsf?organism=HungateCollection。

Hungate Collection and the Global Rumen Census analysis.

Hungate收集和全球瘤胃普查分析。

We used the 16S rRNA gene sequences generated from the Global Rumen Census (GRC) 22 to map the phylogenetic positions of the Hungate Collection genomes onto the known global distribution of Bacteria and Archaea from the rumen. Ten thousand predicted OTUs were randomly chosen from the total 673,507 OTUs identified in that study in order to construct a phylogenetic tree. The 16S rRNA gene sequences for Hungate Collection genomes were added to the GRC subsample, and all Bacteria and Archaea were checked for chimeras and to ensure they represented separate OTUs using CDHIT-OTU 65 (with a 97% identity cutoff) and the Ribosomal Database Project (RDP) 66, followed by visual inspection with JalView 67. Taxonomic classifications were taken from those predicted by the GRC study. A maximum likelihood tree was then constructed separately for the Bacteria and Archaea using two rounds of FastTree (version 2.1.7) 68: the first round built a maximum likelihood tree using the GTR model of evolution (options: -gtr -nt); the second round optimized the branch lengths for the resulting topology (options: -gtr -nt -nome -mllen). The resulting phylogenetic trees were visualized using iTOL 55 with the mapped positions of the Hungate genomes.

我们使用全球瘤胃普查（GRC）22产生的16S rRNA基因序列，将Hungate Collection基因组的系统发育位置映射到已知的瘤胃细菌和古细菌的全球分布上。从该研究确定的总共673,507个OTU中随机选择一万个预测的OTU，以构建系统发育树。Hungate Collection基因组的16S rRNA基因序列被添加到GRC子样本中，并使用CDHIT-OTU 65（身份为0.97%的核糖体数据库项目（RDP）66，然后用JalView67进行目测，检查所有细菌和古细菌的嵌合体，以确保它们代表独立的OTU。分类法取自GRC研究预测的分类法。然后用两轮Fasttree（2.1.7版）68为细菌和古细菌分别构建最大似然树：第一轮使用GTR进化模型和（选项：-gtr -nt）构建最大似然树；第二轮为所得到的拓扑结构优化分支长度（选项：-gtr -nt -nome -mllen）。使用iTOL 55将得到的系统发育树与Hungate基因组的映射位置进行了可视化。
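The subsampling and two-round tree building described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the fixed random seed, the executable name `FastTree`, and the use of `-intree` to pass the first-round topology into the second round are not given in the paper. Only the option strings `-gtr -nt` and `-gtr -nt -nome -mllen` come from the text.

```python
import random


def subsample_otus(otu_ids, n=10_000, seed=0):
    """Draw n OTU identifiers at random without replacement.

    The seed is an assumption added for reproducibility; the paper only
    says the OTUs were 'randomly chosen'.
    """
    rng = random.Random(seed)
    return rng.sample(otu_ids, n)


def fasttree_commands(alignment_path, first_tree_path):
    """Build argument lists for the two FastTree rounds (not executed here).

    Round 1 infers a tree under the GTR nucleotide model.
    Round 2 re-optimizes branch lengths on the fixed round-1 topology;
    passing that topology with -intree is an assumption of this sketch.
    """
    round1 = ["FastTree", "-gtr", "-nt", alignment_path]
    round2 = ["FastTree", "-gtr", "-nt", "-nome", "-mllen",
              "-intree", first_tree_path, alignment_path]
    return round1, round2


if __name__ == "__main__":
    all_otus = [f"OTU{i}" for i in range(673_507)]
    picked = subsample_otus(all_otus)
    print(len(picked))
    print(fasttree_commands("grc_plus_hungate.aln", "round1.tree"))
```

The command lists could be handed to `subprocess.run` with standard output redirected to a tree file; they are left unexecuted here so the sketch does not depend on FastTree being installed.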

Carbohydrate-active enzymes (CAZymes).

碳水化合物活性酶（CAZymes）

For each of the 501 genomes, the protein sequences were subjected to parallel (i) BLAST queries against CAZy libraries, of both complete sequences and individual modules; and (ii) HMMER searches using CAZy libraries of module family and subfamilies. Family assignments and overall CAZyme modularity were further validated through a human curation step, when proteins were not fully aligned (without gaps) with >50% identity to CAZy records.

对于501个基因组中的每一个，蛋白质序列都进行了平行的（i）针对CAZy库的BLAST查询，包括完整的序列和单个模块；以及（ii）使用CAZy库的模块家族和亚家族的HMMER搜索。当蛋白质与CAZy记录不完全吻合（无间隙），且与CAZy记录的同一性大于50%时，家族分配和整体CAZyme模块性通过人类策展步骤得到进一步验证。

Conserved single-copy gene phylogeny.

保守的单拷贝基因的系统发育。

A set of 56 universally conserved single-copy proteins in bacteria and archaea 69 was used for construction of the Butyrivibrio phylogenetic tree. Marker genes were detected and aligned using hmmsearch and hmmalign included in HMMER3 (ref. 70) using HMM profiles obtained from Phylosift 71. Alignments were concatenated and filtered. A phylogenetic tree was inferred using the maximum likelihood methods with RAxML (version 7.6.3). Tree topologies were tested for robustness using 100 bootstrap replicates and the standard LG model. Trees were visualized using FastTree followed by iTOL 55.

一组在细菌和古细菌中普遍保守的56个单拷贝蛋白69被用于构建Butyrivibrio的系统发育树。使用HMMER3(参考文献70)中的hmmsearch和hmmalign，利用从Phylosift获得的HMM图谱71，检测并排列标记基因。将排列组合在一起并进行过滤。使用RAxML（7.6.3版）的最大似然法推断出系统发育树。使用100次自举重复和标准的LG模型对树的拓扑结构进行了稳健性测试。使用FastTree和iTOL 55对树进行了可视化。

Prediction of biosynthetic clusters.

预测生物合成集群。

Putative biosynthetic clusters (BCs) were predicted and annotated using AntiSMASH version 3.0.4 (ref. 72) with the “inclusive” and the “borderpredict” options. All other options were left as default.

CRISPR–CAS system analysis.

CRISPR-CAS系统分析。

A modified version of the CRISPR Recognition Tool (CRT) algorithm 61, with annotations from the Integrated Microbial Genomes with Metagenomes (IMG/M) system 32 was used to validate the functionality of the CRISPR–Cas types (only complete cas gene arrangements were used plus those cas 'orphan' arrays with the same repeat from a complete array within the same genome). This Hungate spacer collection was queried against the viral database from the Integrated Microbial Genome system (IMG/VR database) 73, a custom global “spacerome” (predicted from all IMG isolate and metagenome data sets) and the NCBI RefSeq plasmid database. All spacer searches were performed using the BLASTn-short function from the BLAST+ package 74 with parameters: e-value threshold of 1.0 × 10⁻⁶, percentage identity of >94% and coverage of >95%. These cutoffs were recommended by a recent study benchmarking the accuracy of spacer hits across a range of % identities and coverage 75.

使用Crispr识别工具（CRT）算法61的修改版，以及来自集成微生物基因组与宏基因组（IMG/M）系统32的注释，来验证CRISPR-Cas类型的功能（只使用完整的cas基因排列，加上那些cas "孤儿 "阵列与同一基因组内完整阵列的相同重复）。这个Hungate间隔物集合被查询了来自综合微生物基因组系统的病毒数据库（IMG/VR数据库）73，一个定制的全球 "间隔物"（从所有IMG分离物和宏基因组数据集预测）和NBCI refseq质粒数据库。所有的间隔体搜索都是使用BLAST+软件包74中的BLASTn-short功能进行的，参数为：e值阈值为1.0×10-6，同一性百分比为>94%，覆盖率为>95%。这些临界值是由最近的一项研究推荐的，该研究对在一定范围内的同一性和覆盖率的间隔体命中的准确性进行了基准测试75。
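As a rough illustration of the spacer-hit cutoffs quoted above, the sketch below filters already-parsed hits. The `SpacerHit` record and its field names are hypothetical, and defining coverage as alignment length divided by spacer length is an assumption; the paper states only the three thresholds.

```python
from dataclasses import dataclass


@dataclass
class SpacerHit:
    """Hypothetical record for one spacer-versus-target alignment."""
    spacer_id: str
    target_id: str
    percent_identity: float
    alignment_length: int
    spacer_length: int
    evalue: float


def passes_cutoffs(hit, max_evalue=1e-6, min_identity=94.0, min_coverage=95.0):
    """Apply the e-value, identity (>94%) and coverage (>95%) cutoffs."""
    # Coverage is assumed to be the aligned fraction of the spacer, in percent.
    coverage = 100.0 * hit.alignment_length / hit.spacer_length
    return (hit.evalue <= max_evalue
            and hit.percent_identity > min_identity
            and coverage > min_coverage)


if __name__ == "__main__":
    hits = [
        SpacerHit("sp1", "phageA", 100.0, 30, 30, 1e-9),
        SpacerHit("sp2", "phageB", 96.0, 28, 30, 1e-7),  # coverage ~93.3%
    ]
    print([h.spacer_id for h in hits if passes_cutoffs(h)])
```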

Recruitment of metagenomic sequences.

招募宏基因组序列。

1,468,357 protein coding sequences or CDS from 501 Hungate isolate genomes were searched using LAST 76 against ∼1.9 billion CDS predicted from 8,200 metagenomic samples stored in the IMG database. Hungate genomes were designated as “recruiters” if the following criteria were met: a minimum of 200 CDS with hits at ≥ 90% amino acid identity over 70% alignment lengths to an individual metagenomic CDS or ≥ 10% capture of total CDS in each genome. The rationale for choosing the minimum 200 hit count was to ensure that the evidence included more than merely housekeeping genes (which tend to be more highly conserved). In a few instances, the 200 CDS hit count requirement was relaxed if at least 10% of the total CDS in the genomes was captured. The 90% amino acid identity cutoff was chosen based on Luo et al. 77, who assert that organisms grouped at the 'species' level typically show >85% AAI among themselves. We ascertained that ≥ 90% identity was sufficiently discriminatory for species in the Hungate genome set by observing differences in the recruitment pattern (hit count or % CDS coverage) of different species of the same genus (e.g., Prevotella spp., Butyrivibrio spp., Bifidobacterium spp., Treponema spp.) from every phylum against the same metagenomic sample.

使用LAST 76对储存在IMG数据库中的8200个宏基因组样本中预测的19亿个CDS进行了搜索，从501个Hungate分离物基因组中找到了1,468,357个蛋白质编码序列或CDS。如果符合以下标准，Hungate基因组被指定为 "招募者"：至少有200个CDS的命中率≥90%，与单个宏基因组CDS的排列长度超过70%，或者每个基因组中的CDS总数≥10%。选择最小200个命中数的理由是为了确保证据不仅仅包括管家基因（往往是高度保守的）。在一些情况下，如果基因组中至少有10%的CDS被捕获，那么200个CDS的命中数要求就会被放宽。90%的氨基酸同一性是根据Luo等人77的观点选择的，他们认为在 "物种 "水平上分组的生物体之间通常显示出>85%的AAI。我们确定≥90%的同一性对Hungate基因组中的物种有足够的鉴别力，方法是观察来自各门的同一属的不同物种（如普雷沃特氏菌属、布氏杆菌属、双歧杆菌属、特雷贝氏菌属）对同一宏基因组样本的招募模式（命中数或CDS覆盖率%）的差异。
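The "recruiter" rule above reduces to a small predicate. The sketch below is one reading of it, assuming per-hit identity and aligned fraction have already been computed from the LAST output; the function names are hypothetical.

```python
def hit_qualifies(identity_pct, aligned_fraction):
    """A CDS hit counts if identity is >= 90% over >= 70% of the alignment length."""
    return identity_pct >= 90.0 and aligned_fraction >= 0.70


def is_recruiter(qualifying_hits, total_cds, min_hits=200, min_fraction=0.10):
    """A genome recruits from a metagenome if it has at least min_hits
    qualifying CDS hits, or if the hits capture at least min_fraction
    of the genome's total CDS."""
    if total_cds <= 0:
        raise ValueError("total_cds must be positive")
    return (qualifying_hits >= min_hits
            or qualifying_hits / total_cds >= min_fraction)


if __name__ == "__main__":
    # A small genome with 150 hits out of 1,400 CDS passes through the 10% route.
    print(is_recruiter(150, 1_400))
    # A larger genome with 150 hits out of 3,000 CDS does not.
    print(is_recruiter(150, 3_000))
```

Keeping the two routes in one `or` expression mirrors the text, where the 200-hit requirement is relaxed when at least 10% of the CDS are captured.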

For nucleotide read recruitment, total reads from an individual metagenome were aligned against scaffolds from each of the 501 isolates using the BWA aligner 78. The effective minimum nucleotide % identity was ∼75% with a minimum alignment length of 50 bp. Alignment results were examined in terms of total number of reads recruited to an isolate (at different % identity cutoffs with ≥ 97% identity proposed as a species-level recruitment), average read depth of total reads recruited to a given isolate genome, as well as % coverage of total nucleotide length of the genome.

对于核苷酸读数的招募，使用BWA对齐器78将单个宏基因组的总读数与501个分离物的支架进行对齐。有效的最小核苷酸%认同度为75%，最小对齐长度为50bp。对比结果是根据被招募到一个分离体的读数总数（在不同的认同度截止点，认同度≥97%被认为是物种级别的招募）、被招募到一个特定分离体基因组的总读数的平均读深度以及基因组总核苷酸长度的覆盖率来检查。

Genome comparisons.

基因组的比较。

For rumen versus human isolates comparisons, human intestinal isolate genomes were carefully selected from the IMG database using available GOLD metadata fields pertaining to isolation source (and taking care to remove known pathogens). Genome redundancies within either the human set or the rumen set were eliminated after assessing the average nucleotide identity (ANI) of total best bidirectional hits and removing genomes sharing >99% ANI (alignment fraction of total CDS ≥ 60%) to another genome within that set. Furthermore, low-quality genomes within the human set were flagged and removed based on the absence of the “high-quality” filter assigned by the IMG quality control pipeline owing to lack of phylum-level taxonomic assignment or if the coding density was <70% or >100% or the number of genes per million base pairs was <300 or >1,200 (ref. 61). This approach resulted in 388 genomes delineated in the human set and 458 genomes in the rumen set (lists provided in Supplementary Table 10). Both collections of genomes had similar average genome sizes (3.3–3.5 Mbp) and completeness (evaluated by CheckM19). Pairwise comparisons of gene counts for individual Pfams between members of each set were performed using Metastats 79, which employs a non-parametric two-sided t-test (or a Fisher's exact test for sparse counts) with false-discovery rate (FDR) error correction to identify differentially abundant features between the two genome sets. Most significant features were delineated using a q-value cutoff of <0.001, and less populous or sparsely recruited Pfams were also eliminated (where the sum of gene counts in each genome set was <100) (Supplementary Table 11, worksheet designated “Q-val<0.001_edited”). A second worksheet labeled “Q-val<0.005” shows a larger subset of differentially abundant Pfams applying the less stringent threshold of Q-value < 0.005, and including results for Pfams with sparse counts.
Pfam was chosen for this primary analysis because it is the largest and most widely used source of manually curated protein families, with nearly 80% coverage (on average) of total CDS in these microbial genomes. KO terms or TIGRFAMS were also assessed to validate and complement Pfam-based findings or to examine specific pathways more closely. For comparisons of enolase-positive versus enolase-negative Butyrivibrio spp. strains, Metastats 79 was employed in conjunction with contrasting upper and lower quartile or percentile gene counts, in order to identify additional functions with a similar pattern of preservation/loss as the glycolytic enolase gene.

对于瘤胃与人类分离物的比较，人类肠道分离物的基因组是从IMG数据库中利用与分离物来源有关的现有GOLD元数据字段（并注意去除已知的病原体）仔细选择的。在评估全部最佳双向命中的平均核苷酸同一性（ANI）并删除与该组中另一个基因组共享>99% ANI（总CDS的排列比例≥60%）的基因组后，消除了人类组或瘤胃组中的基因组冗余。此外，人类基因组中的低质量基因组被标记并删除，依据是IMG质量控制管道分配的 "高质量 "过滤器的缺失，原因是缺乏系统级的分类，或者编码密度<70%或>100%，或者每百万碱基对的基因数<300或>1,200（参考文献61）。这种方法的结果是在人类基因组中划定了388个基因组，在瘤胃基因组中划定了458个基因组（名单见补充表10）。两个基因组集合的平均基因组大小（3.3-3.5Mbp）和完整性（由CheckM19评估）相似。使用Metastats 79对每个基因组成员之间的单个Pfams的基因计数进行配对比较，Metastats 79采用非参数双侧t检验（或对稀疏计数进行费舍尔精确检验），并进行虚假发现率（FDR）误差修正，以确定两个基因组之间的不同丰度特征。使用<0.001的q值分界线划定最重要的特征，同时排除人口较少或招募较少的Pfams（在每个基因组中的基因计数之和<100）（补充表11，工作表指定为 "Q-val<0.001_编辑"）。第二张工作表标记为 "Q-val<0.005"，显示了一个更大的差异丰度的Pfams子集，应用较不严格的Q值<0.005的阈值，并包括具有稀疏计数的Pfams的结果。选择Pfam进行主要分析，是因为它是最大和最广泛使用的人工策划的蛋白质家族的来源，在这些微生物基因组中的总CDS覆盖率接近80%（平均）。还对KO术语或TIGRFAMS进行了评估，以验证和补充基于Pfam的发现，或更密切地检查特定的途径。对于烯醇酶阳性菌株与烯醇酶阴性菌株的比较，Metastats 79与上、下四分位数或百分位数的对比一起使用，以确定具有与糖酵解酶基因类似的保存/损失模式的其他功能。
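The genome quality filter and redundancy removal described above can be sketched as below. The numeric cutoffs are those quoted in the text. The greedy keep-the-first-seen strategy, the function names and the `pair_stats` layout are assumptions, since the paper does not say which member of a redundant pair was retained.

```python
def passes_quality(coding_density_pct, n_genes, genome_bp):
    """Keep a genome unless coding density is <70% or >100%, or the
    gene count per million base pairs is <300 or >1,200."""
    genes_per_mbp = n_genes / (genome_bp / 1_000_000)
    return (70.0 <= coding_density_pct <= 100.0
            and 300 <= genes_per_mbp <= 1_200)


def dereplicate(genome_ids, pair_stats, ani_cutoff=99.0, af_cutoff=0.60):
    """Greedy dereplication (an assumed strategy, not the paper's stated one).

    pair_stats maps frozenset({a, b}) -> (ANI in percent, alignment fraction).
    A genome is dropped when it shares >99% ANI at alignment fraction >= 0.60
    with a genome that has already been kept.
    """
    kept = []
    for genome in genome_ids:
        redundant = False
        for other in kept:
            ani, af = pair_stats.get(frozenset((genome, other)), (0.0, 0.0))
            if ani > ani_cutoff and af >= af_cutoff:
                redundant = True
                break
        if not redundant:
            kept.append(genome)
    return kept


if __name__ == "__main__":
    stats = {
        frozenset(("A", "B")): (99.5, 0.80),
        frozenset(("A", "C")): (95.0, 0.70),
        frozenset(("B", "C")): (95.2, 0.65),
    }
    print(dereplicate(["A", "B", "C"], stats))
    print(passes_quality(88.0, 3_000, 3_300_000))
```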

For metagenomes-based comparisons, previously published sheep rumen (IMG IDs: 3300021254, 300021255, 3300021256, 3300021387, 3300021399, 3300021400, 3300021426, 3300021431) and human intestinal (IMG IDs: 3300008260, 3300008496, 3300007299, 3300007296, 3300008272, 3300007361, 3300008551, 3300007305, 3300007717) metagenomes were reassembled using metaSPAdes80, annotated and loaded into IMG. Estimated gene copy numbers (calculated by multiplying gene count with read depth for the scaffold the gene resides on) were compared using Metastats (as described above).

为了进行基于宏基因组的比较，以前发表的绵羊瘤胃（IMG IDs: 3300021254, 300021255, 3300021256, 3300021387, 3300021399, 3300021400, 3300021426, 3300021431）和人类肠道（IMG IDs: 3300008260, 3300008496, 3300007299, 3300007296, 3300008272, 3300007361, 3300008551, 3300007305, 3300007717）的宏基因组使用metaSPAdes80进行重新组合，注释并加载到IMG。使用Metastats（如上所述）对估计的基因拷贝数（通过将基因计数与基因所在支架的读深度相乘计算）进行比较。
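The estimated gene copy number defined above (gene count multiplied by the read depth of the scaffold carrying the gene) amounts to summing scaffold depths per gene family. A minimal sketch, with hypothetical input structures:

```python
from collections import defaultdict


def estimated_copy_numbers(gene_records, scaffold_depth):
    """Sum, per gene family, the read depth of each scaffold a gene sits on.

    gene_records: iterable of (family_id, scaffold_id) pairs, one per gene.
    scaffold_depth: dict mapping scaffold_id -> average read depth.
    Both structures are illustrative; the paper does not describe a file layout.
    """
    totals = defaultdict(float)
    for family_id, scaffold_id in gene_records:
        totals[family_id] += scaffold_depth[scaffold_id]
    return dict(totals)


if __name__ == "__main__":
    genes = [("PF00001", "s1"), ("PF00001", "s1"), ("PF00002", "s2")]
    depth = {"s1": 10.0, "s2": 2.5}
    # Two PF00001 genes on a 10x scaffold contribute 20.0 in total.
    print(estimated_copy_numbers(genes, depth))
```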

Results

Reference rumen genomes

参考瘤胃基因组

Members of nine phyla, 48 families and 82 genera (Supplementary Table 1 and Supplementary Note 1) are present in the Hungate Collection. The organisms were chosen to make the coverage of cultivated rumen microbes as comprehensive as possible10. While multiple isolates were sequenced from some polysaccharide-degrading genera (Butyrivibrio, Prevotella and Ruminococcus), many species are represented by only one or a few isolates. 410 reference genomes were sequenced in this study, and were analyzed in combination with 91 publicly available genomes18. All Hungate1000 genomes were sequenced using Illumina or PacBio technology, and were assembled and annotated as summarized in the Online Methods. All genomes were assessed as high quality using CheckM19 with >99% completeness on average, and in accordance with proposed standards20. The genome statistics can be found in Supplementary Table 2.

在Hungate收藏品中，有9个门、48个科和82个属的成员（补充表1和补充说明1）。选择这些生物是为了使培养的瘤胃微生物的覆盖面尽可能的全面10。虽然从一些多糖降解属（Butyrivibrio、Prevotella和Ruminococcus）中测序了多个分离物，但许多物种仅由一个或几个分离物代表。在这项研究中，对410个参考基因组进行了测序，并与91个公开的基因组结合起来进行分析18。所有的Hungate1000基因组都是用Illumina或PacBio技术测序的，并按照在线方法中的总结进行了组装和注释。所有的基因组都用CheckM 19评估为高质量，平均完整度大于99%，并符合建议的标准20。基因组的统计数据可以在补充表2中找到。

The 501 sequenced organisms analyzed in this study are listed in Supplementary Table 1. We refer to these 501 genomes (480 bacteria and 21 archaea) as the Hungate genome catalog. Supplementary Table 3 provides a comprehensive chronological list of all publicly available completed rumen microbial genome sequencing projects, including anaerobic fungi and genomes that have been recovered from metagenomes but that were not included in our analyses.

本研究中分析的501个测序生物体列于补充表1。我们把这501个基因组（480个细菌和21个古细菌）称为Hungate基因组目录。补充表3提供了所有公开的已完成的瘤胃微生物基因组测序项目的综合时间列表，包括厌氧真菌和已经从宏基因组中恢复的基因组，但没有包括在我们的分析中。

Members of the Firmicutes and Bacteroidetes phyla predominate in the rumen21,22 and contribute most of the Hungate genome sequences (68% and 12.8%, respectively; Supplementary Fig. 1a), with the Lachnospiraceae family making up the largest single group (32.3%). Archaea are mainly from the Methanobrevibacter genus or are in the Methanomassiliicoccales order. The average genome size is ~3.3 Mb (Supplementary Fig. 1b), and the average G+C content is 44%. Most organisms were isolated directly from the rumen (86.6%), with the remainder isolated from feces or saliva. Most cultured organisms were from bovine (70.9%) or ovine (17.6%) hosts, but other ruminant or camelid species are also represented (Table 1).

瘤胃中以 Firmicutes （厚壁菌门）和 Bacteroidetes （拟杆菌门）门的成员为主21,22，贡献了大部分 Hungate 基因组序列（分别为 68% 和 12.8%；补充图 1a），其中 Lachnospiraceae （毛螺菌科）科构成了最大的单一群体（32.3%）。古细菌主要来自Methanobrevibacter属或属于Methanomassiliicoccales目。平均基因组大小为~3.3 Mb（补充图1b），平均G+C含量为44%。大多数生物体是直接从瘤胃中分离出来的（86.6%），其余的是从粪便或唾液中分离出来的。大多数培养的生物体来自牛（70.9%）或卵（17.6%）宿主，但也有其他反刍动物或骆驼类物种（表1）。

The Global Rumen Census project22 profiled the microbial communities of 742 rumen samples present in diverse ruminant species, and found that rumen communities largely comprised similar bacteria and archaea in the 684 samples that met the criteria for inclusion in the analysis. A core microbiome of seven abundant genus-level groups was defined for 67% of the Global Rumen Census sequences22. We overlaid 16S rRNA gene sequences from the 501 Hungate genomes onto the 16S rRNA gene amplicon data set from the Global Rumen Census project (Fig. 1). This revealed that our Hungate genomes represent ~75% of the genus-level taxa reported from the rumen.

全球瘤胃普查项目22对存在于不同反刍动物物种中的742个瘤胃样品的微生物群落进行了分析，发现在符合分析标准的684个样品中，瘤胃群落主要由类似的细菌和古细菌组成。在全球瘤胃普查序列22中，有67%定义了由七个丰富的属级群组成的核心微生物组。我们将501个Hungate基因组的16S rRNA基因序列重叠到全球Rumen Census项目的16S rRNA基因扩增子数据集上（图1）。这表明我们的Hungate基因组代表了从瘤胃中报告的属级分类群的约75%。

Figure 1 Microbial community composition data from the Global Rumen Census22 overlaid with the 16S rRNA gene sequences (yellow dots) from the 501 Hungate catalog genomes. Two groups of abundant but currently unclassified bacteria are indicated by blue (Bacteroidales, RC-9 gut group) and orange (Clostridiales, R-7 group) dots. The colored rings around the trees represent the taxonomic classifications of each OTU from the Ribosomal Database Project database (from the innermost to the outermost): genus, family, order, class and phylum. The strength of the color is indicative of the percentage similarity of the OTU to a sequence in the RDP database of that taxonomic level.

图1 来自全球瘤胃普查22的微生物群落组成数据与来自501个Hungate目录基因组的16S rRNA基因序列（黄点）相叠加。蓝色（类细菌，RC-9肠道组）和橙色（梭菌类，R-7组）的点表示两组丰富但目前未分类的细菌。树周围的色环代表核糖体数据库项目数据库中每个OTU的分类（从最里面到最外面）：属、科、目、类和门。颜色的强度表示该OTU与RDP数据库中该分类级别的序列的相似百分比。

Previous studies of the rumen microbiome have highlighted unclassified bacteria as being among the most abundant rumen microorganisms10,21, and we also report 73 genome sequences from strains that have yet to be taxonomically assigned to genera or phenotypically characterized (Supplementary Table 1). Most abundant among these uncharacterized strains are members of the order Bacteroidales (RC-9 gut group) and Clostridiales (R-7 group), and this abundance points to a key role for these strains in rumen fermentation22. The RC-9 gut group bacteria have small genomes (~2.3 Mb), and the closest named relatives (84% identity of the 16S rRNA gene) are members of the genus Alistipes, family Rikenellaceae. The R-7 group are most closely related to Christensenella minuta (86% identity of the 16S rRNA gene), family Christensenellaceae.

以前对瘤胃微生物组的研究强调，未分类的细菌是瘤胃中最丰富的微生物10,21，我们还报告了73个基因组序列，这些菌株尚未被分类到属或表型特征（补充表1）。在这些未定性的菌株中，最丰富的是类杆菌目（RC-9肠道组）和梭菌目（R-7组）的成员，这种丰富性表明这些菌株在瘤胃发酵中起着关键作用22。RC-9肠道组细菌的基因组很小（约2.3Mb），最接近的命名亲属（16S rRNA基因的84%相同）是Rikenellaceae（理研菌科）的Alistipes（另枝菌属）属的成员。R-7组与Christensenella minuta（小克里斯滕森氏菌）（16S rRNA基因的86%的一致性）关系最密切，Christensenellaceae（克里斯滕森菌科）科。

Functions of the rumen microbiome

瘤胃微生物组的功能

Polysaccharide degradation. Ruminants need efficient lignocellulose breakdown to satisfy their energy requirements, but ruminant genomes, in common with the human genome, encode very limited degradative enzyme capacity. Cattle have a single pancreatic amylase23, and several lysozymes24 which function as lytic digestive enzymes that can kill Gram-positive bacteria25.

多糖降解。反刍动物需要有效地分解木质纤维素以满足其能量需求，但反刍动物的基因组与人类的基因组一样，编码的降解酶能力非常有限。牛有单一的胰腺淀粉酶23，和几种溶菌酶24，其功能是催化消化酶，可以杀死革兰氏阳性细菌25。

We searched the CAZy database for each Hungate genome (http://www.cazy.org/)26 in order to characterize the spectrum of carbohydrate-active enzymes and binding proteins present (Supplementary Fig. 2 and Supplementary Table 4). In total, the Hungate genomes encode 32,755 degradative CAZymes (31,569 glycoside hydrolases and 1,186 polysaccharide lyases), representing 2.2% of the combined ORFeome. The largest and most diverse CAZyme repertoires (Fig. 2a) were found in isolates with large genomes including Bacteroides ovatus (over 320 glycoside hydrolases (GH) and polysaccharide lyases (PL) from ~60 distinct families), Lachnospiraceae bacterium NLAE-zl-G231 (296 GHs and PLs), Ruminoclostridium cellobioparum ATCC 15832 (184 GHs and PLs) and Cellvibrio sp. BR (158 GHs and PLs). The most prevalent CAZyme families are shown in Supplementary Figure 3. Bacteria that initiate the breakdown of plant fiber are predicted to be important in rumen microbial fermentation (Fig. 2b), including representatives of bacterial groups capable of degrading cellulose, hemicellulose (xylan/xyloglucan) and pectin (Fig. 2c).

我们搜索了每个Hungate基因组的CAZy数据库（http://www.cazy.org/）26，以确定碳水化合物活性酶和结合蛋白的光谱（补充图2和补充表4）。总的来说，Hungate基因组编码了32,755种降解性CAZymes（31,569种糖苷水解酶和1,186种多糖裂解酶），占综合ORFeome的2.2%。最大和最多样化的CAZyme repertoires（图2a）出现在具有大基因组的分离物中，包括Bacteroides ovatus（卵形拟杆菌）（来自约60个不同家族的超过320种糖苷水解酶（GH）和多糖裂解酶（PL））、Lachnospiraceae（毛螺菌科）NLAE-zl-G231细菌（296种GH和PL）、Ruminoclostridium cellobioparum ATCC 15832（184种GH和PL）以及Cellvibrio sp. BR（158个GHs和PLs）。最普遍的CAZyme家族显示在补充图3中。据预测，启动植物纤维分解的细菌在瘤胃微生物发酵中非常重要（图2b），包括能够降解纤维素、半纤维素（木聚糖/木聚糖）和果胶的细菌群的代表（图2c）。
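The headline CAZyme figures can be checked with simple arithmetic. The sketch below assumes that the "combined ORFeome" corresponds to the 1,468,357 protein-coding sequences quoted in the metagenomic recruitment methods; the paper does not state that equivalence explicitly.

```python
glycoside_hydrolases = 31_569
polysaccharide_lyases = 1_186
total_cds = 1_468_357  # assumed denominator: CDS count from the recruitment methods

degradative_cazymes = glycoside_hydrolases + polysaccharide_lyases
share_pct = 100 * degradative_cazymes / total_cds

print(degradative_cazymes, round(share_pct, 1))  # prints: 32755 2.2
```

Under that assumption the two subtotals add up to the reported 32,755 enzymes, and the share matches the reported 2.2%.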

Figure 2 Functions of the rumen microbiome. (a) Number of degradative CAZymes (GH, glycoside hydrolases and PL, polysaccharide lyases) in distinct families in each of the 501 Hungate catalog genomes. Genomes are colored by phylum. (b) Simplified illustration showing the degradation and metabolism of plant structural carbohydrates by the dominant bacterial and archaeal groups identified in the Global Rumen Census project22 using information from metabolic studies and analysis of the reference genomes. The abundance and prevalence data shown in the table are taken from the Global Rumen Census project22. Abundance represents the mean relative abundance (%) for that genus-level group in samples that contain that group, while prevalence represents the prevalence of that genus-level group in all samples (n = 684).* The conversion of choline to trimethylamine, and propanediol to propionate generate toxic intermediates that are contained within bacterial microcompartments (BMC). Cultures from the reference genome set that encode the genes required to produce the structural proteins required for BMC formation are shown in Supplementary Table 5. (c) Number of polysaccharide-degrading CAZymes encoded in the genomes of representatives from the eight most abundant bacterial groups. Cellulose: GH5, GH9, GH44, GH45, GH48; pectin: GH28, PL1, PL9, PL10, PL11, CE8, CE12; xylan: GH8, GH10, GH11, GH43, GH51, GH67, GH115, GH120, GH127, CE1, CE2.

图2 瘤胃微生物组的功能。(a) 在501个Hungate目录基因组中，不同家族的降解性CAZymes（GH，糖苷水解酶和PL，多糖裂解酶）的数量。基因组按门类着色。(b) 简化的图示显示了全球瘤胃普查项目22中确定的主要细菌和古细菌群对植物结构碳水化合物的降解和代谢，这些细菌和古细菌群使用来自代谢研究和参考基因组分析的信息。表中显示的丰度和流行率数据取自全球瘤胃普查项目22。丰度代表该属级组在含有该组的样品中的平均相对丰度（%），而流行率代表该属级组在所有样品中的流行率（n = 684）。来自参考基因组集的培养物，其编码的基因编码BMC形成所需的结构蛋白的参考基因组培养物显示在补充表5中。(c) 八个最丰富的细菌组的代表的基因组中编码的多糖降解CAZymes的数量。纤维素。GH5, GH9, GH44, GH45, GH48；果胶。GH28, PL1, PL9, PL10, PL11, CE8, CE12；木质素。GH8, GH10, GH11, GH43, GH51, GH67, GH115, GH120, GH127, CE1, CE2。
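The family-to-substrate grouping listed for panel (c) can be written as a lookup table and used to tally a genome's CAZyme counts by substrate. The family lists are taken from the caption; the function name and the example counts are invented for illustration.

```python
SUBSTRATE_FAMILIES = {
    "cellulose": {"GH5", "GH9", "GH44", "GH45", "GH48"},
    "pectin": {"GH28", "PL1", "PL9", "PL10", "PL11", "CE8", "CE12"},
    "xylan": {"GH8", "GH10", "GH11", "GH43", "GH51", "GH67",
              "GH115", "GH120", "GH127", "CE1", "CE2"},
}


def tally_by_substrate(family_counts):
    """Sum per-family gene counts into the three substrate classes.

    family_counts: dict mapping a CAZy family name to its gene count in one
    genome. Families outside the caption's lists are ignored.
    """
    totals = {substrate: 0 for substrate in SUBSTRATE_FAMILIES}
    for family, count in family_counts.items():
        for substrate, families in SUBSTRATE_FAMILIES.items():
            if family in families:
                totals[substrate] += count
    return totals


if __name__ == "__main__":
    # Made-up counts for a single genome, for illustration only.
    example = {"GH5": 4, "GH9": 2, "GH43": 7, "CE1": 1, "GH28": 3, "GH13": 9}
    print(tally_by_substrate(example))
```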

Examination of the CAZyme profiles (Supplementary Fig. 3) highlights the degradation strategies used by different taxa present in our collection. Members of the phylum Bacteroidetes have evolved polysaccharide utilization loci (PULs), genomic regions that encode all required components for the binding, transport and depolymerization of specific glycan structures. Predictions of PUL organization in all 64 Bacteroidetes genomes from the Hungate catalog have been integrated into the dedicated PULDB database27. The pectin component rhamnogalacturonan II (RG-II) is the most structurally complex plant polysaccharide, and all the CAZymes required for its degradation occur in a single large PUL recently identified in Bacteroides thetaiotaomicron28. Similar PULs encoding all necessary enzymes were also found in rumen isolates belonging to three different families within the phylum Bacteroidetes (Supplementary Fig. 2 and Supplementary Fig. 4). Another feature of the Bacteroidetes genomes and PULs is the prevalence of GH families dedicated to the breakdown of animal glycans (Supplementary Figure 2). Host glycans are not thought to be used as a carbohydrate source for rumen bacteria, and most of the genomes with extensive repertoires of these enzymes (Bacteroides spp.) were from species that were isolated from feces. However, ruminants secrete copious saliva and the presence of animal glycan-degrading enzymes in rumen Prevotella spp. may enable them to utilize salivary N-linked glycoproteins29, and help explain their abundance in the rumen microbiome22.

对CAZyme概况的检查（补充图3）突出了我们收集的不同分类群所使用的降解策略。类杆菌门的成员已经进化出多糖利用位点（PULs），这些基因组区域编码结合、运输和解聚特定糖类结构的所有必要成分。来自Hungate目录的所有64个类杆菌基因组中的PUL组织的预测已被整合到专门的PULDB数据库中27。果胶成分鼠李半乳糖醛酸II（RG-II）是结构最复杂的植物多糖，其降解所需的所有CAZymes都出现在最近在Bacteroides thetaiotaomicron（多型拟杆菌） 28中发现的一个大型PUL中。在属于类杆菌门的三个不同家族的瘤胃分离物中也发现了编码所有必要酶的类似PULs（补充图2和补充图4）。类杆菌基因组和PULs的另一个特点是普遍存在专门用于分解动物糖类的GH家族（补充图2）。宿主糖被认为不会被用作瘤胃细菌的碳水化合物来源，大多数具有广泛的这些酶的基因组（Bacteroides spp. 拟杆菌属）都来自于从粪便中分离出来的物种。然而，反刍动物分泌大量的唾液，瘤胃普雷沃特菌属中存在动物糖降解酶，可能使它们能够利用唾液中的N-连接糖蛋白29，并有助于解释它们在瘤胃微生物组中的丰度22。

The multisubunit cellulosome is an alternative strategy for complex glycan breakdown in which a small module (dockerin) appended to glycan-cleaving enzymes anchors various catalytic units onto cognate cohesin repeats found on a large scaffolding protein30. Cellulosomes have been reported in only a small number of species, mainly in the family Ruminococcaceae in the order Clostridiales. Supplementary Table 4 reports the number of dockerin and cohesin modules found in the reference genomes and the main cellulosomal bacteria are highlighted in Supplementary Figure 2. We find that Clostridiales bacteria can be divided into four broad categories: (i) those that have neither dockerins nor cohesins (non-cellulosomal species), (ii) those that have just a few dockerins and no cohesins (most likely non-cellulosomal), (iii) those that have a large number of dockerins and many cohesins (true cellulosomal bacteria like Ruminococcus flavefaciens) and (iv) those that have a large number of dockerins but just a few cohesins like R. albus and R. bromii. In R. albus, it is likely that a single cohesin serves to anchor isolated dockerin-bearing enzymes onto the cell surface rather than to build a bona fide cellulosome. The starch-degrading enzymes of R. bromii bear dockerin domains that enable them to assemble into cohesin-based amylosomes31, analogous to cellulosomes, which are active against particulate resistant starches. R. bromii strains from the human gut microbiota and the rumen encode similar enzyme complements31.

多亚单位纤维素体是复杂糖类分解的另一种策略，其中一个小模块（dockerin）附加到糖类分解酶上，将各种催化单元固定在大型支架蛋白30上发现的同源粘连蛋白重复上。纤维素体仅在少数物种中被报道，主要是在梭菌目鲁米诺球菌科中。补充表4报告了在参考基因组中发现的dockerin和cohesin模块的数量，主要的纤维素体细菌在补充图2中被强调。我们发现，梭状芽孢杆菌可分为四大类。(i) 那些既没有dockerins也没有cohesins的细菌（非纤维素体物种），(ii) 那些只有少数dockerins而没有cohesins的细菌（很可能是非纤维素体），(iii) 那些有大量dockerins和许多cohesins的细菌（真正的纤维素体细菌，如Ruminococcus flavefaciens）和(iv) 那些有大量dockerins而只有少数cohesins的细菌，如R. albus和R. bromii。在R. albus中，很可能是一个单一的协同蛋白用于将孤立的含有dockerin的酶固定在细胞表面，而不是建立一个真正的纤维素体。R. bromii的淀粉德分级酶带有dockerin结构域，使它们能够组装成基于cohesin的淀粉体31，类似于纤维素体，对颗粒状抗性淀粉有活性。来自人类肠道微生物群和瘤胃的R. bromii菌株编码了类似的酶互补性31。
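The four-way split described above can be written down as a small decision rule. The sketch below is only illustrative: the text gives no numeric cut-offs for "a few" or "a large number", so the `many_dockerins` and `many_cohesins` thresholds and the function name are assumptions, not values taken from the paper or from Supplementary Table 4.

```python
def classify_cellulosome(dockerins: int, cohesins: int,
                         many_dockerins: int = 20,
                         many_cohesins: int = 5) -> str:
    """Assign a genome to one of the four categories from its module counts.

    Both thresholds are hypothetical; the source only says "a few" and
    "a large number".
    """
    if dockerins < 0 or cohesins < 0:
        raise ValueError("module counts cannot be negative")
    if dockerins == 0 and cohesins == 0:
        return "(i) non-cellulosomal"
    if 0 < dockerins < many_dockerins and cohesins == 0:
        return "(ii) most likely non-cellulosomal"
    if dockerins >= many_dockerins and cohesins >= many_cohesins:
        return "(iii) cellulosomal"
    if dockerins >= many_dockerins and 0 < cohesins < many_cohesins:
        return "(iv) many dockerins, few cohesins"
    # Combinations the text does not describe (e.g. cohesins without dockerins).
    return "unclassified"


# Invented counts, for illustration only.
for name, d, c in [("genome A", 0, 0), ("genome B", 3, 0),
                   ("genome C", 150, 12), ("genome D", 60, 1)]:
    print(name, classify_cellulosome(d, c))
```

Returning "unclassified" for combinations the paragraph does not mention keeps the rule from silently forcing a genome into one of the four named groups.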

Fermentation pathways.

发酵途径。

Most of what is known about microbial fermentation pathways in the rumen has been derived from measurements of end product fluxes or inferred from pure or mixed cultures of microorganisms in vitro, and based on reference metabolic pathways present in non-rumen microbes. The relative participation of particular species in each pathway, or their contribution to end product formation in vivo, is poorly characterized. To determine the functional potential of the sequenced species, we used genome information in combination with the published literature to assign bacteria to different metabolic strategies, on the basis of their substrate utilization and production of specific fermentation end products (Supplementary Table 5). The main metabolic pathways and strategies are present in at least one of, or combinations of, the most abundant bacterial and archaeal groups found in the rumen (Fig. 2b); as a result, we now have a better understanding of which pathways are encoded by these groups. The analysis also provides the first information on the contribution made by the abundant but uncharacterized members of the orders Bacteroidales and Clostridiales to the rumen fermentation. This metabolic scheme provides a framework for the investigation of gene function in these organisms, and the design of strategies that may enable manipulation of rumen fermentation.

人们对瘤胃中微生物发酵途径的了解大多来自对最终产品通量的测量，或从体外微生物的纯种或混合培养物中推断出来的，并以非瘤胃微生物中存在的参考代谢途径为基础。每条途径中特定物种的相对参与，或它们对体内最终产品形成的贡献，都没有得到很好的描述。为了确定被测序物种的功能潜力，我们利用基因组信息与已发表的文献相结合，根据细菌的底物利用和特定发酵终端产品的生产，将其归入不同的代谢策略（补充表5）。主要的代谢途径和策略至少存在于瘤胃中发现的最丰富的细菌和古细菌群体中的一种或其组合中。瘤胃（图2b）；因此，我们现在对这些群体编码的途径有了更好的了解。该分析还首次提供了关于丰富但未被描述的类杆菌和梭菌类成员对瘤胃发酵所做贡献的信息。这一代谢方案为研究这些生物体的基因功能提供了一个框架，并为设计可能实现瘤胃发酵操纵的策略提供了依据。

Gene loss.

基因丢失

One curious feature of several rumen bacteria is the absence of an identifiable enolase, the penultimate enzymatic step in glycolysis, which is conserved in all domains of life. Examination of >30,000 isolates from the Integrated Microbial Genomes with Microbiomes (IMG/M) database32 revealed that enolase-negative strains were rare (<0.5% of total), and that a high proportion of such strains were rumen isolates belonging to the genera Butyrivibrio and Prevotella and uncharacterized members of the family Lachnospiraceae (Supplementary Table 5). In the genus Butyrivibrio approximately half the sequenced strains lack enolase, while some show a truncated form. The distribution of this enzyme in relation to the phylogeny of this genus is shown in Figure 3. This analysis suggests that enolase is in the process of being lost by some rumen Butyrivibrio isolates and that we may be observing an example of environment-specific evolution by gene loss33. Although the adaptive advantage conferred by loss of enolase is not clear, there is a possible link with pyruvate metabolism and lactate production. Several enolase-negative Butyrivibrio strains do not produce lactate and 12 also lack the gene for l-lactate dehydrogenase. Conversely the enolase and l-lactate dehydrogenase genes are co-located in seven strains. An attempt to identify additional functions exhibiting a similar pattern of gene loss (or a complementing gain of function) by comparing enolase-positive versus enolase-negative Butyrivibrio spp. strains yielded no substantial additional insights (Supplementary Table 6).

几个瘤胃细菌的一个奇怪的特征是没有可识别的烯醇酶，这是糖酵解的倒数第二步，在所有生命领域都是保守的。对综合微生物基因组（IMG/M）数据库32中超过30,000个分离物的检查显示，烯醇酶阴性菌株是罕见的（占总数的<0.5%），而且这种菌株的很大比例是属于Butyrivibrio（丁酸弧菌属）和Prevotella（普雷沃菌属）属以及Lachnospiraceae（毛螺菌科）科的未定性成员的瘤胃分离物（补充表5）。在布氏弧菌属中，约有一半的测序菌株缺乏烯醇化酶，而有些则显示出截断的形式。该酶与该属系统发育的关系分布见图3。这一分析表明，烯醇化酶正在被一些瘤胃布氏杆菌分离出来，我们可能正在观察一个通过基因丢失进行环境特异性进化的例子33。虽然烯醇化酶的丧失所带来的适应性优势还不清楚，但可能与丙酮酸代谢和乳酸的产生有联系。一些烯醇酶阴性的布氏杆菌菌株不产生乳酸，12个菌株还缺乏l-乳酸脱氢酶的基因。相反，在7个菌株中，烯醇酶和l-乳酸脱氢酶的基因是同位的。通过比较烯醇酶阳性菌株和烯醇酶阴性菌株，试图找出表现出类似基因缺失模式（或功能互补增益）的其他功能，但没有得到实质性的额外认识（补充表6）。

Figure 3 Survey of enolase genes in Butyrivibrio strains. Maximum likelihood tree based on concatenated alignment of 56 conserved marker proteins from genomes of all Butyrivibrio strains in the Hungate Collection. Strains lacking a detectable enolase gene are indicated by pale pink shading while those with a truncated enolase are indicated by lavender shading. Strains without shading possess an intact enolase.

图3 Butyrivibrio菌株中烯醇化酶基因的调查。最大似然树，基于来自Hungate收集的所有布氏弧菌菌株基因组的56个保守标记蛋白的联合排列。缺乏可检测的烯醇化酶基因的菌株用淡粉色阴影表示，而那些具有截断的烯醇化酶的菌株用淡紫色阴影表示。没有阴影的菌株具有完整的烯醇化酶。

Another example of gene loss is seen in bacteria that have lost their complete glycogen synthesis and utilization pathway, as shown by the concomitant loss of families GH13, GH77, GT3 or GT5, and GT35 (Supplementary Fig. 2). These bacteria include nutritionally fastidious members of the Firmicutes (Allisonella histaminiformans, Denitrobacterium detoxificans, Oxobacter pfennigii) and Proteobacteria (Wolinella succinogenes), and have also lost most of their degradative CAZymes, suggesting that they have evolved toward a downstream position as secondary fermenters where they feed on fermentation products (acetate, pyruvate, amino acids) from primary degraders.

基因丢失的另一个例子见于失去完整糖原合成和利用途径的细菌，如GH13、GH77、GT3或GT5和GT35家族的同时丢失（补充图2）。这些细菌包括韧皮菌科（Allisonella histaminiformans、Denitrobacterium detoxificans、Oxobacter pfennigii）和蛋白菌科（Wolinella succinogenes）的营养快速型成员，并且也失去了大部分降解性CAZymes，表明它们已经向下游位置进化，成为次级发酵剂，以初级降解剂的发酵产物（醋酸、丙酮酸、氨基酸）为食。
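A presence/absence check makes this criterion concrete. This is a minimal sketch under one reading of the sentence above: a genome counts as having lost the pathway when none of GH13, GH77 and GT35, and neither GT3 nor GT5, is annotated. The function name and the example family sets are hypothetical.

```python
from typing import Iterable

# CAZy families named in the text as markers of glycogen synthesis and use.
GLYCOGEN_FAMILIES = frozenset({"GH13", "GH77", "GT3", "GT5", "GT35"})


def lacks_glycogen_pathway(families: Iterable[str]) -> bool:
    """True when none of the glycogen-related families is annotated."""
    return GLYCOGEN_FAMILIES.isdisjoint(families)


# Invented annotation sets, for illustration only.
primary_degrader = {"GH5", "GH13", "GH43", "GH77", "GT5", "GT35"}
secondary_fermenter = {"GT2", "GT4"}

print(lacks_glycogen_pathway(primary_degrader))
print(lacks_glycogen_pathway(secondary_fermenter))
```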

Biosynthetic gene clusters.

生物合成基因簇。

We searched the Hungate genomes for biosynthetic gene clusters (Supplementary Fig. 5 and Supplementary Table 7) to identify evidence of secondary metabolites that might be used as rumen modifiers to reduce methane production through their antimicrobial activity34. A total of 6,906 biosynthetic clusters were predicted from the Hungate genomes (Supplementary Note 2).

我们在Hungate基因组中搜索了生物合成基因簇（补充图5和补充表7），以确定可能被用作瘤胃改良剂的次级代谢物的证据，通过其抗菌活性减少甲烷的产生34。从Hungate基因组中共预测出6,906个生物合成集群（补充说明2）。

Supplementary Figure 5 Distribution of genes encoding antimicrobial biosynthetic clusters (bacteriocins, lantipeptides and non-ribosomal peptide synthases) in the Hungate catalogue genomes. Maximum likelihood tree based on 16S rDNA gene alignment was visualized and annotated using iTOL. Tree clades are colour coded according to phylum. Multi-bar-charts depict the total number of biosynthetic clusters for putative antimicrobial secondary metabolite classes in each genome.

补充图5 编码抗菌生物合成簇的基因在Hungate目录基因组中的分布情况 基于16S rDNA基因排列的最大似然树被可视化并使用iTOL进行了注释。树的支系根据门类的不同用颜色编码。多条形图描述了每个基因组中推定的抗菌次生代谢物类别的生物合成簇的总数。

CRISPRs.

Identification of CRISPR–Cas systems and their homologous protospacers from viral, plasmid and microbial genomes could shed light on past encounters with foreign mobile genetic elements35 and somewhat indirectly, habitat distribution and ecological interactions36. A total of 6,344 CRISPR spacer sequences were predicted from 241 Hungate genomes and searched against various databases (Supplementary Table 8). Searching spacers against a database of cultured and uncultured DNA viruses and retroviruses (IMG/VR) revealed novel associations between 83 viral operational taxonomic units (OTUs) and 31 Hungate hosts. The vast majority of these viruses were derived from human intestinal and ruminal samples. Details and additional results are furnished in Supplementary Note 3.

从病毒、质粒和微生物基因组中鉴定CRISPR-Cas系统及其同源的原生质粒，可以说明过去与外来移动遗传因子的遭遇35，并在某种程度上间接说明生境分布和生态相互作用36。从241个Hungate基因组中共预测了6344个CRISPR间隔物序列，并与各种数据库进行了搜索（补充表8）。根据培养和未培养的DNA病毒和逆转录病毒数据库（IMG/VR）搜索间隔序列，发现83个病毒操作分类单位（OTU）和31个Hungate宿主之间存在新的联系。这些病毒绝大部分来自于人类的肠道和瘤胃样本。详情和其他结果见补充说明3。

Metagenomic sequence recruitment

宏基因组序列招募

We evaluated whether the Hungate catalog can contribute to metagenomic analyses by using a total of 1,468,357 coding sequences (CDSs) from the 501 reference genomes to search against ~1.9 billion CDS predicted from more than 8,200 metagenomic data sets from diverse habitats. A total of 892,995 Hungate CDSs (~60%) were hits to 13,364,644 metagenome proteins at ≥ 90% amino acid identity. 466 out of 501 Hungate isolates recruited sequences from 2,219 metagenomic data sets derived from host-associated, environmental or engineered sources (Fig. 4 and Supplementary Table 9). The large number of human samples recruited (1,699) can be attributed to the greater availability of human samples compared to metagenomes from other mammals, including ruminants. Considering the number of isolate CDSs with hits to metagenome sequences (% coverage), most Hungate genomes (413/501) are represented in rumen metagenome samples, as well as in human or other vertebrate samples (Fig. 4). The average % coverage for 466 recruited genomes was 26.5% of total CDS, with Sharpea azabuensis DSM 18934 showing the highest capture (95.6%) in a sheep rumen metagenome (Supplementary Fig. 6).

我们通过使用501个参考基因组中的1,468,357个编码序列（CDS）与来自不同生境的8,200多个宏基因组数据中预测的约19亿个CDS进行搜索，评估了Hungate目录是否能对宏基因组分析作出贡献。共有892,995个Hungate CDSs（约60%）与13,364,644个宏基因组蛋白质的氨基酸一致性≥90%。在501个Hungate分离物中，有466个从2,219个宏基因组数据集中招募了序列，这些数据来自宿主相关的、环境或工程来源（图4和补充表9）。招募到的大量人类样本（1,699）可归因于与来自其他哺乳动物（包括反刍动物）的宏基因组相比，人类样本的可用性更高。考虑到与宏基因组序列相匹配的分离CDS的数量（%覆盖率），大多数Hungate基因组（413/501）在瘤胃宏基因组样本中都有体现，在人类或其他脊椎动物样本中也是如此（图4）。466个被招募的基因组的平均覆盖率为总CDS的26.5%，Sharpea azabuensis DSM 18934在绵羊瘤胃宏基因组中显示出最高的捕获率（95.6%）（补充图6）。
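The "% coverage" figure used here is the share of an isolate's CDSs that have at least one metagenome hit at or above the identity cut-off. A minimal sketch follows. The hit-record layout `(cds_id, metagenome_protein_id, percent_identity)` and the function names are assumptions for illustration; only the 90% cut-off and the reported totals come from the text.

```python
from typing import Iterable, Set, Tuple

# Assumed record layout: (cds_id, metagenome_protein_id, percent_identity)
Hit = Tuple[str, str, float]


def cds_with_hits(hits: Iterable[Hit], min_identity: float = 90.0) -> Set[str]:
    """Distinct isolate CDSs with at least one hit at >= min_identity."""
    return {cds for cds, _protein, identity in hits if identity >= min_identity}


def percent_coverage(n_cds_with_hits: int, n_total_cds: int) -> float:
    """Share of an isolate's CDSs (in %) that recruited a metagenome protein."""
    if n_total_cds <= 0:
        raise ValueError("total CDS count must be positive")
    return 100.0 * n_cds_with_hits / n_total_cds


# Toy example with invented identifiers: cds1 and cds3 pass the cut-off.
toy_hits = [("cds1", "mgA", 97.2), ("cds1", "mgB", 91.0),
            ("cds2", "mgC", 88.5), ("cds3", "mgD", 90.0)]
print(percent_coverage(len(cds_with_hits(toy_hits)), 4))

# Totals reported in the text: 892,995 of 1,468,357 Hungate CDSs had hits.
print(round(percent_coverage(892_995, 1_468_357), 1))
```

The second call gives about 60.8%, in line with the ~60% quoted above.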

Figure 4 Recruitment of metagenomic proteins by Hungate catalog genomes. Maximum likelihood tree based on 16S rDNA gene alignment of rumen strains. The tree clades are color coded according to phylum. Multi-bar-chart depicting the average % coverage of total CDS of an isolate by metagenome samples from each ecosystem category was drawn using iTOL 55. Dashed boxes highlight interesting examples of recruitment such as isolates detected in both rumen and human samples (maroon boxes) or detected in human but not rumen samples (red boxes), and others. Number key is as follows (average % coverage is given in parentheses):

1. Sharpea azabuensis str. (~88%), Kandleria vitulina str. (~87%); 2. Staphylococcus epidermidis str. (~40%), Lactobacillus ruminis str. (~51%); 3. Streptococcus equinus str. (~38% by rumen, ~35% by human); 4. Prevotella bryantii str. (~38% by rumen, ~9% by human); 5. Bacteroides spp. (~38%); 6. Bifidobacterium spp. (~24%), Propionibacterium acnes (~39%); 7. Shigella sonnei (~30% by human), E. coli PA3 (~31% by human), Citrobacter sp. NLAE-zl-C269 (20% by human); 8. Clostridium beijerinckii HUN142 (87% by plant); 9. Methanobrevibacter spp. (~32%). The innermost circle identifies Hungate isolates of fecal ( ) or salivary (♦) origin. Please refer to Supplementary Table 9 for data and other specifics.

图4 Hungate目录基因组对宏基因组蛋白的招募。基于瘤胃菌株的16S rDNA基因排列的最大似然树。树的支系根据门类用颜色编码。使用iTOL 55绘制了多条图表，描述了每个生态系统类别的宏基因组样本对分离物总CDS的平均覆盖率%。虚线框突出了有趣的招募实例，如在瘤胃和人类样本中都检测到的分离物（栗色框）或在人类但不是瘤胃样本中检测到的分离物（红色框），以及其他。数字键如下（括号内为平均覆盖率%）。1. Sharpea azabuensis str. (~88%), Kandleria vitulina str. (~87%); 2. Staphylococcus epidermidis str. (~40%), Lactobacillus ruminis str. (~51%); 3. Streptococcus equinus str. (~38% by rumen, ~35% by human); 4. Prevotella bryantii str. (~38% by rumen, ~9% by human); 5. Bacteroides spp.(~38%); 6. Bifidobacterium spp. (~24%), Propionibacterium acnes (~39%); 7. Shigella sonnei (~30% by human), E. coli PA3 (~31% by human), Citrobacter sp. NLAE-zl-C269 (20% by human); 8. Clostridium beijerinckii HUN142 (87% by plant); 9. Methanobrevibacter spp. (~32%). 最里面的圆圈标识了粪便（）或唾液（♦）来源的Hungate分离物。有关数据和其他具体情况，请参考补充表9。

Examining recruitment against available rumen metagenomes, a majority of 336 isolates were captured in 24 rumen samples (27% average coverage) (Supplementary Fig. 7 and Supplementary Table 9). A further 52 rumen isolates may be included if the hit count recruitment parameter is relaxed from 200 to 50. These isolates are predicted to occur in relatively low abundance in these rumen metagenomes, and raise the proportion of recruiters to almost 80% of the total Hungate catalog. Top recruitment (in terms of % coverage of total isolate CDS) was by organisms previously identified as dominant genera in the rumen10,22,37,38, such as Prevotella spp., Ruminococcus spp., Butyrivibrio spp. and members of the unnamed RC-9, R-7 and R-25 groups. Some Hungate catalog genomes were exclusively detected in one or a few samples originating from the same ruminant host (e.g., sheep-associated Sharpea, Kandleria and Megasphaera strains), whereas others were detected across all ruminants (e.g., Prevotella spp.). It is, however, important to acknowledge the limitations of existing rumen metagenome samples (not merely in terms of their paucity), as they were sourced from animals on special diets (e.g., switchgrass5 or lucerne (alfalfa) pellets39), which may alter the microbiome22.

对照现有的瘤胃宏基因组进行研究，在24个瘤胃样品中捕获了336个分离物中的大多数（平均覆盖率27%）（补充图7和补充表9）。如果将命中数招募参数从200个放宽到50个，那么还有52个瘤胃分离物可能被包括在内。据预测，这些分离物在这些瘤胃宏基因组中出现的丰度相对较低，并将招募者的比例提高到Hungate总目录的近80%。最主要的招募者（按总分离物CDS的覆盖率计算）是以前被确定为瘤胃中的优势属的生物，如普雷沃特氏菌属、鲁米尼克氏菌属、布特氏菌属以及未命名的RC-9、R-7和R-25组成员10,22,37,38。一些Hungate目录基因组只在来自同一反刍动物宿主的一个或几个样本中检测到（如与羊有关的Sharpea、Kandleria和Megasphaera菌株），而其他的则在所有反刍动物中检测到（如Prevotella spp.）。然而，重要的是要承认现有瘤胃宏基因组样本的局限性（不仅仅是它们的数量少），因为它们来自特殊饮食的动物（如开关草5或苜蓿（紫花苜蓿）颗粒39），这可能改变微生物组22。
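The "almost 80%" figure follows directly from the counts in this paragraph. The snippet below redoes that arithmetic and shows how a hit-count threshold changes which isolates count as recruited. The per-isolate hit counts are invented, and treating the threshold as "at least this many hits" is an assumption.

```python
from typing import Dict, List

CATALOG_SIZE = 501       # genomes in the Hungate catalog
RECRUITED_AT_200 = 336   # isolates captured with the hit-count threshold at 200
ADDED_AT_50 = 52         # further isolates once the threshold is relaxed to 50


def recruited(hit_counts: Dict[str, int], min_hits: int) -> List[str]:
    """Names of isolates whose hit count reaches the threshold."""
    return sorted(name for name, n in hit_counts.items() if n >= min_hits)


share = 100.0 * (RECRUITED_AT_200 + ADDED_AT_50) / CATALOG_SIZE
print(f"{share:.1f}% of the catalog")

# Invented hit counts for three isolates.
toy = {"isolate_a": 950, "isolate_b": 120, "isolate_c": 12}
print(recruited(toy, 200))
print(recruited(toy, 50))
```

With the reported counts, 388 of 501 genomes is roughly 77%, which the text rounds to "almost 80%".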

Supplementary Figure 7 Recruitment of rumen metagenomes by Hungate catalogue genomes. Maximum likelihood tree based on 16S rDNA gene alignment was visualized and annotated using iTOL. Tree clades are colour coded according to phylum. Bar-charts depict the average % coverage of total CDSs of an isolate by rumen metagenome samples from each ruminant host shown on individual circles around the tree.

补充图7 Hungate目录基因组对瘤胃宏基因组的招募。基于16S rDNA基因排列的最大似然树用iTOL进行了可视化和注释。树的支系根据门类的不同用颜色编码。条形图描述了来自每个反刍动物宿主的瘤胃宏基因组样本对分离物总CDS的平均覆盖率，显示在树周围的各个圆圈上。

165 Hungate cultures were not detected in deposited rumen metagenome data sets under the thresholds applied. Many of these (~50) were of fecal origin, and reflect how the microbiota of the rumen is distinct from that found in other regions of the ruminant GI tract40.

在所应用的阈值下，165个饥饿培养物没有在沉积的瘤胃宏基因组数据集中检测到。其中许多（约50个）来自于粪便，反映了瘤胃的微生物群与反刍动物消化道其他区域的微生物群不同。

A total of 68 isolates were recruited by both rumen and human intestinal samples and represent shared species between the rumen and human microbiomes (Fig. 4), possibly fulfilling similar roles. A further 66 Hungate isolates were recruited by human samples but were not detected in rumen samples, giving a total of 134 Hungate catalog genomes that recruited various human samples, making them valuable reference sequences for the analysis of human microbiome samples. This observation is also indirectly recapitulated by the CRISPR–Cas systems-based analysis, which showed links to spacers from human intestinal samples, particularly for Hungate isolates of fecal origin (Supplementary Table 8). Additional metagenome recruitment analysis details are provided in Supplementary Note 4.

共有68个分离物被瘤胃和人类肠道样品招募，代表了瘤胃和人类微生物组之间的共享物种（图4），可能履行了类似的作用。另有66个Hungate分离物被人类样品招募，但在瘤胃样品中没有检测到，因此总共有134个Hungate目录基因组招募了各种人类样品，使它们成为分析人类微生物组样品的宝贵参考序列。这一观察结果也被基于CRISPR-CAS系统的分析所间接复述，它显示了与来自人类肠道样品的间隔物的联系，特别是对于粪便来源的Hungate分离物（补充表8）。额外的宏基因组招募分析细节在补充说明4中提供。

Comparison with human gut microbiota

与人类肠道微生物群的比较

Many Hungate strains (134/501) were shared between rumen and human intestinal microbiome samples. This is unsurprising, as both habitats are high-density, complex anaerobic microbial communities, producing similar fermentation products, and with extensive interspecies cross-feeding and interaction41. We performed a comparative analysis against available human intestinal isolates (largely from the HMP), to identify differences that can be attributed to distinct lifestyles and adaptive capacity of rumen microorganisms. The Hungate and human intestine isolate collections were curated to remove redundancy, low-quality genomes and known human pathogens. This resulted in a set of 458 rumen and 387 human intestinal genomes (Supplementary Table 10), which was used to identify protein families in the Pfam database that were differentially abundant in isolates from each environment. Out of 7,718 Pfam domains found in 458 non-redundant Hungate isolate genomes, we determined 367 were over-represented in the ruminal genomes and 423 were under-represented on the basis of a false discovery rate (FDR) q-value < 0.001 (Supplementary Table 11). Over-represented Pfams (Fig. 5) included enzymes involved in plant cell wall degradation (GH11, GH16, GH26, GH43, GH53, GH67, GH115), carbohydrate-binding modules (CBM2, CBM3, and cohesin and dockerin modules associated with cellulosome assembly) and GT41 family glycosyl transferases, which occur predominantly in the genera Anaerovibrio and Selenomonas. Notably, Pfams for the biosynthesis of cobalamin (vitamin B12), an essential micronutrient for the host, were over-represented. Vitamin B12 biosynthesis is one of the most complex pathways in nature, involving more than 30 enzymatic steps, and given its high metabolic cost, is only encoded by a small set of bacteria and archaea. We examined this biosynthetic pathway in more detail using other functional annotation types (KO and Tigrfam) across the 501 Hungate isolates, and discovered that 12 or more enzymatic steps were over-represented in the Hungate genomes, and at least 47 isolates might be capable of de novo B12 synthesis (Supplementary Table 12). Many of these were members of the class Negativicutes within the Firmicutes (Anaerovibrio, Mitsuokella, and Selenomonas). A further 140 (including 21 archaeal) genomes encode enzymes for the salvage of B12 from an intermediate, and may even work cooperatively (based on potential complementarity of lesions in the pathway in different members) to share and synthesize corrinoids for community and/or host benefit. These observations reflect the high burden of a requirement for vitamin B12, which is needed as a cofactor for enzymes involved in gluconeogenesis from propionate in the liver. This process is essential for lactose biosynthesis and milk production in dairy animals42, and dairy and meat products of ruminant origin are important dietary sources of B12 (ref. 43). By contrast, it has been speculated that human gut microbes were unlikely to contribute significant amounts of B12 for their host and were likely competitors for dietary B12 (ref. 44).

许多Hungate菌株（134/501）在瘤胃和人类肠道微生物组样本之间共享。这并不奇怪，因为这两个栖息地都是高密度、复杂的厌氧微生物群落，产生类似的发酵产物，并有广泛的种间交叉进食和互动41。我们对现有的人类肠道分离物（主要来自HMP）进行了比较分析，以确定可归因于瘤胃微生物的不同生活方式和适应能力的差异。对Hungate和人类肠道分离物集进行了整理，以去除冗余、低质量基因组和已知的人类病原体。这导致了一组458个瘤胃和387个人类肠道基因组（补充表10），这被用来识别Pfam数据库中在每个环境的分离物中含量不同的蛋白质家族。在458个非冗余Hungate分离物基因组中发现的7718个Pfam域中，我们确定有367个在瘤胃基因组中代表性过高，423个代表性不足，依据的是假发现率（FDR），q值<0.001（补充表11）。代表性过高的Pfams（图5）包括参与植物细胞壁降解的酶（GH11、GH16、GH26、GH43、GH53、GH67、GH115）、碳水化合物结合模块（CBM2、CBM3以及与纤维素体组装有关的cohesin和dockerin模块）和GT41家族糖基转移酶，它们主要出现在厌氧菌属和硒门菌属中。值得注意的是，用于生物合成钴胺（维生素B12）的Pfams，对宿主来说是一种重要的微量营养素，被过度代表。维生素B12的生物合成是自然界最复杂的途径之一，涉及30多个酶的步骤，鉴于其高代谢成本，仅由一小部分细菌和古细菌编码。我们使用其他功能注释类型（KO和Tigrfam）对501个Hungate分离物的这种生物合成途径进行了更详细的研究，发现12个或更多的酶步骤在Hungate基因组中的代表性过高，至少有47个分离物可能有能力进行新的B12合成（补充表12）。其中许多是韧皮动物中的阴性菌类（Anaerovibrio、Mitsuokella和Selenomonas）的成员。还有140个（包括21个古细菌）基因组编码了从中间物中抢救B12的酶，甚至可能合作工作（基于不同成员的途径中病变的潜在互补性），为社区和/或宿主的利益分享和合成冠状病毒。这些观察结果反映了对维生素B12需求的高负担，维生素B12需要作为参与肝脏中从丙酸盐生成葡萄糖的酶的辅助因子。这一过程对乳制品动物的乳糖生物合成和牛奶生产至关重要42，反刍动物的乳制品和肉制品是B12的重要膳食来源（参考文献43）。相比之下，有人推测人类的肠道微生物不太可能为其宿主贡献大量的B12，而可能是膳食B12的竞争对手（参考文献44）。
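To make the selection step concrete, the sketch below shows how per-domain p-values can be turned into FDR-adjusted q-values and paired with a log2-fold difference of mean counts, the quantity plotted on the Y axis of Figure 5. The paper reports q-values from Metastats; this example swaps in the Benjamini-Hochberg procedure, a different and simpler FDR method, so it is not the authors' computation. All function names and input numbers are hypothetical.

```python
import math
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted q-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each q-value is the minimum
    # of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def log2_fold_difference(mean_rumen: float, mean_human: float) -> float:
    """log2 ratio of mean domain counts (rumen over human intestinal)."""
    if mean_rumen <= 0 or mean_human <= 0:
        raise ValueError("means must be positive to take a log ratio")
    return math.log2(mean_rumen / mean_human)


# Invented p-values for four made-up Pfam domains.
p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(p)
flagged = [i for i, value in enumerate(q) if value < 0.001]
print(q, flagged)
print(log2_fold_difference(8.0, 2.0))
```

A positive log2-fold difference would mark a domain as over-represented in the rumen set and a negative one as under-represented, provided its q-value also falls below the 0.001 cut-off used in the text.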

Of the Pfams (Fig. 5) under-represented in Hungate genomes, the occurrence of all steps for the oxidative branch of the pentose phosphate pathway (OPPP) was striking. The role of the OPPP is primarily the irreversible production of reducing equivalents (NADPH), although other enzymes may serve as alternate sources of reducing equivalents. As discussed above, the Pfam for enolase appeared in the list of under-represented families. The list also contained several Pfams associated with bacteriophage functions and sporulation. The differential abundance of sporulation genes is interesting as the observation that sporulation genes are abundant in human gut bacteria has been made recently16,31 and is potentially linked with resistance to oxygen exposure. This observation is particularly striking given the preponderance of Firmicutes, an archetypically spore-forming phylum45, in the rumen set. Large and small subunits of an oxygen-dependent Class I type ribonucleotide reductase were also under-represented together with several other Pfams implicated in oxygen tolerance, suggesting that human intestinal isolates may encounter higher oxygen tension compared to the strictly anaerobic ruminal ecosystem. These observations indirectly suggest that host genetics and physiology influence rumen microbiome composition and that rumen microbes are likely to be vertically inherited as indicated in recent studies46,47. Conversely, human intestinal (more specifically, fecal) isolates are transmitted from other sources in the environment31,48. We were able to recapitulate these findings in a metagenome-based comparison of these two environments (sheep rumen samples against normal human fecal samples; Supplementary Table 13), suggesting that these differences cannot be explained by cultivation or abundance biases in the isolate data sets.

在Hungate基因组中代表性不足的Pfams（图5）中，磷酸戊糖途径的氧化分支（OPPP）的所有步骤的出现是引人注目的。OPPP的作用主要是不可逆地产生还原当量（NADPH），尽管其他酶可以作为还原当量的替代来源。如上所述，烯醇酶的Pfam出现在代表性不足的家族名单中。该名单还包括几个与噬菌体功能和孢子形成有关的Pfam。孢子形成基因的不同丰度是有趣的，因为最近观察到孢子形成基因在人类肠道细菌中很丰富16,31，并可能与抗氧暴露有关。考虑到瘤胃中典型的形成孢子的门类45-- Firmicutes的优势，这一观察尤其引人注目。依赖氧气的I类核糖核苷酸还原酶的大亚单位和小亚单位以及其他几个与氧气耐受性有关的Pfams的代表性也很低，这表明与严格厌氧的瘤胃生态系统相比，人类肠道分离物可能遇到更高的氧气张力。这些观察结果间接表明，宿主的遗传学和生理学影响着瘤胃微生物组的组成，而且正如最近的研究46,47所示，瘤胃微生物可能是垂直遗传的。相反，人类的肠道（更确切地说，粪便）分离物是由环境中的其他来源传播的31,48。我们能够在这两种环境的基于宏基因组的比较中再现这些发现（绵羊瘤胃样本与正常人类粪便样本；补充表13），表明这些差异不能由分离物数据集中的培养或丰度偏差来解释。

Figure 5 Differentially abundant Pfams between rumen and human intestinal isolates. X axis is individual Pfams detected by Metastats to be differentially abundant with a Q-value < 0.001. Y axis is the log2-fold difference of mean counts for each population (rumen or intestinal). Select Pfams are highlighted as discussed in the text. OPPP, oxidative pentose phosphate pathway.

图5 瘤胃和人类肠道分离物之间含量不同的Pfams。X轴是Metastats检测到的个别Pfams的含量不同，Q值<0.001。Y轴是每个群体（瘤胃或肠道）的平均计数的对数倍差。选择的Pfams被突出显示，如文中所述。OPPP，氧化性磷酸戊糖途径。

Discussion

The Hungate genome catalog that we report here includes genomic analysis of 501 bacterial and archaeal cultures that represent almost all of the cultured rumen species that have been taxonomically characterized, as well as representatives of several novel species and genera. This high-quality reference collection will guide interpretation of metagenomics data sets, including genomes recovered from metagenomes (MAGs). The Hungate genome catalog also allows robust comparative genomic analyses that are not feasible using incomplete sequence data from metagenomes. Researchers have access to Hungate Collection strains, which will enable a better understanding of carbon flow in the rumen, including the breakdown of lignocellulose, through the metabolism of substrates to SCFAs and fermentation end products, to the final step of CH4 formation.

我们在此报告的Hungate基因组目录包括对501种细菌和古细菌培养物的基因组分析，这些培养物代表了几乎所有已被分类学定性的培养瘤胃物种，以及一些新物种和属的代表。这个高质量的参考文献集将指导宏基因组学数据集的解释，包括从宏基因组（MAGs）中恢复的基因组。Hungate基因组目录还允许进行强大的比较基因组分析，而使用来自宏基因组的不完整序列数据是不可行的。研究人员可以获得Hungate系列菌株，这将使人们更好地了解瘤胃中的碳流，包括木质纤维素的分解，通过底物到SCFAs和发酵终端产品的代谢，到最后一步CH4的形成。

The Hungate genome collection is by no means complete. Some important taxa are missing, especially members of the order Bacteroidales10,22. At the start of this project genome sequences were available for strains belonging to 11 (12.5%) of the 88 genera described for the rumen. Currently, genome sequences are available for 73 (83%) of those 88 genera, as well as for 73 strains that are only identified to the family or order taxonomic level. Of the rumen ‘most wanted list’ which comprises 70 rumen bacteria10, the Hungate Collection has now contributed 30 members. In addition to missing bacteria and archaea, the sequencing of rumen eukaryotes presents considerable technical challenges and although some progress has been made in sequencing of anaerobic fungi49, there are no genome data for rumen ciliate protozoa, and only preliminary data on the rumen virome50.

Hungate的基因组收集绝不是完整的。一些重要的类群被遗漏了，特别是类杆菌目10,22的成员。在本项目开始时，属于瘤胃88个属的11个（12.5%）的菌株的基因组序列是可用的。目前，这88个属中有73个（83%）的基因组序列可用，还有73个菌株只被鉴定为科或目分类水平。在由70个瘤胃细菌组成的瘤胃 "最需要的名单 "中，Hungate Collection现在已经贡献了30个成员。除了缺失的细菌和古细菌，瘤胃真核生物的测序也带来了相当大的技术挑战，尽管厌氧真菌的测序已经取得了一些进展49，但没有瘤胃纤毛原虫的基因组数据，只有瘤胃病毒组的初步数据50。

Microbiome research is moving from descriptive to mechanistic, and to translation of those mechanisms into interventions51. Using rumen microbiome data to engineer rumens to reduce CH4 emissions52 and improve productivity and sustainability outcomes is now in sight53. The Hungate Collection provides a starting point for this, shedding light on what has been described as ‘the world’s largest commercial fermentation process’54. Future studies can use the Hungate resources to improve the resolution of rumen meta-omics analyses, to identify antimicrobials, to source carbohydrate-degrading enzymes from the rumen for use as animal feed additives and in lignocellulose-based biofuel generation, and as the basis for synthetic microbial consortia.

微生物组研究正在从描述性研究转向机械性研究，并将这些机制转化为干预措施51。使用瘤胃微生物组数据来设计瘤胃，以减少CH4排放52，提高生产力和可持续发展的结果，现在是有希望的53。Hungate系列提供了一个起点，揭示了被描述为 "世界上最大的商业发酵过程 "54。未来的研究可以利用Hungate资源来提高瘤胃宏组学分析的分辨率，确定抗菌剂，从瘤胃中获取碳水化合物降解酶，用作动物饲料添加剂和基于木质纤维素的生物燃料，并作为合成微生物联合体的基础。
