Authors: | Huilong Du & Chengzhi Liang |
---|---|
Journal: | Nature Communications |
DOI: | |
Corresponding author: | |
Corresponding author (English): | |
Volume: | |
Abstract: | Due to the large number of repetitive sequences in complex eukaryotic genomes, fragmented assemblies lose value as reference genomes, often due to incomplete sequences and unanchored or mispositioned short contigs on chromosomes. Here we report a genome assembly method, HERA, which includes a concept called a connection graph as well as algorithms for constructing the graph from an overlap graph. HERA resolves repeats at high efficiency with single-molecule sequencing data, and dramatically improves the quality of current genome assemblies. We test HERA with the genomes of rice, maize, human, and Tartary buckwheat. HERA can correctly assemble most of the previously unassembled regions including tandemly repetitive sequences and improve the contig N50 sizes of published maize and human assemblies from 1.3 Mb to 61.2 Mb and from 8.3 Mb to 54.4 Mb, respectively. The application of HERA will greatly improve the quality of new or existing assemblies of complex genomes. |
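
The abstract reports its improvements as contig N50 sizes. For readers unfamiliar with the metric, the sketch below computes N50 under its usual definition: the largest length L such that contigs of length at least L together cover at least half of the total assembly length. The function name `contig_n50` and the toy length lists are assumptions made for illustration; they are not part of HERA and are not data from the paper.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bases).

    Illustrative helper, not part of HERA: walk the contigs from longest
    to shortest and return the length at which the running total first
    reaches half of the assembly size.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional arithmetic.
        if 2 * running >= total:
            return length


# Made-up toy lengths: the same 32 bases split into many or few contigs.
fragmented = [2, 2, 2, 3, 3, 4, 8, 8]
merged = [8, 8, 16]

print(contig_n50(fragmented))
print(contig_n50(merged))
```

Joining short contigs into longer ones leaves the total length unchanged but raises N50, which is why the metric is commonly used to summarize assembly contiguity.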