2014 Fiscal Year Final Research Report
Development of a de novo genome/gene assembler for highly heterozygous samples from NGS sequence data
Project/Area Number |
24310142
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Partial Multi-year Fund |
Section | General |
Research Field |
Genome biology
|
Research Institution | Tokyo Institute of Technology |
Principal Investigator |
ITOH Takehiko Tokyo Institute of Technology, Graduate School of Bioscience and Biotechnology, Professor (90501106)
|
Co-Investigator(Kenkyū-buntansha) |
NOGUCHI Hideki National Institute of Genetics, Advanced Genomics Center, Project Associate Professor (50333349)
MARUYAMA Haruhiko University of Miyazaki, Faculty of Medicine, Professor (90229625)
|
Project Period (FY) |
2012-04-01 – 2015-03-31
|
Keywords | Genome informatics / Bioinformatics |
Outline of Final Research Achievements |
Assembling highly heterozygous diploid genomes is a major challenge because heterozygosity increases the complexity of the de Bruijn graph structure. To meet the growing demand for sequencing non-model and/or wild-type samples, we developed a novel de novo assembler, Platanus, which can effectively handle high-throughput data from heterozygous samples. Platanus assembles DNA fragments into contigs by constructing de Bruijn graphs, then scaffolds the contigs using paired-end information. The complicated graph structures that result from heterozygosity are simplified during both the contig assembly step and the scaffolding step. We evaluated the assembly results on eukaryotic samples with various levels of heterozygosity. Compared with other assemblers, Platanus produced assemblies with a larger NG50 length without any accompanying loss of accuracy, on both simulated data and real data, including highly heterozygous Strongyloides samples.
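Two ideas in the outline above can be illustrated with a short sketch: how a heterozygous site creates a fork in a de Bruijn graph, and how NG50 is computed. This is not the Platanus implementation. The function names, the two toy haplotypes, the choice of k = 4, and the contig lengths are all assumptions made purely for illustration.

```python
from collections import defaultdict


def de_bruijn_edges(sequences, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes that follow it."""
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Return nodes with more than one successor, i.e. where a path forks."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


def ng50(contig_lengths, genome_size):
    """Largest length L such that contigs of length >= L cover at least
    half of genome_size. Returns 0 if the assembly covers less than half."""
    covered = 0
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if 2 * covered >= genome_size:
            return length
    return 0


# Hypothetical haplotypes that differ by a single SNP (T vs A at index 6).
hap_a = "ATGGCGTACCTGA"
hap_b = "ATGGCGAACCTGA"

# One haplotype alone gives a linear path; adding the second creates a fork
# at the node just before the SNP, and the two branches rejoin after it.
print(branching_nodes(de_bruijn_edges([hap_a], 4)))         # []
print(branching_nodes(de_bruijn_edges([hap_a, hap_b], 4)))  # ['GCG']

# NG50 for made-up contig lengths against an assumed genome size of 200.
print(ng50([50, 40, 30, 20, 10], 200))                      # 30
```

Unlike N50, which is measured against the total assembly length, NG50 is measured against the (estimated) genome size, so assemblies of different total length can be compared on the same scale.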
|
Free Research Field |
Genome informatics
|