2010 Fiscal Year Final Research Report
Research on Fast Search for DNA Sequence Using Vector Quantization
Project/Area Number |
21710207
|
Research Category |
Grant-in-Aid for Young Scientists (B)
|
Allocation Type | Single-year Grants |
Research Field |
Genome informatics
|
Research Institution | Tohoku University |
Principal Investigator |
CHEN Qiu Tohoku University, New Industry Creation Hatchery Center, Associate Professor (00400292)
|
Project Period (FY) |
2009 – 2010
|
Keywords | Bioinformatics / Vector quantization / Nucleotide sequence / Fast search / Database / Histogram features |
Research Abstract |
An enormous quantity of DNA sequence data has accumulated in databases such as GenBank, EMBL, and DDBJ, and the volume of data continues to grow exponentially. Homology search of DNA sequences is among the most important tasks in the life sciences. In this research, we propose an efficient hierarchical DNA sequence search method that improves search speed while maintaining accuracy. For a given query DNA sequence, a fast local search method using histogram features is first applied as a filtering mechanism before the sequences in the database are scanned. This step excludes a large number of low-similarity DNA sequences from the subsequent search. The Smith-Waterman algorithm is then applied to each of the remaining sequences. Experimental results on GenBank sequence data show that the proposed method, which combines histogram information with the Smith-Waterman algorithm, is more efficient for DNA sequence search.
|
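The two-stage search described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the histogram features are built or compared, so everything here is an assumption made for illustration: overlapping k-mer counts stand in for the histogram feature, histogram intersection stands in for the similarity measure, whole-sequence histograms are compared rather than local windows, and the names `kmer_histogram`, `histogram_similarity`, `hierarchical_search`, the parameter `min_similarity`, and the scoring values are all hypothetical. No vector quantization step is modeled.

```python
from collections import Counter


def kmer_histogram(seq, k=3):
    """Count every overlapping k-mer in seq (assumed stand-in for the histogram feature)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def histogram_similarity(h_query, h_target):
    """Histogram intersection, normalized by the number of k-mers in the query."""
    total = sum(h_query.values())
    if total == 0:
        return 0.0
    shared = sum(min(count, h_target[kmer]) for kmer, count in h_query.items())
    return shared / total


def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (scoring values are assumptions)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def hierarchical_search(query, database, k=3, min_similarity=0.5):
    """Stage 1: drop low-similarity sequences by histogram. Stage 2: align the rest."""
    h_query = kmer_histogram(query, k)
    hits = []
    for name, seq in database.items():
        if histogram_similarity(h_query, kmer_histogram(seq, k)) < min_similarity:
            continue  # filtered out before the expensive alignment
        hits.append((name, smith_waterman(query, seq)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data, not GenBank records.
database = {"a": "ACGTACGTAC", "b": "TTTTTTTTTT", "c": "GGACGTACGG"}
results = hierarchical_search("ACGTACG", database)
```

In this toy run, sequence `b` shares no 3-mers with the query and is discarded by the filter, so Smith-Waterman runs on only two of the three sequences. The saving comes from the fact that the histogram comparison is linear in sequence length, while the alignment is quadratic.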
Research Products
(7 results)