Algorithms for Extraction of Common Patterns from Data in Bioinformatics
Project/Area Number |
13680394
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Computer science
|
Research Institution | Kyoto University |
Principal Investigator |
AKUTSU Tatsuya Kyoto University, Institute for Chemical Research, Professor (90261859)
|
Co-Investigator (Kenkyū-buntansha) |
MIYANO Satoru University of Tokyo, Institute of Medical Science, Professor (50128104)
UEDA Nobuhisa Kyoto University, Institute for Chemical Research, Assistant Professor (80346048)
|
Project Period (FY) |
2001 – 2003
|
Project Status |
Completed (Fiscal Year 2003)
|
Budget Amount |
¥2,600,000 (Direct Cost: ¥2,600,000)
Fiscal Year 2003: ¥1,000,000 (Direct Cost: ¥1,000,000)
Fiscal Year 2002: ¥700,000 (Direct Cost: ¥700,000)
Fiscal Year 2001: ¥900,000 (Direct Cost: ¥900,000)
|
Keywords | Bioinformatics / Pattern matching / Sequence motif / Algorithms / Kernel method / Position-specific score matrix / Local alignment / Support vector machine / Homology search / Point set / Congruence testing / Largest common point set / Approximate matching / Motif extraction / Maximum clique / Electrophoresis / Spot matching / Structure alignment / Gibbs sampling / Relative entropy / Local search |
Research Abstract |
We studied algorithms for extracting common patterns from biological data such as DNA sequences, protein structures, two-dimensional electrophoresis images, and gene expression data. The main results are as follows.
1. Extraction of Common Patterns by Local Alignment. The Gibbs sampling algorithm is widely used to extract common patterns from sequence data. We developed a variant of the Gibbs sampling algorithm that can be applied to numerical sequences, and applied it to the detection of motifs in protein structures.
2. On the Complexity of Deriving Patterns from Positive and Negative Sequences. We studied theoretical aspects of deriving position-specific score matrices (PSSMs) from positive and negative sequences.
3. Pattern Matching Algorithms for Electrophoresis Image Data and Protein Structures. We formulated a pattern matching problem for two-dimensional electrophoresis image data as a geometric matching problem and proved that it is NP-hard. Despite this hardness result, we developed practical algorithms for computing optimal matchings for electrophoresis image data analysis and protein structure alignment.
4. Kernel Method for Protein Sequence Classification. We developed a new kernel function for protein sequences based on the well-known local alignment algorithm. The kernel was combined with a support vector machine and tested on benchmark data. In these tests, the new kernel performed better than existing kernels.
5. Inference of Protein-Protein Interactions Using Linear Programming. We developed a new method for inferring the probabilities of domain-domain interactions from experimental data by formulating the inference problem as a linear program. The method was applied to the inference of protein-protein interactions and compared with existing methods. The results show that the proposed method outperforms existing methods for numerical data.
|
Report
(4 results)
Research Products
(27 results)