2003 Fiscal Year Final Research Report Summary
Algorithms for Extraction of Common Patterns from Data in Bioinformatics
Project/Area Number |
13680394
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Computer Science
|
Research Institution | Kyoto University |
Principal Investigator |
AKUTSU Tatsuya Kyoto University, Institute for Chemical Research, Professor (90261859)
|
Co-Investigator (Kenkyū-buntansha) |
MIYANO Satoru University of Tokyo, Institute of Medical Science, Professor (50128104)
UEDA Nobuhisa Kyoto University, Institute for Chemical Research, Assistant Professor (80346048)
|
Project Period (FY) |
2001 – 2003
|
Keywords | Bioinformatics / Pattern matching / Sequence motif / Algorithms / Kernel method / Position specific score matrix / Local alignment |
Research Abstract |
We studied algorithms for extracting common patterns from biological data such as DNA sequences, protein structures, two-dimensional electrophoresis images, and gene expression data. The main results are as follows.
1. Extraction of Common Patterns by Local Alignment. The Gibbs sampling algorithm is widely used for extracting common patterns from sequence data. We developed a variant of the Gibbs sampling algorithm that can be applied to numerical sequences, and applied it to the detection of motifs in protein structures.
2. On the Complexity of Deriving Patterns from Positive and Negative Sequences. We studied theoretical aspects of deriving position specific score matrices (PSSMs) from positive and negative sequences.
3. Pattern Matching Algorithms for Electrophoresis Image Data and Protein Structures. We formulated a pattern matching problem for two-dimensional electrophoresis image data as a geometric matching problem and proved that it is NP-hard. We also developed practical algorithms for computing optimal matchings for electrophoresis image data analysis and protein structure alignment.
4. Kernel Method for Protein Sequence Classification. We developed a new kernel function for protein sequences based on the well-known local alignment algorithm for protein sequences. The kernel was combined with a support vector machine and tested on benchmark data. The results show that the new kernel performs better than existing kernels on these benchmarks.
5. Inference of Protein-Protein Interactions Using Linear Programming. We developed a new method for inferring probabilities of domain-domain interactions from experimental data by formulating the inference problem as a linear program. The method was applied to the inference of protein-protein interactions and compared with existing methods. The results show that the proposed method outperforms existing methods for numerical data.
|
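As an illustration of the position specific score matrix (PSSM) setting mentioned in the abstract, the following minimal sketch scores a sequence by its best-matching window and checks whether a given matrix and threshold separate positive from negative sequences. The abstract does not state the exact formulation studied in the project, so the best-window scoring rule, the function names `best_window_score` and `separates`, and the toy matrix are all assumptions made for illustration only.

```python
from typing import Dict, List

# A PSSM is modelled here as one score table per motif position
# (hypothetical representation, not taken from the report).
PSSM = List[Dict[str, float]]


def best_window_score(pssm: PSSM, seq: str) -> float:
    """Return the highest score over all windows of seq with the PSSM's width."""
    width = len(pssm)
    if len(seq) < width:
        return float("-inf")  # no window fits
    return max(
        sum(pssm[j][seq[i + j]] for j in range(width))
        for i in range(len(seq) - width + 1)
    )


def separates(pssm: PSSM, threshold: float,
              positives: List[str], negatives: List[str]) -> bool:
    """True if every positive reaches the threshold and no negative does."""
    return (
        all(best_window_score(pssm, s) >= threshold for s in positives)
        and all(best_window_score(pssm, s) < threshold for s in negatives)
    )


# Toy matrix over the DNA alphabet that rewards the pattern "ACG".
toy_pssm: PSSM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]

if __name__ == "__main__":
    print(best_window_score(toy_pssm, "TTACGTT"))
    print(separates(toy_pssm, 3.0, ["TTACGTT", "ACGA"], ["TTTT", "AAGG"]))
```

Checking a candidate matrix in this way is cheap; the derivation problem referred to in the abstract is the harder task of finding a matrix and threshold that pass such a check.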
Research Products
(15 results)