2016 Fiscal Year Final Research Report
Statistical theory for string data analysis and its application to computational biochemistry
Project/Area Number |
26610037
|
Research Category |
Grant-in-Aid for Challenging Exploratory Research
|
Allocation Type | Multi-year Fund |
Research Field |
Foundations of mathematics/Applied mathematics
|
Research Institution | Institute of Physical and Chemical Research (2016); Kyoto University (2014-2015) |
Principal Investigator |
Hitoshi Koyano, RIKEN (National Research and Development Agency), Quantitative Biology Center, Researcher (10570989)
|
Project Period (FY) |
2014-04-01 – 2017-03-31
|
Keywords | strings / probability theory / statistics / machine learning / biological sequences / bioinformatics |
Outline of Final Research Achievements |
In this research project, we first proved limit theorems that extend the probability theory we constructed in our previous studies on a noncommutative topological monoid A* of strings. Using these theorems, we then developed a theory of a learning machine that learns in A* under the maximum margin principle, and applied the machine to the prediction of RNA secondary structures and protein-protein interactions to examine its usefulness in practical data analysis. Furthermore, we derived an unsupervised procedure for string clustering by constructing a theory of a mixture model on A*, and proved the optimality of the procedure using the above-mentioned theorems. Lastly, we introduced median and center strings for a distribution on A* and constructed an algorithm that searches for them efficiently.
|
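As a rough illustration of the median and center strings mentioned in the outline, the sketch below computes them for a finite sample rather than for a distribution on A*. It rests on assumptions not stated in the report: the distance is taken to be the unit-cost Levenshtein distance, and candidates are restricted to the sample itself (a "set median" and "set center"). The function names are hypothetical, and this brute-force search is not the efficient algorithm the project constructed.

```python
from typing import Sequence


def levenshtein(s: str, t: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute) between s and t."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, a in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty prefix of t
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a
                curr[j - 1] + 1,         # insert b
                prev[j - 1] + (a != b),  # substitute a by b, or match
            ))
        prev = curr
    return prev[-1]


def set_median(sample: Sequence[str]) -> str:
    """Sample string minimizing the sum of distances to all sample strings."""
    return min(sample, key=lambda c: sum(levenshtein(c, x) for x in sample))


def set_center(sample: Sequence[str]) -> str:
    """Sample string minimizing the maximum distance to any sample string."""
    return min(sample, key=lambda c: max(levenshtein(c, x) for x in sample))


if __name__ == "__main__":
    # Toy RNA-like strings, made up for illustration only.
    sample = ["ACGU", "ACGA", "ACGU", "UCGU"]
    print(set_median(sample), set_center(sample))
```

The median minimizes the total distance to the sample and the center minimizes the worst-case distance, so the two can differ on skewed samples. Each search costs a quadratic number of distance computations in the sample size, which is the kind of cost an efficient search algorithm would aim to avoid.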
Free Research Field |
Applied mathematics, mathematical statistics, bioinformatics
|