Artificial intelligence for sequence similarity search
Project/Area Number |
18K18143
|
Research Category |
Grant-in-Aid for Early-Career Scientists
|
Allocation Type | Multi-year Fund |
Review Section |
Basic Section 62010:Life, health and medical informatics-related
|
Research Institution | Tohoku University |
Principal Investigator |
|
Project Period (FY) |
2018-04-01 – 2021-03-31
|
Project Status |
Completed (Fiscal Year 2020)
|
Budget Amount |
¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2020: ¥520,000 (Direct Cost: ¥400,000, Indirect Cost: ¥120,000)
Fiscal Year 2019: ¥520,000 (Direct Cost: ¥400,000, Indirect Cost: ¥120,000)
Fiscal Year 2018: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
|
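Keywords | Artificial intelligence / Sequence analysis / Biological strings / Strings / Sequence similarity search / Biological sequences / Sequence alignment / Machine learning |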
Outline of Final Research Achievements |
Position-specific substitution matrices (PSSMs) are matrices that encode evolutionary information about amino acids. PSSMs are fundamental inputs for sequence similarity search, evolutionary analysis of amino acids, and related tasks. However, generating a PSSM requires repeated sequence similarity searches against a large database, which is time-consuming. In this study, we developed an artificial intelligence (AI) method, SPBuild, that can reduce PSSM generation time while preserving the information content of the generated PSSMs. To develop SPBuild, we used a recurrent neural network (RNN). In the course of the research, we found that developing AI with existing RNNs takes a long time because of their high time complexity. We therefore developed a novel RNN, YamRNN, which showed better convergence performance than existing RNNs. SPBuild and YamRNN are publicly available.
|
Academic Significance and Societal Importance of the Research Achievements |
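Similarity search for biological strings is one of the most fundamental informatics methods used in medical and biological analysis. Its use has led to many discoveries, and improving its performance is expected to lead to more. The information output by the AI developed in this research is what a sequence similarity search needs in order to run with high performance. Generating this information used to take a very long time; we have made it possible to generate it quickly. In addition, the method used to develop the AI in this research was itself computationally very expensive. As a further development, this research therefore also produced a new underlying technique for building AI at lower computational cost.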
|
Report
(4 results)
Research Products
(2 results)