Construction of a neural network for detecting novel domains from amino acid sequence information only
Project/Area Number |
16500189
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Bioinformatics/Life informatics
|
Research Institution | National University Corporation Tokyo University of Agriculture and Technology |
Principal Investigator |
KURODA Yutaka National University Corporation Tokyo University of Agriculture and Technology, Graduate School, Institute of Symbiotic Science and Technology, Associate Professor (10312240)
|
Co-Investigator (Kenkyū-buntansha) |
TOH Hiroyuki Kyushu University, Medical Institute of Bioregulation, Professor (70192656)
|
Project Period (FY) |
2004 – 2005
|
Project Status |
Completed (Fiscal Year 2005)
|
Budget Amount |
¥3,300,000 (Direct Cost: ¥3,300,000)
Fiscal Year 2005: ¥1,200,000 (Direct Cost: ¥1,200,000)
Fiscal Year 2004: ¥2,100,000 (Direct Cost: ¥2,100,000)
|
Keywords | Domain boundary / Structural domain / Protein structure prediction / Neural Network / Support Vector Machine (SVM) / Ensemble learning / Structure prediction |
Research Abstract |
High-throughput structure determination often requires dissecting multi-domain proteins into structural domains that fold independently and whose structures can be readily determined. Domains are usually identified by sequence similarity to domain databases such as Pfam, CDD or SMART. However, methods that do not rely on sequence similarity are required for detecting novel domains. Such methods need to predict domain regions solely from the information contained in the amino acid sequence of the protein of interest. In this project, we report a neural network, and preliminary results for a Support Vector Machine (SVM), that identify domain boundaries in multi-domain proteins: 1) Domain linker sequences. We constructed a multi-domain protein database based on the SCOP and CATH domain boundary definitions. First, we selected domains that are independently foldable and do not form inter-domain interactions, where interactions are defined by the presence of inter-domain van der Waals contacts, hydrogen bonds and disulfide (SS) bonds. From this set, we further selected domain boundaries that form loops as defined by DSSP. Domain boundaries that fulfilled both conditions were called linkers and were used for training and testing the neural network and the SVM. 2) Neural network. We developed a neural network that recognizes domain linkers. Cross-validation indicates that the prediction efficiency of our neural network is 50-60%, compared with approximately 10% for a random guess. 3) SVM. In addition to the above neural network, we developed a domain linker predictor based on SVMlight. We observed prediction efficiencies similar to those of the neural network, but the training time was one fifth of that needed for the neural network.
|
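The two-condition linker selection described in step 1 of the abstract could be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the `Boundary` record, its field names, the zero-contact threshold, and the choice of which DSSP codes count as "loop" (anything other than H, G, I, E, B) are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List

# Assumption: these DSSP codes are regular secondary structure;
# every other code (T, S, '-', blank) is treated as loop.
REGULAR_SS = set("HGIEB")


@dataclass
class Boundary:
    """Hypothetical record for one domain boundary (not from the source)."""
    start: int         # first boundary residue, 0-based index into the DSSP string
    end: int           # last boundary residue, inclusive
    vdw_contacts: int  # inter-domain van der Waals contacts
    hbonds: int        # inter-domain hydrogen bonds
    ss_bonds: int      # inter-domain disulfide bonds


def is_loop(dssp: str, start: int, end: int) -> bool:
    """True if every residue of the boundary segment is non-regular per DSSP."""
    segment = dssp[start:end + 1]
    return len(segment) > 0 and all(code not in REGULAR_SS for code in segment)


def is_linker(b: Boundary, dssp: str) -> bool:
    """Condition 1: no inter-domain interactions (assumed threshold: zero).
    Condition 2: the boundary forms a loop."""
    no_interaction = b.vdw_contacts == 0 and b.hbonds == 0 and b.ss_bonds == 0
    return no_interaction and is_loop(dssp, b.start, b.end)


def select_linkers(boundaries: List[Boundary], dssp: str) -> List[Boundary]:
    """Keep only the boundaries that satisfy both conditions."""
    return [b for b in boundaries if is_linker(b, dssp)]


# Toy example with a made-up DSSP string.
dssp_string = "HHHH--TTS-EEEE"
candidates = [
    Boundary(4, 9, 0, 0, 0),  # loop, no contacts
    Boundary(3, 9, 0, 0, 0),  # overlaps a helix residue
    Boundary(4, 9, 2, 0, 0),  # loop, but has inter-domain contacts
]
linkers = select_linkers(candidates, dssp_string)
```

In this toy input only the first candidate would be kept. The abstract does not state the actual contact thresholds or loop definition used, so those would need to be taken from the project reports.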
Report
(3 results)
Research Products
(20 results)