Budget Amount |
¥3,880,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥480,000)
Fiscal Year 2007: ¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000)
Fiscal Year 2006: ¥1,800,000 (Direct Cost: ¥1,800,000)
|
Research Abstract |
The prediction of structural domains in novel protein sequences is becoming practically important. One important application is the development of computer-aided techniques for identifying, at low cost, novel protein domain targets for large-scale functional and structural proteomics. Traditional computer-aided methods rely on sequence similarity to entries in domain databases such as Pfam, Prosite or SMART. However, detecting novel domains requires methods that work independently of domain databases. Such methods need to predict domain regions solely from the information contained in the amino acid sequence of the protein of interest. In this project, we report a Support Vector Machine (SVM) that identifies domain linkers, which are loops separating two structural domains. We constructed a multi-domain protein database based on the SCOP and CATH domain boundary definitions. First, we selected independently foldable domains that form no inter-domain interactions, where an interaction is defined by the presence of inter-domain van der Waals contacts, hydrogen bonds or disulfide bonds. From this set, we further selected domain boundaries that form loops as defined by DSSP. Domain boundaries that fulfilled both conditions were called linkers and were used for training and testing the SVM. We developed a domain linker predictor (DLP-SVM) based on SVMlight. The sensitivity and specificity were 46.8% and 57.1%, respectively. These values are more than 5.1% and 6.8% higher, respectively, than those of previously reported methods. DLP-SVM is freely available (http://www.tuat.ac.jp/~domserv/cgi-bin/DLP-SVM.cgi). A travel grant from the Protein Science Society of Japan was awarded to PhD student Teippei Ebina for presenting this research at the PRICIPS 2008 conference held in Cairns, Australia, in June 2008.
|
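The two-step linker selection described in the abstract can be illustrated with a minimal sketch. It is not the project's actual code. It assumes that a DSSP assignment is available as a per-residue string, that the codes H, G, I, E and B count as regular secondary structure while every other code counts as loop, and that a caller-supplied function (the hypothetical `has_interdomain_contact`) reports whether the two domains at a boundary interact. The names `loop_segment` and `select_linkers` are also illustrative.

```python
# DSSP codes treated as regular secondary structure (an assumption of this sketch);
# every other code (T, S, '-', blank, ...) is counted as loop.
REGULAR = set("HGIEB")


def loop_segment(dssp, boundary):
    """Return (start, end) of the loop containing `boundary`, or None.

    `dssp` is a per-residue DSSP string, `boundary` a 0-based residue index,
    and `end` is exclusive.
    """
    if dssp[boundary] in REGULAR:
        return None  # the boundary lies inside a helix or strand
    start = boundary
    while start > 0 and dssp[start - 1] not in REGULAR:
        start -= 1
    end = boundary + 1
    while end < len(dssp) and dssp[end] not in REGULAR:
        end += 1
    return start, end


def select_linkers(dssp, boundaries, has_interdomain_contact):
    """Keep boundaries that meet both conditions: no inter-domain contact, and a loop."""
    linkers = []
    for b in boundaries:
        if has_interdomain_contact(b):  # hypothetical callback, supplied by the caller
            continue
        segment = loop_segment(dssp, b)
        if segment is not None:
            linkers.append(segment)
    return linkers


if __name__ == "__main__":
    dssp = "HHHH-TS-EEEE"  # toy assignment: helix, loop, strand
    print(select_linkers(dssp, [2, 5], lambda b: False))
```

In this toy example only the boundary at index 5 falls in a loop, so a single segment covering residues 4 to 7 is kept. A real pipeline would derive the contact test from atomic coordinates and the boundaries from SCOP and CATH.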