2016 Fiscal Year Final Research Report
An approach for eliminating chance correlations and its application to pharmaceutical data.
Project/Area Number |
25460035
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Physical pharmacy
|
Research Institution | Osaka University |
Principal Investigator |
|
Project Period (FY) |
2013-04-01 – 2017-03-31
|
Keywords | Chance Correlation / L1 Regularization / L2 Regularization / Ridge Regression / Elastic Net / Hydrolyzability / Classification / Data Mining |
Outline of Final Research Achievements |
We attempted to develop a novel method for eliminating "chance correlation" descriptors, which appear when supervised learning is applied. We found that a combined method using data classification and regression gave better results for artificial data. However, we also found that an appropriate combination of L1 and L2 regularization provided better predictability for real data sets, which showed simpler data structures. Following Ockham's principle, we therefore adopted the elastic net and similar methods to eliminate chance correlation descriptors. When the latter combined method was applied to predicting the hydrolyzabilities of esters, amides, etc., it showed the best predictability when L2 regularization was carried out after L1 regularization (for esters, the correct classification rate was 89%). We therefore conclude that the former method gives better predictability for complex data, while the latter is better for simpler data.
|
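The idea of applying L2 regularization after L1 regularization can be illustrated with a minimal sketch. The report does not describe its implementation, so everything below is an assumption made for illustration: the function names (`elastic_net_cd`, `l1_then_l2`), the coordinate-descent solver, the penalty strengths, and the synthetic data, in which only the first two of eight descriptors actually influence the response. The L1 stage is intended to drop descriptors that correlate with the response only by chance, and the L2 stage refits the surviving descriptors with ridge shrinkage.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_cd(X, y, l1, l2, n_iter=200):
    """Coordinate descent for the objective
    (1/2n)*||y - Xw||^2 + l1*||w||_1 + (l2/2)*||w||^2  (no intercept).
    l2=0 gives the lasso, l1=0 gives ridge regression."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, since w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of descriptor j with the residual that excludes
            # descriptor j's own current contribution.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, l1) / (col_sq[j] + l2)
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_wj
    return w


def l1_then_l2(X, y, l1, l2):
    """Stage 1: an L1-only fit selects descriptors.
    Stage 2: an L2-only (ridge) refit on the selected descriptors."""
    w_l1 = elastic_net_cd(X, y, l1=l1, l2=0.0)
    selected = [j for j, wj in enumerate(w_l1) if wj != 0.0]
    if not selected:
        return selected, []
    X_sel = [[row[j] for j in selected] for row in X]
    return selected, elastic_net_cd(X_sel, y, l1=0.0, l2=l2)


# Synthetic data (an assumption, not the report's data): 8 descriptors,
# of which only descriptors 0 and 1 drive the response.
rng = random.Random(0)
n_samples, n_descriptors = 200, 8
X = [[rng.gauss(0.0, 1.0) for _ in range(n_descriptors)] for _ in range(n_samples)]
y = [2.0 * row[0] - 3.0 * row[1] + rng.gauss(0.0, 0.1) for row in X]

selected, coefs = l1_then_l2(X, y, l1=0.3, l2=0.01)
print("selected descriptors:", selected)
print("ridge coefficients:", coefs)
```

Calling `elastic_net_cd` with both `l1` and `l2` nonzero applies the two penalties simultaneously, which corresponds to the elastic net named in the keywords. The two-stage variant is shown here only because the outline reports the best predictability when L2 regularization followed L1 regularization.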
Free Research Field |
Pharmacometrics, chemometrics, computational chemistry
|