2000 Fiscal Year Final Research Report Summary
Mathematical Modeling and Stochastic Sensitivity Analysis for Data Mining
Project/Area Number |
11680435
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Social Systems Engineering
|
Research Institution | University of Tsukuba |
Principal Investigator |
KODA Masato, University of Tsukuba, Institute of Policy and Planning Sciences, Professor (20114473)
|
Co-Investigator (Kenkyū-buntansha) |
SUZUKI Hideo, University of Tsukuba, Institute of Policy and Planning Sciences, Assistant Professor (Lecturer) (10282328)
YOSHIDA Taketoshi, JAIST, School of Knowledge Science, Associate Professor (80293398)
|
Project Period (FY) |
1999 – 2000
|
Keywords | Data Mining / Neural Network / Stochastic Sensitivity Analysis / Bootstrap Method / Minimum Description Length (MDL) |
Research Abstract |
We studied the mathematical modeling and stochastic sensitivity analysis techniques required to develop advanced machine-learning systems for data mining, and obtained the following results:
1. A new stochastic learning algorithm for neural networks: Based on a functional-derivative formulation of the gradient descent method, combined with stochastic sensitivity analysis techniques using a variational approach, a novel stochastic learning algorithm using Gaussian white noise was developed for a class of discrete-time neural networks. Unlike the back-propagation algorithm, the proposed method does not require the synchronous transmission of information backward along connection weights. It uses only the ubiquitous noise inherent in the network, together with local signals, to achieve simple sequential updating of connection weights.
2. Bootstrap re-sampling for unbalanced data in supervised learning: A technical framework using bootstrap techniques was developed to assess the impact of re-sampling on the generalization ability of supervised learning. Based on the bootstrap expression of the prediction error, the proposed method enables identification of the optimal re-sampling proportion for an unbalanced data set. An analysis was also conducted to extend the proposed method to cross-validation.
3. Applications to manufacturing scheduling and processes: Data mining techniques for assessing the association or closeness of dispatching rules were studied in order to develop optimal manufacturing schedules. The Minimum Description Length (MDL) criterion was also studied as a means of discovering unnatural patterns or events in manufacturing processes. The results indicate that data mining techniques will play an essential role in production scheduling and statistical process control.
|
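The bootstrap re-sampling result in the abstract (choosing a re-sampling proportion for an unbalanced data set from a bootstrap estimate of the prediction error) can be illustrated with a minimal sketch. This is not the authors' method: the synthetic one-dimensional data, the Gaussian classifier, the two-stage re-sampling scheme, the out-of-bag error estimate, the use of balanced error as the error measure, and all function names are assumptions made only for illustration.

```python
import math
import random


def make_unbalanced_data(n_major, n_minor, rng):
    """Synthetic 1-D data: class 0 ~ N(0, 1), class 1 ~ N(2, 1). Purely illustrative."""
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_major)]
    data += [(rng.gauss(2.0, 1.0), 1) for _ in range(n_minor)]
    return data


def fit(train):
    """Fit one Gaussian per class; class priors come from the training frequencies."""
    model = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs) + 1e-6  # avoid zero variance
        model[label] = (math.log(len(xs) / len(train)), mean, var)
    return model


def predict(model, x):
    """Return the label with the larger log prior plus Gaussian log-likelihood."""
    def score(label):
        log_prior, mean, var = model[label]
        return log_prior - 0.5 * math.log(var) - (x - mean) ** 2 / (2.0 * var)
    return 1 if score(1) > score(0) else 0


def balanced_error(model, test):
    """Average of the per-class misclassification rates (assumed error measure)."""
    rates = []
    for label in (0, 1):
        xs = [x for x, y in test if y == label]
        rates.append(sum(predict(model, x) != label for x in xs) / len(xs))
    return sum(rates) / len(rates)


def bootstrap_balanced_error(data, minority_share, n_boot, rng):
    """Out-of-bag bootstrap estimate of the balanced error for one re-sampling proportion."""
    n = len(data)
    n_minor = min(n - 1, max(1, round(minority_share * n)))
    errors = []
    for _ in range(n_boot):
        # Stage 1: ordinary bootstrap draw; points never drawn form the out-of-bag test set.
        drawn = [rng.randrange(n) for _ in range(n)]
        in_bag = set(drawn)
        oob = [data[i] for i in range(n) if i not in in_bag]
        bag_minor = [data[i] for i in drawn if data[i][1] == 1]
        bag_major = [data[i] for i in drawn if data[i][1] == 0]
        # Skip the rare replicate in which a class is missing on either side.
        if not bag_minor or not bag_major or {y for _, y in oob} != {0, 1}:
            continue
        # Stage 2: re-sample the in-bag points to the requested class proportion.
        train = rng.choices(bag_minor, k=n_minor) + rng.choices(bag_major, k=n - n_minor)
        errors.append(balanced_error(fit(train), oob))
    if not errors:
        raise ValueError("no usable bootstrap replicate")
    return sum(errors) / len(errors)


if __name__ == "__main__":
    rng = random.Random(0)
    data = make_unbalanced_data(200, 20, rng)
    shares = (0.05, 0.1, 0.2, 0.3, 0.5)
    estimates = {s: bootstrap_balanced_error(data, s, 200, rng) for s in shares}
    for s in shares:
        print(f"minority share {s:.2f}: estimated balanced error {estimates[s]:.3f}")
    print("selected share:", min(estimates, key=estimates.get))
```

Evaluating on out-of-bag points keeps the test observations disjoint from the re-sampled training set, so over-sampling the minority class cannot leak test points into training. A different classifier or error measure could be substituted without changing the outer loop.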
Research Products
(10 results)