A Statistical Study on Bayes Statistics and Ensemble Learning
Project/Area Number | 14084210 |
Research Category | Grant-in-Aid for Scientific Research on Priority Areas |
Allocation Type | Single-year Grants |
Review Section | Science and Engineering |
Research Institution | Waseda University |
Principal Investigator | MURATA Noboru, Waseda University, Faculty of Science and Engineering, Department of Electrical Engineering and Bioscience, Professor (60242038) |
Co-Investigator (Kenkyū-buntansha) | IKEDA Kazushi, Kyoto University, Graduate School of Informatics, Department of Systems Science, Associate Professor (10262552) |
Project Period (FY) | 2002 – 2005 |
Project Status | Completed (Fiscal Year 2005) |
Budget Amount | ¥7,000,000 (Direct Cost: ¥7,000,000)
Fiscal Year 2005: ¥800,000 (Direct Cost: ¥800,000)
Fiscal Year 2004: ¥1,200,000 (Direct Cost: ¥1,200,000)
Fiscal Year 2003: ¥3,200,000 (Direct Cost: ¥3,200,000)
Fiscal Year 2002: ¥1,800,000 (Direct Cost: ¥1,800,000) |
Keywords | statistical learning theory / ensemble learning / Bayes statistics / learning algorithm / information geometry / boosting / Bregman divergence |
Research Abstract |
To study boosting algorithms, we consider the structure of a space of general learning models that is naturally induced by a Bregman divergence. We discuss and clarify the statistical properties of this framework: robustness against noise and outliers, asymptotic efficiency as a function of the size of the training sample and of the learning model, and the Bayes optimality and consistency of the convex functions that induce Bregman divergences. Based on these considerations, we have proposed a new generic class of boosting algorithms called "U-Boost".

Moreover, extending boosting algorithms to density estimation, we have proposed an algorithm for regression problems. In this algorithm, Gaussian processes in reproducing kernel Hilbert spaces serve as regressors, and estimating functions based on Bregman divergences are used for inference. Our study revealed a close relationship between boosting algorithms and support vector machines; we have therefore also studied the generalization error of support vector machines from an algebraic and geometric viewpoint.

For practical applications, we have addressed the following problems. To avoid the explosion in the number of parameters that frequently occurs when estimating a huge probability table in graphical models and Bayesian networks, we constructed a mixture model based on the concept of ensemble learning. The model consists of simple tables and attains fairly good generalization error. We discussed an estimation algorithm for this model, an extension of the EM algorithm derived from the viewpoint of information geometry. We also worked on an on-line boosting algorithm, so that boosting can be applied to learning problems such as reinforcement learning, in which data are observed one after another. We considered methods for reconstructing the objective function from sequentially obtained data and compared them with ordinary off-line boosting algorithms.
|
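For reference, the Bregman divergence that structures the model space in the abstract is the standard one: for a differentiable, strictly convex function \(\phi\),

\[
D_\phi(p, q) \;=\; \phi(p) - \phi(q) - \langle \nabla\phi(q),\, p - q \rangle \;\ge\; 0 .
\]

For probability vectors, \(\phi(p) = \sum_i p_i \log p_i\) yields the Kullback-Leibler divergence, and \(\phi(p) = \|p\|_2^2\) yields the squared Euclidean distance. The specific convex functions \(U\) studied in the project are not reproduced here.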
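To make the boosting side concrete, below is a minimal sketch of a boosting loop of the kind the U-Boost family generalizes: plain AdaBoost with decision stumps, which is usually cited as the member induced by the exponential convex function. All names (fit_stump, adaboost) are illustrative, and this is not the project's actual U-Boost code.

```python
import numpy as np

def fit_stump(X, y, w):
    """Find the threshold stump minimizing weighted 0-1 error.
    y is in {-1, +1}; w is a normalized weight vector."""
    best = (np.inf, 0, 0.0, 1)                  # (error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = s * np.where(X[:, j] <= t, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, t, s)
    return best

def adaboost(X, y, T=20):
    n = len(y)
    w = np.full(n, 1.0 / n)                     # example weights
    ensemble = []                               # (alpha, feature, threshold, polarity)
    for _ in range(T):
        err, j, t, s = fit_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)   # weak learner's coefficient
        pred = s * np.where(X[:, j] <= t, 1, -1)
        w *= np.exp(-alpha * y * pred)          # exponential-loss reweighting
        w /= w.sum()
        ensemble.append((alpha, j, t, s))
    return ensemble

def predict(ensemble, X):
    F = sum(a * s * np.where(X[:, j] <= t, 1, -1) for a, j, t, s in ensemble)
    return np.sign(F)
```

Replacing the exponential reweighting with the derivative of another convex function changes the loss being minimized, which is the degree of freedom the U-Boost framework exploits.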
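The "mixture of simple tables" idea can likewise be illustrated by a hedged sketch: approximating a joint probability table over d discrete variables by a mixture of K fully factorized (product-form) tables fitted with plain EM. This shows the concept only; the project's actual model and its information-geometric EM extension may differ, and em_mixture_of_tables is a hypothetical name.

```python
import numpy as np

def em_mixture_of_tables(X, K, n_vals, iters=50, rng=np.random.default_rng(0)):
    """X: (n, d) integer array with entries in {0, ..., n_vals-1}."""
    n, d = X.shape
    pi = np.full(K, 1.0 / K)                        # mixture weights
    theta = rng.dirichlet(np.ones(n_vals), (K, d))  # theta[k, j] = P(x_j = . | k)
    for _ in range(iters):
        # E-step: log p(x_i | k) = sum_j log theta[k, j, x_ij]
        logp = np.zeros((n, K))
        for j in range(d):
            logp += np.log(theta[:, j, X[:, j]]).T  # theta[:, j, X[:, j]]: (K, n)
        log_r = np.log(pi)[None, :] + logp
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)           # responsibilities, (n, K)
        # M-step: update mixture weights and per-variable tables
        pi = r.mean(axis=0)
        for k in range(K):
            for j in range(d):
                counts = np.bincount(X[:, j], weights=r[:, k], minlength=n_vals)
                theta[k, j] = (counts + 1e-9) / (counts.sum() + 1e-9 * n_vals)
    return pi, theta
```

A full table over d variables needs n_vals**d parameters, while this mixture needs only K * (d * n_vals + 1), which is the parameter-explosion trade-off the abstract refers to.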
Report (5 results)
Research Products (18 results)