Development of statistical models for knowledge acquisition from large-scale data including multiform samples
Project/Area Number | 13680507 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Social systems engineering |
Research Institution | Gunma University |
Principal Investigator | SEKI Yoichi Gunma University, Department of Engineering, Professor (90196949) |
Project Period (FY) | 2001 – 2003 |
Project Status | Completed (Fiscal Year 2003) |
Budget Amount |
¥3,100,000 (Direct Cost: ¥3,100,000)
Fiscal Year 2003: ¥700,000 (Direct Cost: ¥700,000)
Fiscal Year 2002: ¥500,000 (Direct Cost: ¥500,000)
Fiscal Year 2001: ¥1,900,000 (Direct Cost: ¥1,900,000)
|
Keywords | data mining / tree regression analysis / nonparametric regression / minimum description length / nearest neighborhood / POS data with customer ID / SOM / Nearest Neighbor / interaction effect / POS data |
Research Abstract |
On the theoretical side, we proposed the following models and developed prototype programs for these methods in the statistical language S.
(1) We proposed a recursive partitioning linear model, which can be transformed into an ordinary linear model, based on the tree regression model with linear terms (Seki & Tsutsui 98).
(2) We proposed a process monitoring chart for detecting whether irregular deterioration has occurred in parts, on the assumption that advances in sensor technology allow the deterioration characteristic to be monitored continuously.
(3) We proposed a method for estimating a nearest-neighbor type non-parametric regression model by optimizing the weights of a weighted Euclidean distance, so that the differing strengths of the explanatory variables' effects are taken into account.
(4) We proposed a method that stratifies the sample set by identifying classes within which the relation between the response variable and the explanatory variables is similar, for the case where two sets of explanatory variables are given: one used to stratify the sample set and the other used to regress the response variable.
(5) We proposed a non-parametric test for comparing dose levels that uses the Minimum Description Length criterion.
To verify the proposed methodology on POS (point-of-sale) data with customer IDs, we participated in the data analysis competitions sponsored by the Operations Research Society of Japan and others, and obtained the following results.
2001: a food supermarket's data, competition championship
2002: a department store's data, Sectional-meeting fighting spirit award
2003: three department stores' data, Sectional-meeting superior prize
|
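As an illustration of item (3) in the abstract, the idea of nearest-neighbor regression under a weighted Euclidean distance whose weights are tuned from the data might be sketched as follows. This is not the project's program (the prototypes were written in S); it is a minimal Python sketch under stated assumptions. The leave-one-out squared-error criterion, the number of neighbors `k`, the candidate weight grid, the coordinate-wise search, and all function names are assumptions made for illustration, since the abstract does not state how the weights were optimized.

```python
import numpy as np

def loo_knn_predict(X, y, w, k=5):
    """Leave-one-out k-NN predictions under a weighted Euclidean distance."""
    diff = X[:, None, :] - X[None, :, :]      # pairwise differences, shape (n, n, p)
    d2 = (diff ** 2 * w).sum(axis=2)          # weighted squared distances, shape (n, n)
    np.fill_diagonal(d2, np.inf)              # a point may not be its own neighbor
    nn = np.argsort(d2, axis=1)[:, :k]        # indices of the k nearest neighbors
    return y[nn].mean(axis=1)                 # average response over the neighbors

def loo_mse(X, y, w, k=5):
    """Leave-one-out mean squared error for a given weight vector."""
    return float(np.mean((y - loo_knn_predict(X, y, w, k)) ** 2))

def fit_weights(X, y, k=5, grid=(0.0, 0.25, 1.0, 4.0), sweeps=2):
    """Coordinate-wise grid search (an assumed, simple optimizer) for the weights."""
    w = np.ones(X.shape[1])                   # start from the ordinary Euclidean distance
    best = loo_mse(X, y, w, k)
    for _ in range(sweeps):
        for j in range(X.shape[1]):
            for g in grid:
                trial = w.copy()
                trial[j] = g
                if trial.sum() == 0:          # all-zero weights define no distance
                    continue
                err = loo_mse(X, y, trial, k)
                if err < best:                # accept only strict improvements
                    best, w = err, trial
    return w, best

# Synthetic data (hypothetical): only the first explanatory variable affects y.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=200)

w_hat, mse_hat = fit_weights(X, y)
mse_equal = loo_mse(X, y, np.ones(3))         # baseline: equal weights
print("weights:", w_hat, "LOO MSE:", mse_hat, "equal-weight LOO MSE:", mse_equal)
```

Because the search starts from equal weights and accepts only improvements, the tuned distance never has a worse leave-one-out error than the unweighted one on the same data. A gradient-based or continuous optimizer could replace the grid search without changing the overall idea.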
Report
(4 results)
Research Products
(14 results)