2023 Fiscal Year Final Research Report
Machine learning from incomplete information table by rule generation and its application
Project/Area Number |
20K11954
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Review Section |
Basic Section 61030: Intelligent informatics-related
|
Research Institution | Kyushu Institute of Technology |
Principal Investigator |
Sakai Hiroshi, Kyushu Institute of Technology, Faculty of Engineering, Professor (60201513)
|
Project Period (FY) |
2020-04-01 – 2024-03-31
|
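Keywords | Tabular data analysis / Rule generation / Incomplete information / NIS-Apriori algorithm / Missing value imputation / Data mining / Machine learning by rule generation / Rough sets and granular computing |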
Outline of Final Research Achievements |
Using our implementation of the NIS-Apriori method, which generates certain rules from an incomplete information table (NIS), we studied a method for imputing missing values in tabular data. If attribute A of instance x is missing, we generate certain rules with A as the decision attribute and fill in the missing value with the conclusion of the strongest certain rule that matches x. We believe this framework to be novel, and we built a new execution environment for it. In cross-validation experiments, the imputation was not uniformly accurate. However, when there was a strong dependency between attribute A and another attribute, the method recovered the true value with high accuracy. For example, the Congressional Voting data in the UCI repository yields strong certain rules, and our method imputed the true value with 93% accuracy.
|
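The imputation idea in the outline above can be illustrated with a minimal Python sketch. This is not the NIS-Apriori implementation: it is a simplified stand-in that counts rules only over rows whose relevant values are known, and it assumes that "strongest" means highest accuracy with ties broken by support. The names `generate_rules`, `impute`, `MISSING`, the thresholds, and the toy table are all hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations

MISSING = None  # assumed marker for a missing value


def generate_rules(rows, target, min_support=0.2, min_accuracy=0.8, max_len=2):
    """Return rules (condition, value, support, accuracy) that predict `target`.

    A condition is a tuple of (attribute, value) pairs. Only rows in which the
    target and every condition attribute are known contribute to the counts.
    """
    attrs = [a for a in rows[0] if a != target]
    n = len(rows)
    rules = []
    for k in range(1, max_len + 1):
        for combo in combinations(attrs, k):
            counts = defaultdict(Counter)  # condition -> counts of target values
            for r in rows:
                if r[target] is MISSING or any(r[a] is MISSING for a in combo):
                    continue
                cond = tuple((a, r[a]) for a in combo)
                counts[cond][r[target]] += 1
            for cond, ctr in counts.items():
                total = sum(ctr.values())
                for value, hits in ctr.items():
                    support = hits / n
                    accuracy = hits / total
                    if support >= min_support and accuracy >= min_accuracy:
                        rules.append((cond, value, support, accuracy))
    return rules


def impute(row, target, rules):
    """Return the conclusion of the strongest rule whose condition matches row."""
    matching = [r for r in rules if all(row.get(a) == v for a, v in r[0])]
    if not matching:
        return MISSING  # no rule applies, so the value stays missing
    # Assumption: strongest = highest accuracy, ties broken by support.
    best = max(matching, key=lambda r: (r[3], r[2]))
    return best[1]


# Toy table loosely modeled on voting records (made-up values).
rows = [
    {"vote1": "y", "vote2": "n", "party": "democrat"},
    {"vote1": "y", "vote2": "n", "party": "democrat"},
    {"vote1": "y", "vote2": "y", "party": "democrat"},
    {"vote1": "n", "vote2": "y", "party": "republican"},
    {"vote1": "n", "vote2": "y", "party": "republican"},
    {"vote1": "n", "vote2": MISSING, "party": MISSING},
]
rules = generate_rules(rows, target="party")
print(impute(rows[5], "party", rules))  # prints "republican"
```

In this toy table, `vote1` determines `party` exactly, so the last row is filled from the rule `vote1 = n => party = republican`. When no sufficiently strong rule matches, the sketch leaves the value missing, which mirrors the observation that imputation is reliable only where strong dependencies exist.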
Free Research Field |
Data science
|
Academic Significance and Societal Importance of the Research Achievements |
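The problem of missing values in tabular data has been studied for a long time, and statistical methods are the main tools used to address it. However, when a table holds categorical values, such as blood types, statistics like the mean and variance may not be well defined. The imputation method proposed here targets categorical values, and we expect it to lead to new missing value imputation methods for cases where statistical approaches are a poor fit.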
|