2023 Fiscal Year Final Research Report

A Study on Statistical Interpretation Methods for Machine Learning Results Using Shapley Values

Project/Area Number	20K11938
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 61030:Intelligent informatics-related
Research Institution	Kumamoto University
Principal Investigator	NOHARA Yasunobu  Kumamoto University, Faculty of Advanced Science and Technology (Engineering), Associate Professor (30624829)
Co-Investigator (Kenkyū-buntansha)	MATSUMOTO Kotaro (松本晃太郎)  Kurume University, Affiliated Research Institute, Lecturer (60932217)
Project Period (FY)	2020-04-01 – 2024-03-31
Keywords	Machine learning / Interpretation methods / Shapley value / Quantification of interpretability / Variable importance
Outline of Final Research Achievements	In recent years, machine learning technologies, including deep learning, have been gaining attention and are increasingly being deployed. However, there is strong demand for explainability and interpretability of the results these technologies produce. This study applies the Shapley value, a method from economics for fairly distributing profit among multiple collaborators, to research on interpretation methods for machine learning models. First, we propose a method to quantitatively evaluate interpretability based on how accurately the model can be interpreted. Then, we theoretically prove that, in the absence of correlation among features, selecting features in descending order of the variance of their Shapley values maximizes interpretability when the number of usable features is limited.
Free Research Field	Machine learning
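Academic Significance and Societal Importance of the Research Achievements	In existing interpretation methods for machine learning, variable importance, an indicator of which explanatory variables matter, has been used empirically and has lacked theoretical backing. The major academic significance of this study is that it proposes a method for quantitatively evaluating interpretability, meaning how accurately a model can be interpreted, and proposes a feature selection method that is theoretically backed by the fact that it maximizes that interpretability. In recent years, machine learning has begun to be used in a wide range of fields. Particularly in fields where errors have serious consequences, such as disease diagnosis and autonomous driving, it is important to explain why a machine learning system produced a given output. This study therefore has great societal significance for applying machine learning widely across society.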
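
The feature-ranking rule stated in the outline (order features by the variance of their Shapley values) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy linear model, the factorial data grid, and every function name below are hypothetical and chosen only for illustration. The sketch assumes an interventional value function (features outside the coalition are replaced by values from background rows) and computes exact Shapley values by enumerating all feature orderings, which is only practical for a handful of features. The report's interpretability measure is not specified in the outline, so only the ranking step is shown.

```python
import itertools
import statistics


def shapley_values(model, x, background):
    """Exact Shapley values of one sample x.

    Assumed value function: v(S) is the mean model output when features
    in S take their values from x and all others come from a background row.
    """
    n = len(x)

    def value(subset):
        total = 0.0
        for b in background:
            z = [x[j] if j in subset else b[j] for j in range(n)]
            total += model(z)
        return total / len(background)

    phi = [0.0] * n
    orderings = list(itertools.permutations(range(n)))
    for order in orderings:
        seen = set()
        prev = value(seen)
        for j in order:
            seen.add(j)
            cur = value(seen)
            phi[j] += cur - prev  # marginal contribution of feature j
            prev = cur
    return [p / len(orderings) for p in phi]


def shapley_variances(model, data, background):
    """Variance, over all samples, of each feature's Shapley value."""
    phis = [shapley_values(model, x, background) for x in data]
    n = len(data[0])
    return [statistics.pvariance([p[j] for p in phis]) for j in range(n)]


def rank_features(variances):
    """Feature indices in descending order of Shapley-value variance."""
    return sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)


def toy_model(z):
    # Hypothetical linear model used only for this illustration.
    return 3.0 * z[0] + 0.5 * z[1] + 1.0 * z[2]


# Full factorial grid, so the three features are exactly uncorrelated.
data = list(itertools.product([-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0]))

variances = shapley_variances(toy_model, data, background=data)
ranking = rank_features(variances)
print(ranking)  # [0, 2, 1]
```

With a budget of k usable features, the rule in the outline would keep the first k entries of `ranking`. In this toy setting, feature 2 outranks feature 1 because its wider spread gives its Shapley values a larger variance, and it would outrank feature 1 on that basis even if the two coefficients were equal.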