2016 Fiscal Year Final Research Report

Similarity Measures for Nearest Neighbor Search and Classification Methods in High Dimensional and Large Number of Data

Research Project

Project/Area Number	25730142
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Multi-year Fund
Research Field	Intelligent informatics
Research Institution	Yamagata University (2015-2016) National Institute of Genetics (2013-2014)
Principal Investigator	Suzuki Ikumi Yamagata University, Graduate School of Science and Engineering, Assistant Professor (20637730)
Project Period (FY)	2013-04-01 – 2016-03-31
Keywords	hubness / hub reduction / centering / nearest neighbor method
Outline of Final Research Achievements	Hubness, a phenomenon that occurs in high-dimensional datasets as a result of the curse of dimensionality, has recently attracted the attention of researchers in the artificial intelligence community, especially in data mining and machine learning. In this work, we pointed out that hubness influences the performance of k-nearest neighbor (k-NN) methods. We reported that subtracting the mean vector from each sample (centering) is a simple yet very effective way to improve k-NN classification. We also proved that centering is effective for k-NN methods because it reduces hubs in a dataset.
Free Research Field	Statistical machine learning
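
The centering step described in the outline of achievements can be sketched as follows. This is a minimal illustration, not the project's own code: the inner-product similarity, the synthetic Gaussian data, the function names (`center`, `k_occurrence`, `skewness`), and the use of the skewness of the k-occurrence counts as a hubness indicator are all assumptions made for this example.

```python
import numpy as np


def center(X):
    """Subtract the dataset mean vector from every sample (centering)."""
    return X - X.mean(axis=0, keepdims=True)


def k_occurrence(X, k):
    """Count how often each sample appears in the k-NN lists of the others.

    Similarity is the inner product (an assumption of this sketch).
    Samples with unusually large counts are the "hubs".
    """
    S = X @ X.T
    np.fill_diagonal(S, -np.inf)  # a sample is never its own neighbor
    # Indices of the k most similar samples in each row (order within k is arbitrary).
    nn = np.argpartition(-S, k - 1, axis=1)[:, :k]
    return np.bincount(nn.ravel(), minlength=X.shape[0])


def skewness(counts):
    """Sample skewness of the k-occurrence counts; larger means more hubness."""
    c = counts - counts.mean()
    return (c ** 3).mean() / (c ** 2).mean() ** 1.5


# Hypothetical data: 500 samples in 100 dimensions with a non-zero mean.
rng = np.random.default_rng(0)
X = rng.normal(loc=1.0, scale=1.0, size=(500, 100))
k = 10

before = skewness(k_occurrence(X, k))
after = skewness(k_occurrence(center(X), k))
print(f"k-occurrence skewness before centering: {before:.3f}")
print(f"k-occurrence skewness after centering:  {after:.3f}")
```

Comparing the two printed values gives a rough picture of how centering changes the hub structure under these assumptions. Note that pairwise Euclidean distances are unchanged by subtracting a common vector, so a comparison of this kind is only informative for similarity measures that depend on the origin, such as the inner product or cosine similarity.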