2018 Fiscal Year Annual Research Report

Data Mining for Graphs and Networks via Local Intrinsic Dimensional Modeling

Research Project

Project/Area Number	18H03296
Research Institution	National Institute of Informatics
Principal Investigator	Michael E. Houle, National Institute of Informatics, Department of an Inter-University Research Institute, etc., Visiting Professor (90399270)
Project Period (FY)	2018-04-01 – 2021-03-31
Keywords	High-dimensional space / Extreme value theory / Data mining / Machine learning / Neural networks
Outline of Annual Research Achievements
1. Publication of a refereed paper at the top-tier International Conference on Learning Representations (ICLR 2018). This paper presented a characterization of corrupted examples in adversarial attacks on classification systems, together with a practical detection method, both based on the local intrinsic dimensionality (LID) model central to this project. (Acceptance rate: 2.5%)
2. Publication of a refereed paper at the top-tier International Conference on Machine Learning (ICML 2018). This paper demonstrated that the progress of learning in deep neural network (DNN) classifiers is strongly correlated with a drop in LID at the deep feature level, and showed how to use this effect to prevent overtraining and overfitting to data. (World first in automatic detection and avoidance of overtraining during DNN learning.)
3. Four other refereed publications at international conferences: three at SISAP 2018 (on the correlation between LID and outlierness, the use of LID to accelerate data fingerprinting, and an adaptation of LID to model the local growth rate of search neighborhoods within graphs) and one at WIMS 2018 (an examination of the effect of reverse neighborhood imbalance in similarity graph construction).
4. One refereed publication in a top international journal, expanding on earlier work on LID estimation.
5. Two top-tier international conference publications accepted for presentation in FY 2019, on tight LID estimation and on the use of LID to generate more realistic neighboring examples for explaining DNN classification.
Current Status of Research Progress	1: Research has progressed more than originally planned.
Reason: The ICLR 2018 paper on adversarial characterization and detection and the ICML 2018 paper on the characterization and prevention of overfitting both use the local intrinsic dimensionality (LID) model for the theoretical explanation and practical management of deep learning classification. Both papers have made an impact at the highest levels of the field, and have introduced the LID model to the full international research community much sooner, and on a larger scale, than expected. In the first 13 months since its publication, the ICLR paper (2.5% acceptance rate) has been cited 65 times; in the first 10 months since its publication, the ICML paper has been cited 23 times. In my recent reviews for the top-tier IJCAI 2019 international conference, 3 of the 7 submissions I was assigned had independently adopted LID to excellent effect in deep neural network (DNN) applications. All this indicates that one of the most important objectives of the project, to establish LID as an essential, standard model for DNNs, has already been met. For the remainder of the project, we will seek to build upon this very satisfying start.
Strategy for Future Research Activity	Within FY 2019:
(1) In FY 2018 we developed estimators for a decomposition of LID, following theory published in 2017. Our initial work showed that these estimators can help determine low-dimensional yet discriminative feature subsets. This year, we will sharpen these estimators so that they can support feature ranking for subspace clustering applications.
(2) In FY 2018 we verified the use of LID for applications in deep neural network learning, including classification and adversarial detection. In FY 2019, we will extend this work to other neural network applications, including generative adversarial networks.
(3) Work in FY 2018 revealed the importance of sharp, tight estimation of LID in application areas. In FY 2019, we will establish new estimation techniques for database, data mining and multimedia settings in which only relatively few sample points can be used; examples include outlier detection and recommender systems.
(4) We will make technical innovations available to researchers and practitioners by integrating fundamental tools based on LID into practical systems. In FY 2018, we had hoped to integrate an effective new estimator of LID into the ELKI data mining framework. This estimator is now fully designed, and we will propose it for inclusion in ELKI this year.
(5) We will further promote the interdisciplinary international research community by proposing a third NII Shonan Meeting on Dimensionality and Scalability for 2020. This meeting was anticipated for 2019, but we have decided to postpone it due to the circumstances of some of the prospective organizers.
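For readers unfamiliar with how LID is estimated in practice, the following is a minimal sketch of one widely used estimator, the maximum-likelihood (Hill-type) estimator computed from the distances between a query point and its k nearest neighbors. The report does not state which estimator the project's systems use, so this choice, along with the function names `lid_mle`, `knn_distances` and `demo` and the synthetic uniform data, is an assumption made purely for illustration.

```python
import math
import random


def lid_mle(distances):
    """Maximum-likelihood (Hill-type) LID estimate from neighbor distances.

    `distances` holds the distances from a query point to its k nearest
    neighbors (the query itself excluded). Zero distances are discarded.
    """
    r = sorted(d for d in distances if d > 0.0)
    if len(r) < 2:
        raise ValueError("need at least two positive distances")
    r_max = r[-1]
    # Mean log-ratio of each distance to the largest one (always <= 0).
    mean_log_ratio = sum(math.log(d / r_max) for d in r) / len(r)
    if mean_log_ratio == 0.0:
        raise ValueError("all distances are equal; estimate is undefined")
    return -1.0 / mean_log_ratio


def knn_distances(query, points, k):
    """Euclidean distances from `query` to its k nearest neighbors in `points`."""
    dists = sorted(math.dist(query, p) for p in points)
    return dists[:k]


def demo(dim=3, n=5000, k=100, seed=0):
    """Estimate LID at the origin for points drawn uniformly from [-1, 1]^dim."""
    rng = random.Random(seed)
    points = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    return lid_mle(knn_distances([0.0] * dim, points, k))
```

For data distributed uniformly around the query, the estimate should land near the ambient dimension, with variance that grows as k shrinks. That sensitivity to small neighborhoods is the difficulty that item (3) above aims to address.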

Research Products
(15 results)

Int'l Joint Research (6 results) / Journal Article (9 results, of which Int'l Joint Research: 9 results, Peer Reviewed: 9 results)

[Int'l Joint Research] University of Melbourne (Australia)
- Country Name
  AUSTRALIA
- Counterpart Institution
  University of Melbourne
[Int'l Joint Research] University of Southern Denmark (Denmark)
- Country Name
  DENMARK
- Counterpart Institution
  University of Southern Denmark
[Int'l Joint Research] CNRS / IRISA Rennes/INRIA / IRISA Rennes (France)
- Country Name
  FRANCE
- Counterpart Institution
  CNRS / IRISA Rennes/INRIA / IRISA Rennes
[Int'l Joint Research] University of Novi Sad (Serbia)
- Country Name
  SERBIA
- Counterpart Institution
  University of Novi Sad
[Int'l Joint Research] New Jersey Institute of Technology (U.S.A.)
- Country Name
  U.S.A.
- Counterpart Institution
  New Jersey Institute of Technology
[Int'l Joint Research]
- # of Other Countries
  1
[Journal Article] Intrinsic Dimensionality Estimation within Tight Localities (2019)
- Author(s)
  Laurent Amsaleg, Oussama Chelly, Michael E. Houle, Ken-ichi Kawarabayashi, Milos Radovanovic, Weeris Treeratanajaru
- Journal Title
  
  SIAM International Conference on Data Mining (SDM 2019)
  
  Volume: 19 Pages: 181-189
- DOI
  https://doi.org/10.1137/1.9781611975673.21
- Peer Reviewed / Int'l Joint Research
[Journal Article] Improving the Quality of Explanations with Local Embedding Perturbations (2019)
- Author(s)
  Yunzhe Jia, James Bailey, Kotagiri Ramamohanarao, Christopher Leckie, Michael E. Houle
- Journal Title
  
  25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2019)
  
  Volume: 25 Pages: in press
- Peer Reviewed / Int'l Joint Research
[Journal Article] Extreme-value-theoretic estimation of local intrinsic dimensionality (2018)
- Author(s)
  Laurent Amsaleg, Oussama Chelly, Teddy Furon, Stephane Girard, Michael E. Houle, Ken-ichi Kawarabayashi, Michael Nett
- Journal Title
  
  Data Mining and Knowledge Discovery
  
  Volume: 32 Pages: 1768-1805
- DOI
  10.1007/s10618-018-0578-6
- Peer Reviewed / Int'l Joint Research
[Journal Article] Intrinsic Degree: An Estimator of the Local Growth Rate in Graphs (2018)
- Author(s)
  Lorenzo von Ritter, Michael E. Houle, Stephan Guennemann
- Journal Title
  
  11th International Conference on Similarity Search and Applications (SISAP 2018)
  
  Volume: 11 Pages: 195-208
- DOI
  10.1007/978-3-030-02224-2_15
- Peer Reviewed / Int'l Joint Research
[Journal Article] On the Correlation Between Local Intrinsic Dimensionality and Outlierness (2018)
- Author(s)
  Michael E. Houle, Erich Schubert, Arthur Zimek
- Journal Title
  
  11th International Conference on Similarity Search and Applications (SISAP 2018)
  
  Volume: 11 Pages: 177-191
- DOI
  10.1007/978-3-030-02224-2_14
- Peer Reviewed / Int'l Joint Research
[Journal Article] LID-Fingerprint: A Local Intrinsic Dimensionality-Based Fingerprinting Method (2018)
- Author(s)
  Michael E. Houle, Vincent Oria, Kurt R. Rohloff, Arwa M. Wali
- Journal Title
  
  11th International Conference on Similarity Search and Applications (SISAP 2018)
  
  Volume: 11 Pages: 134-147
- DOI
  10.1007/978-3-030-02224-2_11
- Peer Reviewed / Int'l Joint Research
[Journal Article] Dimensionality-Driven Learning with Noisy Labels (2018)
- Author(s)
  Xingjun Ma, Yisen Wang, Michael E. Houle, Shuo Zhou, Sarah M. Erfani, Shu-Tao Xia, Sudanthi Wijewickrema, James Bailey
- Journal Title
  
  35th International Conference on Machine Learning (ICML 2018)
  
  Volume: 35 Pages: 3361-3370
- Peer Reviewed / Int'l Joint Research
[Journal Article] NN-Descent on High-Dimensional Data (2018)
- Author(s)
  Brankica Bratic, Michael E. Houle, Vladimir Kurbalija, Vincent Oria, Milos Radovanovic
- Journal Title
  
  8th International Conference on Web Intelligence, Mining and Semantics (WIMS 2018)
  
  Volume: 8 Pages: 20:1-20:8
- DOI
  10.1145/3227609.3227643
- Peer Reviewed / Int'l Joint Research
[Journal Article] Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality (2018)
- Author(s)
  Xingjun Ma, Bo Li, Yisen Wang, Sarah M. Erfani, Sudanthi N. R. Wijewickrema, Grant Schoenebeck, Dawn Song, Michael E. Houle, James Bailey
- Journal Title
  
  6th International Conference on Learning Representations (ICLR 2018), CoRR abs/1801.02613
  
  Volume: 6 Pages: 1-15
- Peer Reviewed / Int'l Joint Research
