
Statistically Sound Pattern Mining

Research Project

Project/Area Number 17K12736
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Multi-year Fund
Research Field Intelligent informatics
Research Institution The University of Tokyo

Principal Investigator

Komiyama Junpei  The University of Tokyo, Institute of Industrial Science, Assistant Professor (20780042)

Research Collaborator Ishihata Masakazu  
Arimura Hiroki  
Nishibayashi Takashi  
Minato Shin-ichi  
Maehara Takanori  
Project Period (FY) 2017-04-01 – 2019-03-31
Project Status Completed (Fiscal Year 2018)
Budget Amount
¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2018: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
Fiscal Year 2017: ¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000)
Keywords pattern mining / statistical testing / multiple testing / data mining / machine learning / FDR / algorithms / FWER / mathematical statistics
Outline of Final Research Achievements

Pattern mining algorithms enumerate all combinatorial patterns whose frequency exceeds a given threshold. Existing algorithms output many patterns that characterize a dataset, but they do not assess how statistically significant those patterns are. To address this issue, we proposed a method that controls the rate of false discoveries among the patterns found while remaining computationally efficient. The method was presented at a top-tier data mining / artificial intelligence conference (KDD 2017).
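The two ingredients named above, frequency-based enumeration and control of the false discovery rate, can be illustrated with a minimal sketch. This is a generic textbook combination (brute-force itemset enumeration, a one-sided Fisher exact test, and the Benjamini-Hochberg step-up rule), not the procedure from the KDD 2017 paper. All function names are hypothetical, transactions are assumed to be Python sets of string items split into a "positive" and a "negative" group, and the Benjamini-Hochberg guarantee is assumed to hold only under its usual independence or positive-dependence conditions.

```python
from itertools import combinations
from math import comb


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of itemsets contained in >= min_support transactions."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            members = set(candidate)
            support = sum(1 for t in transactions if members <= t)
            if support >= min_support:
                result[candidate] = support
                found = True
        if not found:
            break  # support is anti-monotone: no larger itemset can be frequent
    return result


def fisher_upper_p(a, n1, b, n0):
    """One-sided Fisher exact p-value: P(count in positives >= a) with margins fixed.

    a of n1 positive and b of n0 negative transactions contain the pattern.
    """
    x, n = a + b, n1 + n0
    tail = sum(comb(n1, k) * comb(n0, x - k) for k in range(a, min(x, n1) + 1))
    return tail / comb(n, x)


def benjamini_hochberg(p_values, q):
    """Return the indices rejected by the Benjamini-Hochberg step-up rule at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff = rank  # remember the largest rank that passes its threshold
    return sorted(order[:cutoff])


def significant_patterns(positives, negatives, min_support, q):
    """Frequent patterns over-represented in positives, filtered at FDR level q."""
    supports = frequent_itemsets(positives + negatives, min_support)
    patterns = list(supports)
    p_values = []
    for pattern in patterns:
        members = set(pattern)
        a = sum(1 for t in positives if members <= t)
        b = supports[pattern] - a
        p_values.append(fisher_upper_p(a, len(positives), b, len(negatives)))
    return [(patterns[i], p_values[i]) for i in benjamini_hochberg(p_values, q)]
```

Calling `significant_patterns(positives, negatives, min_support=3, q=0.05)` returns the surviving patterns with their p-values. The exhaustive enumeration is exponential in the number of items, so it is only suitable for toy inputs.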

Academic Significance and Societal Importance of the Research Achievements

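Data mining is a field that seeks to discover knowledge, yet in many cases it does not consider how statistically reliable its discoveries are. In particular, pattern mining searches combinations of patterns (features) for interesting ones, but when the number of patterns is large, publication bias arises and it becomes impossible to tell whether a discovered pattern is a chance fluctuation or a reproducible effect. In light of this situation, this research developed an algorithm that finds the statistically significant patterns among those obtained and quantified how large publication bias can become. We believe these are foundational results toward guaranteeing the soundness of data mining as a means of knowledge discovery.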
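The effect of testing many patterns at once can be made concrete with a short calculation. The sketch below is an idealized illustration, not the quantification developed in this project: it assumes m independent tests whose null hypotheses are all true, and the function names are hypothetical. Under those assumptions the chance of at least one spurious "discovery" at level alpha is 1 - (1 - alpha)^m, and the Bonferroni correction tests each hypothesis at alpha / m instead.

```python
def prob_at_least_one_false_positive(m, alpha):
    """Chance that at least one of m independent true-null tests has p <= alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_level(m, alpha):
    """Per-test level that keeps the family-wise error rate at most alpha."""
    return alpha / m


if __name__ == "__main__":
    for m in (1, 10, 100, 1000):
        raw = prob_at_least_one_false_positive(m, 0.05)
        corrected = prob_at_least_one_false_positive(m, bonferroni_level(m, 0.05))
        print(f"m={m}: uncorrected={raw:.3f}, Bonferroni={corrected:.3f}")
```

With a single test the uncorrected probability equals alpha, while with many candidate patterns it approaches 1, which is why an uncorrected "best" pattern may be a chance artifact.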

Report

(3 results)
  • 2018 Annual Research Report   Final Research Report ( PDF )
  • 2017 Research-status Report
  • Research Products

    (5 results)


All Journal Article (1 results) (of which Peer Reviewed: 1 results) Presentation (4 results) (of which Int'l Joint Research: 2 results)

  • [Journal Article] Statistical Emerging Pattern Mining with Multiple Testing Correction (2017)

    • Author(s)
      Junpei Komiyama and Masakazu Ishihata and Hiroki Arimura and Takashi Nishibayashi and Shin-ichi Minato
    • Journal Title

      Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

      Volume: 1 Pages: 897-906

    • DOI

      10.1145/3097983.3098137

    • Related Report
      2017 Research-status Report
    • Peer Reviewed
  • [Presentation] A Simple Way to Deal with Cherry-picking (2018)

    • Author(s)
      Junpei Komiyama, Takanori Maehara
    • Organizer
      Computing Research Repository (CoRR), arXiv
    • Related Report
      2018 Annual Research Report
  • [Presentation] Statistical Emerging Pattern Mining with Multiple Testing Correction (2017)

    • Author(s)
      Junpei Komiyama and Masakazu Ishihata and Hiroki Arimura and Takashi Nishibayashi and Shin-ichi Minato
    • Organizer
      International Conference on Knowledge Discovery and Data Mining
    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
  • [Presentation] Controlling FWER and FDR in emerging pattern mining (2017)

    • Author(s)
      Junpei Komiyama and Masakazu Ishihata and Hiroki Arimura and Takashi Nishibayashi and Shin-ichi Minato
    • Organizer
      International Conference on Multiple Comparison Procedures
    • Related Report
      2017 Research-status Report
    • Int'l Joint Research
  • [Presentation] Statistical Emerging Pattern Mining with Multiple Testing Correction (2017)

    • Author(s)
      Junpei Komiyama
    • Organizer
      Hokkaido University Discrete Structure Manipulation System Project Seminar
    • Related Report
      2017 Research-status Report


Published: 2017-04-28   Modified: 2020-03-30  
