Development of Integrated Approximation and Compression Techniques for Next Generation Streaming Data Mining

Research Project

Project/Area Number	17K00301
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Research Field	Intelligent informatics
Research Institution	Shizuoka University (2018-2020), University of Yamanashi (2017)
Principal Investigator	Yamamoto Yoshitaka Shizuoka University, Faculty of Informatics, Associate Professor (30550793)
Project Period (FY)	2017-04-01 – 2021-03-31
Project Status	Completed (Fiscal Year 2020)
Budget Amount	¥4,680,000 (Direct Cost: ¥3,600,000, Indirect Cost: ¥1,080,000); Fiscal Year 2019: ¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000); Fiscal Year 2018: ¥2,080,000 (Direct Cost: ¥1,600,000, Indirect Cost: ¥480,000); Fiscal Year 2017: ¥520,000 (Direct Cost: ¥400,000, Indirect Cost: ¥120,000)
Keywords	streaming data / online algorithms / sequence prediction / frequent pattern mining / frequent sequential pattern mining / lossy compression / anomaly and change detection
Outline of Final Research Achievements	In this research, we developed a fast and memory-efficient algorithm for frequent sequential pattern mining from streaming data (FSP-SD). Streaming data analysis is a central issue in many domains, and FSP-SD is one of its most fundamental tasks involving discrete structures. FSP-SD raises two important issues: (1) real-time processing, that is, handling a huge volume of transactions that arrive continuously at high speed while simultaneously outputting the frequent sequences (FSs); and (2) memory efficiency, that is, enumerating FSs while managing an exponential number of candidates with limited memory. We addressed both issues with a novel technique that integrates approximation and compression. The proposed algorithm and its implementation, called PARASOL, were published in the Journal of Intelligent Information Systems and are now freely available for academic use. We also applied PARASOL to the event prediction problem.
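Academic Significance and Societal Importance of the Research Achievements	With the growth of cloud services and IoT, large amounts of streaming data are being generated. The value of streaming data lies in real-time analysis, which in turn requires processing large volumes of data at high speed and with little memory. The problem addressed in this research involves technical constraints and difficulties, such as combinatorial explosion and real-time requirements, that are common to streaming data mining with online processing, and it is therefore an important fundamental problem. Through this research, data mining methods have become more applicable to large-scale data that was previously difficult to handle, making it possible to perform correlation analysis and time-series analysis of big data with inexpensive computing resources.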
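
The memory constraint described in the outline can be illustrated with a small sketch. The code below is not PARASOL and is not taken from the project; it is a generic bounded-memory frequency counter in the style of the classic Space-Saving algorithm, applied to single items rather than sequences. The class name SpaceSavingCounter, its methods, and the capacity parameter are hypothetical names chosen for this illustration.

```python
from typing import Dict, Hashable, List, Tuple


class SpaceSavingCounter:
    """Approximate frequency counter that stores at most `capacity` entries.

    Illustrative sketch only (Space-Saving style); not the PARASOL algorithm.
    """

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts: Dict[Hashable, int] = {}  # estimated count per item
        self.errors: Dict[Hashable, int] = {}  # maximum overestimation per item

    def update(self, item: Hashable) -> None:
        """Process one arriving item using bounded memory."""
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
            self.errors[item] = 0
        else:
            # Table is full: evict the entry with the smallest estimate.
            # The new item inherits that estimate as its possible error.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[item] = floor + 1
            self.errors[item] = floor

    def frequent(self, min_count: int) -> List[Tuple[Hashable, int, int]]:
        """Return (item, estimate, max_error) for estimates >= min_count."""
        hits = [
            (item, count, self.errors[item])
            for item, count in self.counts.items()
            if count >= min_count
        ]
        return sorted(hits, key=lambda entry: -entry[1])


if __name__ == "__main__":
    counter = SpaceSavingCounter(capacity=2)
    for event in ["a", "b", "a", "c", "a"]:
        counter.update(event)
    print(counter.frequent(min_count=3))
```

Each estimate is an upper bound on the true count, and the estimate minus its recorded error is a lower bound, so the table never grows beyond the fixed capacity while still reporting every sufficiently frequent item. Extending this idea from single items to an exponential space of candidate sequences is the harder problem the project addresses and is not shown here.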

Report

(5 results)

2020 Annual Research Report, Final Research Report (PDF)
2019 Research-status Report
2018 Research-status Report
2017 Research-status Report

Research Products
(21 results)

Journal Article (5 results) (of which Peer Reviewed: 3 results, Open Access: 1 result) Presentation (13 results) (of which Int'l Joint Research: 3 results, Invited: 2 results) Remarks (3 results)

[Journal Article] PARASOL: a hybrid approximation approach for scalable frequent itemset mining in streaming data (2019)
- Author(s)
  Yoshitaka Yamamoto, Yasuo Tabei, Koji Iwanuma
- Journal Title
  Journal of Intelligent Information Systems
  Volume: 17 Issue: 1 Pages: 1-29
- DOI
  10.1007/s10844-019-00590-9
- Related Report
  2019 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] A Skipping FP-Tree for Incrementally Intersecting Closed Itemsets in On-Line Stream Mining (2019)
- Author(s)
  Takumi Nishina, Koji Iwanuma, Yoshitaka Yamamoto
- Journal Title
  Proc. of BigComp2019
  Volume: - Pages: 1-4
- Related Report
  2018 Research-status Report
- Peer Reviewed
[Journal Article] Approximate-Closed-Itemset Mining for Streaming Data Under Resource Constraint (2018)
- Author(s)
  Yoshitaka Yamamoto, Yasuo Tabei, Koji Iwanuma
- Journal Title
  CoRR abs/1901.01710 (2019)
  Volume: - Pages: 1-14
- Related Report
  2018 Research-status Report
[Journal Article] On-Line Approximation Mining for Frequent Closed Itemsets Greater than or Equal to Size K (2018)
- Author(s)
  Takumi Nishina, Koji Iwanuma, Yoshitaka Yamamoto
- Journal Title
  Proc. of BCD2018
  Volume: - Pages: 61-66
- Related Report
  2018 Research-status Report
- Peer Reviewed
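[Journal Article] Research on Streaming Data Mining for Acquiring Deep Knowledge (2017)
- Author(s)
  Yoshitaka Yamamoto
- Journal Title
  Bulletin of the Yamanashi Academy of Sciences (山梨科学アカデミー会報)
  Volume: 44 Pages: 15-22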
- Related Report
  2017 Research-status Report
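[Presentation] Sublinear Summarization of Streaming Data and Its Applications (2021)
- Author(s)
  Yoshitaka Yamamoto
- Organizer
  Meijo University Science and Technology Colloquium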
- Related Report
  2020 Annual Research Report
- Invited
[Presentation] Transient pattern detection from streaming nature data (2020)
- Author(s)
  Thanapol Phungtua-eng, Yoshitaka Yamamoto, Shigeyuki Sako
- Organizer
  CANDAR'20 WANC workshop
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
[Presentation] Mining Consistent, Non-Redundant and Minimal Negative Rules Based on Minimal Generators (2020)
- Author(s)
  Koji Iwanuma, Kento Yajima, Yoshitaka Yamamoto
- Organizer
  IEEE BigData2019 (poster)
- Related Report
  2020 Annual Research Report
- Int'l Joint Research
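[Presentation] Constructing Sublinear Summaries for Membership Queries: Second Report (2020)
- Author(s)
  Yoshitaka Yamamoto
- Organizer
  120th Meeting of the Special Interest Group on Knowledge-Based Systems (online)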
- Related Report
  2020 Annual Research Report
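[Presentation] Toward Constructing Sublinear Membership Summaries by the Projection-Accumulation Method (2020)
- Author(s)
  Yoshitaka Yamamoto
- Organizer
  127th Meeting of the IPSJ Special Interest Group on Programming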
- Related Report
  2019 Research-status Report
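[Presentation] Toward Constructing Probabilistic Membership Summaries (2019)
- Author(s)
  Yoshitaka Yamamoto, 錦戸彩
- Organizer
  117th Meeting of the Special Interest Group on Knowledge-Based Systems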
- Related Report
  2019 Research-status Report
[Presentation] Accelerating an On-Line Approximation Mining for Large Closed Itemsets (2019)
- Author(s)
  Koji Iwanuma, Takumi Nishina, Yoshitaka Yamamoto
- Organizer
  IEEE BigData2019
- Related Report
  2019 Research-status Report
- Int'l Joint Research
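[Presentation] Constructing Summaries of Partially Ordered Streaming Data (2018)
- Author(s)
  Yoshitaka Yamamoto, Koji Iwanuma, 今井友輝
- Organizer
  JSAI Special Interest Group on Knowledge-Based Systems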
- Related Report
  2018 Research-status Report
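[Presentation] Reconsidering the Validity of Positive and Negative Association Rules and a Fast Method for Extracting Positive and Negative Rules (2018)
- Author(s)
  雨宮晶良, Koji Iwanuma, Kento Yajima, Yoshitaka Yamamoto
- Organizer
  JSAI Special Interest Group on Knowledge-Based Systems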
- Related Report
  2018 Research-status Report
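[Presentation] Online Extraction of Closed Itemsets from Streams Using a Reverse-Scan FP-Tree Together with a Trie (2018)
- Author(s)
  Takumi Nishina, Koji Iwanuma, Yoshitaka Yamamoto
- Organizer
  Special Interest Group on Fundamental Problems in Artificial Intelligence (SIG-FPAI)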
- Related Report
  2017 Research-status Report
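[Presentation] Fast Extraction of Minimal Generators of Closed Itemsets Without Support Computation: Toward More Efficient Negative Association Rule Extraction (2018)
- Author(s)
  Kento Yajima, Koji Iwanuma, Yoshitaka Yamamoto
- Organizer
  Special Interest Group on Fundamental Problems in Artificial Intelligence (SIG-FPAI)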
- Related Report
  2017 Research-status Report
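[Presentation] Research on Streaming Data Mining Based on Resource-Oriented Computation (2017)
- Author(s)
  Yoshitaka Yamamoto
- Organizer
  JSAI Joint SIG Meeting 2017, Special Interest Group on Knowledge-Based Systems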
- Related Report
  2017 Research-status Report
- Invited
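[Presentation] Fast Extraction of Minimal Generators from Closed Itemsets for More Efficient Negative Association Rule Mining (2017)
- Author(s)
  Kento Yajima, Koji Iwanuma, Yoshitaka Yamamoto
- Organizer
  JSAI Joint SIG Meeting 2017, Special Interest Group on Knowledge-Based Systems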
- Related Report
  2017 Research-status Report
[Remarks] PARASOL (ver. 1.00)
- URL
  https://github.com/Yoshitaka-Yamamoto/parasol
- Related Report
  2020 Annual Research Report
[Remarks] Public software: PARASOL (ver. 1.00)
- URL
  https://github.com/Yoshitaka-Yamamoto/parasol
- Related Report
  2019 Research-status Report 2018 Research-status Report
[Remarks] Researcher's homepage
- URL
  http://www.iwlab.org/our-lab/our-staff/yy
- Related Report
  2017 Research-status Report
