2020 Fiscal Year Final Research Report
Development of Integrated Approximation and Compression Techniques for Next Generation Streaming Data Mining
Project/Area Number |
17K00301
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | General |
Research Field |
Intelligent informatics
|
Research Institution | Shizuoka University (2018-2020), University of Yamanashi (2017) |
Principal Investigator |
|
Project Period (FY) |
2017-04-01 – 2021-03-31
|
Keywords | Streaming data / Online algorithms / Sequence prediction / Frequent pattern mining |
Outline of Final Research Achievements |
In this research, we developed a fast and memory-efficient algorithm for frequent sequential pattern mining from streaming data (FSP-SD). Streaming data analysis is a central issue in many domains, and FSP-SD is one of its most fundamental tasks involving discrete structures. FSP-SD poses two key challenges: (1) real-time processing, in which a huge volume of transactions arriving continuously at high speed must be processed while the frequent sequences (FSs) are output simultaneously; and (2) memory efficiency, in which FSs must be enumerated while an exponential number of candidates is managed with limited memory resources. We addressed both challenges with a novel technique that integrates approximation and compression. The proposed algorithm and its implementation, called PARASOL, were published in the Journal of Intelligent Information Systems and are now freely available for academic use. We also applied PARASOL to the event prediction problem.
|
Free Research Field |
Intelligent informatics
|
Academic Significance and Societal Importance of the Research Achievements |
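With the growth of cloud services and the IoT, large volumes of streaming data are being generated. The impact of streaming data lies in real-time analysis, but this requires processing massive amounts of data at high speed and with little memory. The problem addressed in this research involves technical constraints and difficulties common to streaming data mining for online processing, such as combinatorial explosion and real-time requirements, and is therefore positioned as an important fundamental problem. Through this research, data mining methods have become more applicable to large-scale data that was previously difficult to handle, making it possible to perform correlation analysis and time-series analysis of big data with inexpensive computing resources.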
|