Speeding up the clustering methods with summable lower bounds in contractive mappings

Research Project

Project/Area Number	17K00159
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Research Field	Multimedia database
Research Institution	University of Shizuoka
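Principal Investigator	IKEDA Tetsuo University of Shizuoka, School of Management and Information, Professor (60363727)
Co-Investigator (Kenkyū-buntansha)	SAITO Kazumi Kanagawa University, Faculty of Science, Professor (80379544); AOYAMA Kazuo Nippon Telegraph and Telephone Corporation, NTT Communication Science Laboratories, Other Departments, Senior Researcher (80447028)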
Project Period (FY)	2017-04-01 – 2020-03-31
Project Status	Completed (Fiscal Year 2019)
Budget Amount	¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000); Fiscal Year 2019: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000); Fiscal Year 2018: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000); Fiscal Year 2017: ¥2,470,000 (Direct Cost: ¥1,900,000, Indirect Cost: ¥570,000)
Keywords	Information retrieval / Clustering / Contractive mapping / Visualization
Outline of Final Research Achievements	The purpose of this research project was to establish efficient clustering and similarity search technologies for large-scale data. (1) We proposed an index construction algorithm that recursively builds a CBT (complete binary tree) index, and an online similarity search algorithm that uses the CBT index to prune unnecessary branches and filter objects efficiently. (2) We proposed an acceleration algorithm for Lloyd-type k-means clustering that employs a projection-based filter (PRJ) to avoid unnecessary distance calculations. The PRJ exploits a summable lower bound on a squared distance, defined in a lower-dimensional space onto which the data points are projected. (3) We proposed an inverted-file k-means clustering algorithm (IVF). To achieve high performance, IVF exploits two distinct data representations: a sparse representation of both the object and mean feature vectors, and an inverted-file data structure for the set of mean feature vectors.
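Academic Significance and Societal Importance of the Research Achievements	Multimedia data such as images, documents, and DNA sequences have grown explosively in recent years. Clustering and similarity search are techniques for grasping and understanding the basic structure of such collections of multimedia data. Clustering partitions a data set into subsets, called clusters, whose members are similar to one another. Similarity search retrieves the data that are highly similar to a given input item. For both clustering and similarity search, processing time is generally known to grow with data volume, so highly efficient clustering and similarity search techniques are in strong demand. The results of this research respond to that demand.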
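
As an illustration of item (2) in the outline, the following is a minimal Python sketch of a projection-based filter with a summable lower bound. It is not the authors' PRJ implementation: the function names, the random orthonormal basis, and the loop structure are assumptions made here for illustration. For a matrix P with orthonormal columns, the squared distance between projected points never exceeds the squared distance in the original space, and the per-coordinate partial sums only grow, so accumulation can stop as soon as the partial sum reaches the best distance found so far.

```python
import numpy as np


def orthonormal_basis(d, p, seed=0):
    """Return a d x p matrix with orthonormal columns.

    Assumption: a random basis obtained by QR factorization. Any matrix
    with orthonormal columns gives a valid lower bound.
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((d, p)))
    return q


def assign_with_projection_filter(X, C, P):
    """One k-means assignment step with a projection-based filter.

    X: (n, d) data points, C: (k, d) centroids, P: (d, p) orthonormal basis.
    Returns the nearest-centroid labels and the number of exact distance
    calculations that the filter avoided.
    """
    Xp, Cp = X @ P, C @ P          # projected points and centroids
    n, k, p = X.shape[0], C.shape[0], P.shape[1]
    labels = np.empty(n, dtype=np.int64)
    skipped = 0
    for i in range(n):
        best_j = 0
        best_d = np.sum((X[i] - C[0]) ** 2)
        for j in range(1, k):
            # Accumulate the lower bound one projected coordinate at a time.
            lower = 0.0
            pruned = False
            for t in range(p):
                lower += (Xp[i, t] - Cp[j, t]) ** 2
                if lower >= best_d:
                    # The exact distance is at least `lower`, so centroid j
                    # cannot beat the current best.
                    pruned = True
                    break
            if pruned:
                skipped += 1
                continue
            d = np.sum((X[i] - C[j]) ** 2)  # exact squared distance
            if d < best_d:
                best_d, best_j = d, j
        labels[i] = best_j
    return labels, skipped
```

The assignments match those of an exhaustive search, because a centroid is skipped only when its lower bound already rules it out. How many exact calculations are avoided depends on the data and on how much of the distance the chosen basis captures.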

Report

(4 results)

2019 Annual Research Report Final Research Report ( PDF )
2018 Research-status Report
2017 Research-status Report

Research Products
(5 results)

All 2020 2018 2017

All Journal Article (4 results) (of which Open Access: 3 results, Peer Reviewed: 3 results) Presentation (1 results)

[Journal Article] Inverted-File k-Means Clustering: Performance Analysis (2020)
- Author(s)
  Kazuo Aoyama, Kazumi Saito, Tetsuo Ikeda
- Journal Title
  
  arXiv:2002.09094
  
  Volume: －
- Related Report
  2019 Annual Research Report
- Open Access
[Journal Article] Accelerating a Lloyd-Type k-Means Clustering Algorithm with Summable Lower Bounds in a Lower-Dimensional Space (2018)
- Author(s)
  AOYAMA Kazuo, SAITO Kazumi, IKEDA Tetsuo
- Journal Title
  
  IEICE Transactions on Information and Systems
  
  Volume: E101.D Issue: 11 Pages: 2773-2783
- DOI
  10.1587/transinf.2017EDP7392
- ISSN
  0916-8532, 1745-1361
- Year and Date
  2018-11-01
- Related Report
  2018 Research-status Report
- Peer Reviewed
[Journal Article] Pivot Generation Algorithm with a Complete Binary Tree for Efficient Exact Similarity Search (2018)
- Author(s)
  Yuki Yamagishi, Kazuo Aoyama, Kazumi Saito, and Tetsuo Ikeda
- Journal Title
  
  IEICE Transactions on Information and Systems
  
  Volume: E101.D Issue: 1 Pages: 142-151
- DOI
  10.1587/transinf.2017EDP7077
- NAID
  130006301174
- ISSN
  0916-8532, 1745-1361
- Related Report
  2017 Research-status Report
- Peer Reviewed / Open Access
[Journal Article] Efficient Similarity Search with a Pivot-Based Complete Binary Tree (2017)
- Author(s)
  Yuki Yamagishi, Kazuo Aoyama, Kazumi Saito, and Tetsuo Ikeda
- Journal Title
  
  IEICE Transactions on Information and Systems
  
  Volume: E100.D Issue: 10 Pages: 2526-2536
- DOI
  10.1587/transinf.2017EDP7100
- NAID
  130006110304
- ISSN
  0916-8532, 1745-1361
- Related Report
  2017 Research-status Report
- Peer Reviewed / Open Access
[Presentation] Characterizing Network Search Performance by Greedy Reachability Centrality (2017)
- Author(s)
  Song Peng (宋鵬), SAITO Kazumi, IKEDA Tetsuo, AOYAMA Kazuo
- Organizer
  The 16th Forum on Information Technology (FIT2017)
- Related Report
  2017 Research-status Report
