A Study on Rate-distortion Theory-based Learning and its Application for Advanced Cluster Analyses

Research Project

Project/Area Number	20700132
Research Category	Grant-in-Aid for Young Scientists (B)
Allocation Type	Single-year Grants
Research Field	Intelligent informatics
Research Institution	Gunma University
Principal Investigator	ANDO Shin Gunma University, Graduate School of Engineering, Assistant Professor (70401685)
Project Period (FY)	2008 – 2009
Project Status	Completed (Fiscal Year 2009)
Budget Amount	¥4,290,000 (Direct Cost: ¥3,300,000, Indirect Cost: ¥990,000); Fiscal Year 2009: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000); Fiscal Year 2008: ¥2,730,000 (Direct Cost: ¥2,100,000, Indirect Cost: ¥630,000)
Keywords	Knowledge discovery and data mining / Rate-distortion theory / Data compression / Clustering / Time series clustering / Outlier detection / Transfer learning
Research Abstract	This study extended learning methods grounded in rate-distortion (RD) theory to practical, leading-edge problems in data mining and machine learning. One concrete achievement is a formalization of RD learning for time series data described by multivariate polynomial regression models and Markov chains. Building on it, we developed a methodology for anomaly detection in dynamic systems and validated it on microarray time series data, where it detected the active state of the network with significantly higher precision and recall than conventional methods. These results were published at KDD, one of the major conferences in machine learning and data mining. For time series data mining, we constructed benchmark datasets from several domains, including microarray data, financial time series, and robot trajectories. We also developed an efficient instance-based method for online outlier detection based on multi-perspective ensemble learning. This result was presented at a Japanese workshop and submitted to an international conference. We further extended the RD formalization to transfer learning problems involving multiple, heterogeneous data sources, and developed a regularized learning methodology for unsupervised transfer learning. The main concrete result is the clustering of heterogeneous text data, where significantly higher precision and recall were achieved than with conventional methods. We also extended RD learning to integrate geometric structures into the regularization framework. To validate the proposed approach, we prepared a benchmark dataset from bibliographic data annotated with co-author graph information. We applied ItGA, an information-theoretic Geo-topico analysis, and discovered better topics than the popular PLSA and LDA methods. ItGA was also significantly better as a dimensionality reduction method for extracting important features of text.
These results were published at ICDM and are in submission to other major data mining conferences.

Report

(3 results)

2009 Annual Research Report Final Research Report ( PDF )
2008 Annual Research Report

Research Products
(6 results)

Journal Article (3 results, of which Peer Reviewed: 3 results), Presentation (3 results)

[Journal Article] Detection of unique temporal segments by information theoretic meta-clustering (2009)
- Author(s)
  Shin Ando, Einoshin Suzuki
- Journal Title
  
  Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
  
  Pages: 59-68
- Related Report
  2009 Final Research Report
- Peer Reviewed
[Journal Article] Detection of unique temporal segments by information theoretic meta-clustering (2009)
- Author(s)
  安藤晋
- Journal Title
  
  Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
  
  Pages: 59-68
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Unsupervised Cross-domain Learning by Interaction Information Co-clustering (2008)
- Author(s)
  Shin Ando and Einoshin Suzuki
- Journal Title
  
  Proceedings of the 8th IEEE International Conference on Data Mining (ICDM08)
  
  Pages: 13-22
- Related Report
  2008 Annual Research Report
- Peer Reviewed
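[Presentation] Detection of Peculiar Behavior by an Ensemble of Anomalous Clusters (2010)
- Author(s)
  星野大祐, Theerasak Tanomphongphang, 安藤晋, 関庸一, Swagat, 鈴木英之進
- Organizer
  78th Meeting of the SIG on Mathematical Modeling and Problem Solving (MPS78)
- Place of Presentation
  Information Processing Center, Aramaki Campus, Gunma University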
- Year and Date
  2010-05-21
- Related Report
  2009 Final Research Report
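[Presentation] A Variable Self-Organizing Map Adapted to the Membership Degree of Samples (2010)
- Author(s)
  多賀谷侑史, 安藤晋, 関庸一
- Organizer
  77th Meeting of the SIG on Mathematical Modeling and Problem Solving (MPS77)
- Place of Presentation
  Renaissa Akazawa, Izu-Kogen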
- Year and Date
  2010-03-05
- Related Report
  2009 Final Research Report
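[Presentation] A Variable Self-Organizing Map Adapted to the Membership Degree of Samples (2010)
- Author(s)
  多賀谷侑史
- Organizer
  Meeting of the SIG on Mathematical Modeling and Problem Solving (MPS)
- Place of Presentation
  Renaissa Akazawa, Izu-Kogen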
- Year and Date
  2010-03-05
- Related Report
  2009 Annual Research Report
