2012 Fiscal Year Annual Research Report

Construction of machine learning theory for prediction and decision making and its realization in neural circuits

Planned Research

Project Area	Elucidation of neural computation for prediction and decision making: toward better human understanding and applications
Project/Area Number	23120004
Research Institution	Tokyo Institute of Technology
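Principal Investigator	Masashi Sugiyama (杉山将), Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Associate Professor (90334515)
Co-Investigator(Kenkyū-buntansha)	Jun Morimoto (森本淳), Advanced Telecommunications Research Institute International (ATR), Brain Information Communication Research Laboratory Group, Researcher (10505986)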
Project Period (FY)	2011-07-25 – 2016-03-31
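Keywords	Prediction / Decision making / Machine learning / Feature selection / Reinforcement learning
Research Abstract	On feature selection, we built a concrete algorithm (L1-LSMI) for small- to medium-scale prediction and decision-making problems, based on the principle devised in FY2011, and confirmed that it works properly. For large-scale prediction and decision-making problems, we likewise built a concrete feature selection algorithm based on the principle devised in FY2011 and confirmed that it can be computed quickly on high-dimensional data. On reinforcement learning, we built a framework for trajectory-based modeling and policy improvement, and used four degrees of freedom of the arm of a real humanoid robot to acquire a control policy for a via-point reaching task. We showed experimentally that, with the proposed method, motor learning by reinforcement learning is feasible in a real environment within a realistic number of trials, on the order of several tens. We also developed a policy-gradient reinforcement learning algorithm that performs well even on small data sets by reusing samples effectively, and demonstrated its effectiveness in simulation.
Current Status of Research Progress	1: Research has progressed more than originally planned. Reason: In addition to the originally planned research, a new potential application of reinforcement learning techniques to robot control was identified in August 2012, so we investigated model-based and model-free reinforcement learning methods in further detail and carried out more precise theoretical analysis and experimental evaluation.
Strategy for Future Research Activity	Because the research is producing results beyond the original plan, we will set even higher goals and do our utmost to obtain results that have an impact on society.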

Research Products
(16 results)

All Journal Article (7 results) (of which Peer Reviewed: 7 results) Presentation (5 results) Book (2 results) Remarks (2 results)

[Journal Article] Analysis and improvement of policy gradient estimation. 2012
- Author(s)
  Zhao, T., Hachiya, H., Niu, G., & Sugiyama, M.
- Journal Title
  
  Neural Networks
  
  Volume: 26 Pages: 118-129
- Peer Reviewed
[Journal Article] Multi-task approach to reinforcement learning for factored-state Markov decision problems. 2012
- Author(s)
  Simm, J., Sugiyama, M., & Hachiya, H.
- Journal Title
  
  IEICE Transactions on Information and Systems
  
  Volume: E95-D Pages: 2426-2437
- Peer Reviewed
[Journal Article] Improving importance estimation in pool-based batch active learning for approximate linear regression. 2012
- Author(s)
  Kurihara, N. & Sugiyama, M.
- Journal Title
  
  Neural Networks
  
  Volume: 36 Pages: 73-82
- Peer Reviewed
[Journal Article] Early stopping heuristics in pool-based incremental active learning for least-squares probabilistic classifier. 2012
- Author(s)
  Kobayashi, T. & Sugiyama, M.
- Journal Title
  
  IEICE Transactions on Information and Systems
  
  Volume: E95-D Pages: 2065-2073
- Peer Reviewed
[Journal Article] Real-time stylistic prediction for whole-body human motions. 2012
- Author(s)
  Matsubara, T., Hyon, S.-H., & Morimoto, J.
- Journal Title
  
  Neural Networks
  
  Volume: 25 Pages: 191-199
- Peer Reviewed
[Journal Article] The eMOSAIC model for humanoid robot control. 2012
- Author(s)
  Sugimoto, N., Morimoto, J., Hyon, S.-H., & Kawato, M.
- Journal Title
  
  Neural Networks
  
  Volume: 29 Pages: 8-19
- Peer Reviewed
[Journal Article] On-line motion synthesis and adaptation using a trajectory database. 2012
- Author(s)
  Forte, D., Gams, A., Morimoto, J., & Ude, A.
- Journal Title
  
  Robotics and Autonomous Systems
  
  Volume: 60 Pages: 1327-1339
- Peer Reviewed
[Presentation] Perfect dimensionality recovery by variational Bayesian PCA. 2012
- Author(s)
  Nakajima, S., Tomioka, R., Sugiyama, M., & Babacan, D.
- Organizer
  Neural Information Processing Systems (NIPS2012)
- Place of Presentation
  Lake Tahoe, Nevada, USA
- Year and Date
  20121203-20121206
[Presentation] Sparse additive matrix factorization for robust PCA and its generalization. 2012
- Author(s)
  Nakajima, S., Sugiyama, M., & Babacan, D.
- Organizer
  The Fourth Asian Conference on Machine Learning (ACML2012)
- Place of Presentation
  Singapore
- Year and Date
  20121104-20121106
[Presentation] Artist agent: A reinforcement learning approach to automatic stroke generation in oriental ink painting. 2012
- Author(s)
  Xie, N., Hachiya, H., & Sugiyama, M.
- Organizer
  29th International Conference on Machine Learning (ICML2012)
- Place of Presentation
  Edinburgh, Scotland
- Year and Date
  20120626-20120701
[Presentation] Extraction of latent kinematic relationships between human users and assistive robots. 2012
- Author(s)
  Morimoto, J., Noda, T., & Hyon, S.-H.
- Organizer
  IEEE International Conference on Robotics and Automation (ICRA2012)
- Place of Presentation
  St. Paul, Minnesota, USA
- Year and Date
  20120514-20120518
[Presentation] Fast learning rate of multiple kernel learning: Trade-off between sparsity and smoothness. 2012
- Author(s)
  Suzuki, T. & Sugiyama, M.
- Organizer
  Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS2012)
- Place of Presentation
  La Palma, Canary Islands
- Year and Date
  20120421-20120423
[Book] Machine Learning in Non-Stationary Environments: Introduction to Covariate Shift Adaptation. 2012
- Author(s)
  Sugiyama, M. & Kawanabe, M.
- Total Pages
  308
- Publisher
  MIT Press
[Book] Density Ratio Estimation in Machine Learning. 2012
- Author(s)
  Sugiyama, M., Suzuki, T., & Kanamori, T.
- Total Pages
  344
- Publisher
  Cambridge University Press
[Remarks] Masashi Sugiyama's page
- URL
  http://sugiyama-www.cs.titech.ac.jp/~sugi/a
[Remarks] Jun Morimoto's page
- URL
  http://www.cns.atr.jp/~xmorimo/
