2008 Fiscal Year Annual Research Report

Building Robot Audition from Research on Sound Environment Understanding

Research Project

Project/Area Number	19100003
Research Institution	Kyoto University
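Principal Investigator	Hiroshi G. Okuno, Kyoto University, Graduate School of Informatics, Professor (60318201)
Co-Investigator(Kenkyū-buntansha)	Tetsuya Ogata, Kyoto University, Graduate School of Informatics, Associate Professor (00318768); Kazunori Komatani, Kyoto University, Graduate School of Informatics, Assistant Professor (40362579); Toru Takahashi, Kyoto University, Graduate School of Informatics, Assistant Professor (30419494)
Keywords	Robot audition / Sound environment understanding / Embodiment / Robot interaction / Active audition / Auditory awareness / Multi-domain spoken dialogue / Barge-in utterance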
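Research Abstract	In fiscal year 2008 we refined our component technologies and released them publicly. (1) Functional extension of the real-time robot audition software HARK: we developed automatic generation of soft masks, which express the reliability of each feature as a continuous value, and improved the speech recognition rate by about 10%. To recognize barge-in utterances, in which a user interrupts while the system is speaking, we developed semi-blind separation based on independent component analysis. Applied to two kinds of music robots, it lets a robot pick out only the music even while the robot itself is singing. Two of our papers were selected among the four nomination finalists for the IEEE/RSJ IROS-2008 Award for Entertainment Robots and Systems. In addition, using a sound-environment visualization system built on HARK, we devised and implemented a method for improving auditory awareness (noticing sounds). (2) Open-sourcing HARK and holding tutorials: we held free tutorials at Kyoto University and at KIST in Korea, organized a special session on robot audition at IROS-2008, and had a proposal accepted for the international signal processing conference ICASSP-2009. (3) Active audition developed on SIG2: to resolve the front-back ambiguity that is unavoidable in sound source localization with two microphones, we designed neck motions for the robot. Moving the head diagonally downward first and then sideways performed better than moving it sideways immediately. Humans are known to make a similar motion, and we confirmed that it is also effective for a robot. (4) Predicting object dynamics from the robot's experience: through a model of object dynamics learned with RNNPB, we established the technical basis for predicting how an object, even an unknown one, will move in response to the robot's actions. (5) Improving multi-domain spoken dialogue systems: we developed a method for estimating user intention from unexpected utterances that no domain accepts, together with a help-generation method based on it, and confirmed their effectiveness.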

Research Products
(42 results)

Journal Article (15 results) (of which Peer Reviewed: 14 results) Presentation (19 results) Remarks (1 result) Patent(Industrial Property Rights) (7 results) (of which Overseas: 3 results)

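[Journal Article] A Method for Manipulating the Pitch and Duration of Musical Instrument Sounds While Avoiding Distortion of Timbral Features (2009)
- Author(s)
  Takehiro Abe, Katsutoshi Itoyama, Kazuyoshi Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Journal Title

  IPSJ Journal Vol. 50, No. 3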
  
  Pages: 1054-1066
- Peer Reviewed
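[Journal Article] A Domain Selection Method Integrating Topic Estimation and Dialogue History in Multi-Domain Spoken Dialogue Systems (2009)
- Author(s)
  Satoshi Ikeda, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Journal Title

  IPSJ Journal Vol. 50, No. 2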
  
  Pages: 488-500
- Peer Reviewed
[Journal Article] Game-Theoretic Model of Referential Coherence and Its Empirical Verification Using Large Japanese and English Corpora (2009)
- Author(s)
  Shun Shiramatsu, Kazunori Komatani, Koiti Hasida, Tetsuya Ogata, Hiroshi G. Okuno
- Journal Title
  
  ACM Transactions on Speech and Language Processing Vol. 5, No. 3
  
  Pages: Article 6
- Peer Reviewed
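[Journal Article] Speech Analysis Using a Power Spectrum Estimation Method for Periodic Signals That Does Not Depend on Analysis Time (2009)
- Author(s)
  Masanori Morise, Toru Takahashi, Hideki Kawahara, Toshio Irino
- Journal Title

  IEICE Transactions A (Japanese Edition) Vol. J92-A, No. 3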
  
  Pages: 163-171
- Peer Reviewed
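[Journal Article] A Pitch Estimation Method for the Vocal Part in Ensemble Music Using Statistical Modeling of Singing Voice and Viterbi Search (2008)
- Author(s)
  Hiromasa Fujihara, Masataka Goto, Hiroshi G. Okuno
- Journal Title

  IPSJ Journal Vol. 49, No. 10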
  
  Pages: 3682-3693
- Peer Reviewed
[Journal Article] Managing Out-of-Grammar Utterances by Topic Estimation with Domain Extensibility in Multi Domain Spoken (2008)
- Author(s)
  Kazunori Komatani, Satoshi Ikeda, Tetsuya Ogata, Hiroshi G. Okuno
- Journal Title
  
  Speech Communication Vol. 50, No. 10
  
  Pages: 836-870
- Peer Reviewed
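[Journal Article] Application of Adaptive Filters Based on Independent Component Analysis to Robot Audition (2008)
- Author(s)
  Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Journal Title

  Journal of the Robotics Society of Japan Vol. 26, No. 6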
  
  Pages: 529-536
- Peer Reviewed
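[Journal Article] WFST-Based Language Understanding Oriented toward Rapid Prototyping in Spoken Dialogue Systems (2008)
- Author(s)
  Yuichiro Fukubayashi, Kazunori Komatani, Mikio Nakano, Kotaro Funakoshi, Hiroshi Tsujino, Tetsuya Ogata, Hiroshi G. Okuno
- Journal Title

  IPSJ Journal Vol. 49, No. 8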
  
  Pages: 2762-2772
- Peer Reviewed
[Journal Article] Predicting Object Dynamics from Visual Images through Active Sensing Experiences (2008)
- Author(s)
  Shun Nishide, Tetsuya Ogata, J. Tani, Kazunori Komatani, Hiroshi G. Okuno
- Journal Title
  
  Advanced Robotics Vol. 22, No. 5
  
  Pages: 527-546
- Peer Reviewed
[Journal Article] A Portable Robot Audition Software System for Multiple Simultaneous Speech Signals (2008)
- Author(s)
  Hiroshi G. Okuno, Shun'ichi Yamamoto, Kazuhiro Nakadai, J.-M. Valin, K. Komatani, T. Ogata
- Journal Title

  Journal of the Acoustical Society of America Vol. 123, No. 5
  
  Pages: 3066-3067
- Peer Reviewed
[Journal Article] SalienceGraph: Visualizing Salience Dynamics of Written Discourse by Using Reference Probability and PLSA (2008)
- Author(s)
  Shun Shiramatsu, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Journal Title
  
  PRICAI-2008 : Trends in Artificial Intelligence LNCS Vol. 5351
  
  Pages: 890-902
- Peer Reviewed
[Journal Article] An Auditory Sensor That Distinguishes Many People's Voices at Once (2008)
- Author(s)
  Hiroshi G. Okuno
- Journal Title

  Nikkei Electronics, September 22, 2008 issue
  
  Pages: 115-123
[Journal Article] Integrating Topic Estimation and Dialogue History for Domain Selection in Multi-Domain Spoken Dialogue Systems (2008)
- Author(s)
  Satoshi Ikeda, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Journal Title
  
  New Frontiers in Applied Artificial Intelligence LNAI Vol. 5027
  
  Pages: 294-304
- Peer Reviewed
[Journal Article] Vowel Imitation using Vocal Tract Model and Recurrent Neural Network (2008)
- Author(s)
  Hisashi Kanda, Tetsuya Ogata, Kazunori Komatani, Hiroshi G. Okuno
- Journal Title

  Neural Information Processing LNCS Vol. 4985
  
  Pages: 222-232
- Peer Reviewed
[Journal Article] Motion Emergence from Sound using Cross-Modal Mapping on Recurrent Neural Network (2008)
- Author(s)
  Tetsuya Ogata, Hiroshi G. Okuno
- Journal Title

  IEEE Intelligent Systems Vol. 23, No. 2
  
  Pages: 74-84
- Peer Reviewed
[Presentation] Design and Implementation of 3D Auditory Scene Visualizer towards Auditory Awareness with Face Tracking (2008)
- Author(s)
  Yuji Kubota, Masatoshi Yoshida, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Organizer
  Proceedings of IEEE International Symposium on Multimedia (ISM08)
- Place of Presentation
  Berkeley, U.S.A
- Year and Date
  2008-12-16
[Presentation] 3D Auditory Scene Visualizer With Face Tracking: Design and Implementation For Auditory Awareness Compensation (2008)
- Author(s)
  Yuji Kubota, Shun Shiramatsu, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Organizer
  Proceedings of 2nd International Symposium on Universal Communication (ISUC2008)
- Place of Presentation
  Osaka, Japan
- Year and Date
  2008-12-15
[Presentation] A Beat-Tracking Robot for Human-Robot Interaction and Its Evaluation (2008)
- Author(s)
  Kazumasa Murata, Kazuhiro Nakadai, Ryu Takeda, Hiroshi G. Okuno, T. Torii, Y. Hasegawa, H. Tsujino
- Organizer
  Proceedings of IEEE-RAS International Conference on Humanoid Robots (Humanoids 2008)
- Place of Presentation
  Daejeon, Korea
- Year and Date
  2008-12-03
[Presentation] Computational Auditory Scene Analysis and Its Application to Robot Audition (Invited Talk) (2008)
- Author(s)
  Hiroshi G. Okuno
- Organizer
  Proceedings of the Second International Symposium on Robotics and Artificial Intelligence
- Place of Presentation
  Chofu, Japan
- Year and Date
  2008-10-09
[Presentation] A Robot Listens to Music and Counts Its Beats Aloud by Separating Music from Counting Voice (2008)
- Author(s)
  Takeshi Mizumoto, Ryu Takeda, K. Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Organizer
  Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2008)
- Place of Presentation
  Nice, France
- Year and Date
  2008-09-24
[Presentation] Soft Missing-Feature Mask Generation for Simultaneous Speech Recognition System in Robots (2008)
- Author(s)
  Toru Takahashi, Shun'ichi Yamamoto, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Organizer
  Proceedings of International Conference on Spoken Language Processing (Interspeech-2008)
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2008-09-24
[Presentation] Predicting ASR Errors by Exploiting Barge-In Rate of Individual Users for Spoken Dialogue Systems (2008)
- Author(s)
  Kazunori Komatani, Tatsuya Kawahara, Hiroshi G. Okuno
- Organizer
  Proceedings of International Conference on Spoken Language Processing (Interspeech-2008)
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2008-09-24
[Presentation] Expanding Vocabulary for Recognizing User's Abbreviations of Proper Nouns without Increasing ASR Error Rates in Spoken Dialogue Systems (2008)
- Author(s)
  Masaki Katsumaru, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Organizer
  Proceedings of International Conference on Spoken Language Processing (Interspeech-2008)
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2008-09-24
[Presentation] Extensibility Verification of Robust Domain Selection against Out-of-Grammar Utterances in Multi-Domain Spoken Dialogue System (2008)
- Author(s)
  Satoshi Ikeda, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Organizer
  Proceedings of International Conference on Spoken Language Processing (Interspeech-2008)
- Place of Presentation
  Brisbane, Australia
- Year and Date
  2008-09-24
[Presentation] Target Speech Detection and Separation for Humanoid Robot in Sparse Dialogue with Noisy Home Environments (2008)
- Author(s)
  Hyun-Don Kim, Jinsung Kim, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Organizer
  Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2008)
- Place of Presentation
  Nice, France
- Year and Date
  2008-09-24
[Presentation] Segmenting Acoustic Signal with Articulatory Movement using Recurrent Neural Network for Phoneme Acquisition (2008)
- Author(s)
  Hisashi Kanda, Tetsuya Ogata, Kazunori Komatani, Hiroshi G. Okuno
- Organizer
  Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2008)
- Place of Presentation
  Nice, France
- Year and Date
  2008-09-24
[Presentation] Barge-in-able Robot Audition Based on ICA and Missing Feature Theory under Semi-Blind Situation (2008)
- Author(s)
  Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Organizer
  Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2008)
- Place of Presentation
  Nice, France
- Year and Date
  2008-09-24
[Presentation] A Robot Uses Its Own Microphone to Synchronize Its Steps to Musical Beats While Scatting and Singing (2008)
- Author(s)
  K. Murata, K. Nakadai, K. Yoshii, R. Takeda, T. Torii, Hiroshi G. Okuno, Y. Hasegawa, H. Tsujino
- Organizer
  Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2008)
- Place of Presentation
  Nice, France
- Year and Date
  2008-09-24
[Presentation] Active Sensing based Dynamical Object Feature Extraction (2008)
- Author(s)
  Shun Nishide, Tetsuya Ogata, Ryunosuke Yokoya, Jun Tani, Kazunori Komatani, Hiroshi G. Okuno
- Organizer
  Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2008)
- Place of Presentation
  Nice, France
- Year and Date
  2008-09-23
[Presentation] Analysis of Reliable Predictability based Motion Generation using RNNPB (2008)
- Author(s)
  Shun Nishide, Tetsuya Ogata, Jun Tani, Kazunori Komatani, Hiroshi G. Okuno
- Organizer
  Proc. of Joint 4th International Conf. on Soft Computing and Intelligent Systems and 9th International Symposium on Advanced Intelligent Systems (SCIS & ISIS 2008)
- Place of Presentation
  Nagoya, Japan
- Year and Date
  2008-09-18
[Presentation] Automatic Chord Recognition Based on Probabilistic Integration of Chord Transition and Bass Pitch (2008)
- Author(s)
  Kohei Sumi, Katsutoshi Itoyama, K. Yoshii, Kazunori Komatani, T. Ogata, Hiroshi G. Okuno
- Organizer
  Proceedings of 9th International Conference on Music Information Retrieval (ISMIR-2008)
- Place of Presentation
  Philadelphia, U.S.A
- Year and Date
  2008-09-15
[Presentation] Instrument Equalizer for Query-by-Example Retrieval: Improving Sound Source Separation based on Integrated Harmonic and Inharmonic Models (2008)
- Author(s)
  Katsutoshi Itoyama, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
- Organizer
  Proceedings of 9th International Conference on Music Information Retrieval (ISMIR-2008)
- Place of Presentation
  Philadelphia, U.S.A
- Year and Date
  2008-09-15
[Presentation] A Robot Referee for Rock-Paper-Scissors Sound Games (2008)
- Author(s)
  Kazuhiro Nakadai, Shun'ichi Yamamoto, Hiroshi G. Okuno, H. Nakajima, Y. Hasegawa, H. Tsujino
- Organizer
  Proceedings of IEEE-RAS International Conference on Robotics and Automation (ICRA-2008)
- Place of Presentation
  Pasadena, U.S.A
- Year and Date
  2008-05-20
[Presentation] COMPUTATIONAL AUDITORY SCENE ANALYSIS AND ITS APPLICATION TO ROBOT AUDITION (2008)
- Author(s)
  Hiroshi G. Okuno, Kazuhiro Nakadai
- Organizer
  Proceedings of Hands-free Speech Communication and Microphone Arrays (HSCMA-2008)
- Place of Presentation
  Trento, Italy
- Year and Date
  2008-05-07
[Remarks]
- URL
  http://winnie.kuis.kyoto-u.ac.jp/
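[Patent(Industrial Property Rights)] Timbre Modification System for Music Audio Signals (2009)
- Inventor(s)
  Takehiro Abe, Katsutoshi Itoyama, Hiroshi G. Okuno
- Industrial Property Rights Holder
  Kyoto University (National University Corporation)
- Industrial Property Number
  Japanese Patent Application No. 2009-34664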
- Filing Date
  2009-02-17
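[Patent(Industrial Property Rights)] Sentence-Unit Search Method, Sentence-Unit Search Device, Computer Program, Storage Medium, and Document Storage Device (2008)
- Inventor(s)
  Shun Shiramatsu, Kazunori Komatani, Hiroshi G. Okuno
- Industrial Property Rights Holder
  Kyoto University (National University Corporation)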
- Industrial Property Number
  PCT/JP2007/055448
- Filing Date
  2008-12-12
- Overseas
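[Patent(Industrial Property Rights)] Sound Source Separation System (2008)
- Inventor(s)
  Ryu Takeda, Kazuhiro Nakadai, Hiroshi Tsujino, Hiroshi G. Okuno
- Industrial Property Rights Holder
  Honda Motor Co., Ltd.
- Industrial Property Number
  Japanese Patent Application No. 2008-191382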
- Filing Date
  2008-07-24
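[Patent(Industrial Property Rights)] Sound Source Separation System, Sound Source Separation Method, and Computer Program for Sound Source Separation (2008)
- Inventor(s)
  Katsutoshi Itoyama, Hiroshi G. Okuno, Masataka Goto
- Industrial Property Rights Holder
  Kyoto University, National Institute of Advanced Industrial Science and Technology (AIST)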
- Industrial Property Number
  PCT/JP2008/05731
- Filing Date
  2008-04-14
- Overseas
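[Patent(Industrial Property Rights)] SalienceGraph (Meeting Minutes Browsing System) (2008)
- Inventor(s)
  Shun Shiramatsu, Hiroshi G. Okuno
- Industrial Property Rights Holder
  Kyoto University (National University Corporation)
- Industrial Property Number
  Kyoto University Digital Content C32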
- Acquisition Date
  2008-08-01
[Patent(Industrial Property Rights)] Robot Audition Software HARK (2008)
- Inventor(s)
  Shun'ichi Yamamoto, Hiroshi G. Okuno, Kazuhiro Nakadai, Hirofumi Nakashima, Hiroshi Tsujino
- Industrial Property Rights Holder
  Kyoto University, Honda Motor Co., Ltd.
- Industrial Property Number
  Open-source software
- Acquisition Date
  2008-05-01
- Overseas
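[Patent(Industrial Property Rights)] Speech Recognition Device (2008)
- Inventor(s)
  Kazuhiro Nakadai, Hiroshi Tsujino, Hiroshi G. Okuno, Shun'ichi Yamamoto
- Industrial Property Rights Holder
  Honda Motor Co., Ltd.
- Industrial Property Number
  Japanese Patent No. 4157581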
- Acquisition Date
  2008-07-18
