Study on Integrated Processing of Speech and Gesture in Multimodal Communication

Research Project

Project/Area Number 10480083
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Information Systems (including Information and Library Science)
Research Institution Waseda University

Principal Investigator

SHIRAI Katsuhiko  Waseda University, School of Science and Engineering, Professor (10063702)

Co-Investigator (Kenkyū-buntansha)

YAMASAKI Yoshio  Waseda University, Graduate School of Global Information and Telecommunication Studies, Professor (10257199)
HASHIMOTO Shuji  Waseda University, School of Science and Engineering, Professor (60063806)
KOBAYASHI Tetsunori  Waseda University, School of Science and Engineering, Professor (30162001)
OKAWA Shigeki  Chiba Institute of Technology, Department of Information and Network Science, Associate Professor (40306395)
Project Period (FY) 1998 – 2000
Project Status Completed (Fiscal Year 2000)
Budget Amount
¥9,200,000 (Direct Cost: ¥9,200,000)
Fiscal Year 2000: ¥1,500,000 (Direct Cost: ¥1,500,000)
Fiscal Year 1999: ¥3,600,000 (Direct Cost: ¥3,600,000)
Fiscal Year 1998: ¥4,100,000 (Direct Cost: ¥4,100,000)
Keywords Multimodal Communication / Gesture Recognition / Speech Recognition / Partly-Hidden Markov Model / Multi-Person Conversation / Dialogue Control / Misunderstanding Detection / Domain Independent Platform / Multi-Speaker Dialogue / Statistical Turn-Taking Model / Subspace Method / Face Image Extraction / Multi-band Speech Recognition / Pose Estimation / General-Purpose Platform for Spoken Dialogue Systems / Spoken Dialogue System / Dialogue Corpus / Multimodal / Hidden Markov Model / Face Direction Recognition
Research Abstract

The purpose of this research is to develop a multimodal communication system that can recognize multimodal information such as speech and gesture in natural dialogue, understand human intention by integrating that information, and respond to the human appropriately.
First, it is necessary to clarify how human intention is understood through the integration of multimodal information and how responses are produced through multiple modalities. We therefore analyzed the acoustic features of speech, such as fillers, and the roles of gestures, such as head movement, in a variety of natural human dialogues.
We then studied speech and gesture recognition algorithms, which are fundamental techniques for a multimodal communication system. We propose a recombination strategy for multi-band automatic speech recognition that gives more accurate recognition, especially in noisy acoustic environments. We also propose a speech decoder in which the language models are modified to handle the timing of turn taking and in which speaker models are also used. For gesture recognition, we apply a new pattern matching method, the Partly-Hidden Markov Model, in which the first state is hidden and the second is observable. We further propose face extraction and pose detection methods to recognize head movement.
Finally, we implemented the multimodal communication model in a human-machine dialogue system. The system uses a generalization method that considers the trade-off between the variety of dialogue and the ease of describing rules, and it provides a domain-independent platform. It also has a spoken dialogue control model for improving dialogue efficiency and a dialogue management model for detecting misunderstanding in spoken dialogue systems.

Report

(4 results)
  • 2000 Annual Research Report   Final Research Report Summary
  • 1999 Annual Research Report
  • 1998 Annual Research Report
  • Research Products

    (44 results)

All Publications (44 results)

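  • [Publications] M.Yokoyama, K.Shirai: "Controlling Non-verbal Information in Speaker-changing for Spoken Dialogue Interface of Humanoid Robot" (in Japanese). Transactions of IPSJ. Vol.40, No.2. 487-496 (1999)

    • Description
      From the "Summary of Research Results Report (Japanese)"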
    • Related Report
      2000 Final Research Report Summary
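  • [Publications] N.Murai, T.Kobayashi: "Dictation of Multiparty Conversation Considering Speaker Individuality and Turn Taking" (in Japanese). Transactions of IEICE D-II. Vol.J83, No.11. 2465-2472 (2000)

    • Description
      From the "Summary of Research Results Report (Japanese)"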
    • Related Report
      2000 Final Research Report Summary
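  • [Publications] K.Masumitsu, T.Kobayashi: "Partly-Hidden Markov Model and Its Application to Gesture Recognition" (in Japanese). Transactions of IPSJ. Vol.41, No.11. 3060-3069 (2000)

    • Description
      From the "Summary of Research Results Report (Japanese)"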
    • Related Report
      2000 Final Research Report Summary
  • [Publications] H.Kikuchi,K.Shirai: "Controlling Gaze of Humanoid in Communication with Human"Proc.of International Conference on Intelligent Robots and Systems (IROS). Vol.1. 255-260 (1998)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] H.Kikuchi,K.Shirai: "Multimodal Communication Between Human and Robot"Proc.of International Wireless and Telecommunications Symposium (IWTS). 322-325 (1998)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] M.Yokoyama,K.Shirai: "Use of Non-Verbal Information in Communication between Human and Robot"Proc.of International Conference on Spoken Language Processing (ICSLP). 2351-2354 (1998)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] H.Kikuchi,K.Shirai: "Controlling Dialogue Strategy According to Performance of Processes"ESCA Workshop,Session5.2. 85-88 (1999)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] S.Okawa,K.Shirai: "A Recombination Strategy for Multi-band Speech Recognition Based on Mutual Information Criterion"6th European Conference on Speech Communication and Technology : EUROSPEECH'99. Vol.2. 603-606 (1999)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Y.Matsusaka,T.Kobayashi: "Multi-person Conversation Robot using Multi-modal Interface"SCI'99. Vol.7. 450-455 (1999)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] N.Murai,T.Kobayashi: "DICTATION OF MULTIPARTY CONVERSATION USING STATISTICAL TURN TAKING MODEL AND SPEAKER MODEL"Proc.of International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Vol.3. 1575-1578 (2000)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] K.Aoyama,K.Shirai: "Controlling Non-verbal Information in Speaker-change for Spoken Dialogue"2000 IEEE International Conference on Systems Man and Cybernetics (SMC2000). 1354-1359 (2000)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] K.Aoyama,K.Shirai: "DESIGNING A DOMAIN INDEPENDENT PLATFORM OF SPOKEN DIALOGUE SYSTEM"Proc.of International Conference on Spoken Language Processing (ICSLP). (CD-ROM). (2000)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] M.Murakami,K.Shirai: "Accurate Extraction of Human Face Area using Subspace Method and Genetic Algorithm"Proc.of International Conference Multimedia and Expo. 411-414 (2000)

    • Description
      From the "Summary of Research Results Report (Japanese)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] M.Yokoyama, K.Shirai: "Controlling Non-verbal Information in Speaker-changing For Spoken Dialogue Interface of Humanoid Robot"Transactions of IPSJ. Vol.40, No.2. 487-496 (1999)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] N.Murai, T.Kobayashi: "Dictation of Multiparty Conversation Considering Speaker Individuality and Turn Taking"Transactions of IEICE. D-II, Vol.J83-D-II, No.11. 2465-2472 (2000)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] K.Masumitsu, T.Kobayashi: "Partly-Hidden Markov Model and Its Application To Gesture Recognition"Transactions of IPSJ. Vol.41, No.11. 3060-3069 (2000)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] H.Kikuchi, K.Shirai: "Controlling Gaze of Humanoid in Communication with Human"Proc.of International Conference on Intelligent Robots and Systems (IROS). Vol.1. 255-260 (1998)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] H.Kikuchi, K.Shirai: "Multimodal Communication Between Human and Robot"Proc.of International Wireless and Telecommunications Symposium (IWTS). 322-325 (1998)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] M.Yokoyama, K.Shirai: "Use of Non-Verbal Information in Communication between Human and Robot"Proc.of International Conference on Spoken Language Processing (ICSLP). 2351-2354 (1998)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] H.Kikuchi, K.Shirai: "Controlling Dialogue Strategy According to Performance of Processes"ESCA Workshop. Session5.2. 85-88 (1999)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] S.Okawa, K.Shirai: "A Recombination Strategy for Multi-band Speech Recognition Based on Mutual Information Criterion"6th European Conference on Speech Communication and Technology : EUROSPEECH'99. Vol.2. 603-606 (1999)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Y.Matsusaka, T.Kobayashi: "Multi-person Conversation Robot using Multi-modal Interface"SCI'99. Vol.7. 450-455 (1999)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] N.Murai, T.Kobayashi: "DICTATION OF MULTIPARTY CONVERSATION USING STATISTICAL TURN TAKING MODEL AND SPEAKER MODEL"Proc.of International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Vol.3. 1575-1578 (2000)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] K.Aoyama, K.Shirai: "Controlling Non-verbal Information in Speaker-change for Spoken Dialogue"2000 IEEE International Conference on Systems Man and Cybernetics (SMC2000). 1354-1359 (2000)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] K.Aoyama, K.Shirai: "DESIGNING A DOMAIN INDEPENDENT PLATFORM OF SPOKEN DIALOGUE SYSTEM"Proc.of International Conference on Spoken Language Processing (ICSLP), CD-ROM. (2000)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] M.Murakami, K.Shirai: "Accurate Extraction of Human Face Area using Subspace Method and Genetic Algorithm"Proc.of International Conference Multimedia and Expo. 411-414 (2000)

    • Description
      From the "Summary of Research Results Report (English)"
    • Related Report
      2000 Final Research Report Summary
  • [Publications] Kazumi Aoyama: "Controlling Non-verbal Information in Speaker-change for Spoken Dialogue"IEEE Proc.of SMC2000. 1354-1359 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Kazumi Aoyama: "DESIGNING A DOMAIN INDEPENDENT PLATFORM OF SPOKEN DIALOGUE SYSTEM"Proc.of ICSLP 2000. CD-ROM (2000)

    • Related Report
      2000 Annual Research Report
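  • [Publications] Noriyuki Murai: "Dictation of Multiparty Conversation Considering Speaker Individuality and Turn Taking" (in Japanese). Transactions of IEICE D-II. Vol.J83, No.11. 2465-2472 (2000)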

    • Related Report
      2000 Annual Research Report
  • [Publications] K.Masumitsu: "Partly-Hidden Markov Model and Its Application to Gesture Recognition" (in Japanese). Transactions of IPSJ. Vol.41, No.11. 3060-3069 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Makoto Murakami: "Accurate Extraction of Human Face Area using Subspace Method and Genetic Algorithm"Proc.of International Conference Multimedia and Expo. 411-414 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Noriyuki Murai: "DICTATION OF MULTIPARTY CONVERSATION USING STATISTICAL TURN TAKING MODEL AND SPEAKER MODEL"Proc.of ICASSP 2000. Vol.3. 1575-1578 (2000)

    • Related Report
      2000 Annual Research Report
  • [Publications] Hideaki Kikuchi et al.: "Controlling Dialogue Strategy According to Performance of Processes" Proc. of ESCA Workshop. 85-88 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Shigeki Okawa et al.: "A Recombination Strategy for Multi-band Speech Recognition Based on Mutual Information Criterion" Proc. of EUROSPEECH'99. Vol.2. 603-606 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] 中島 雄大 et al.: "Evaluation of the Information Content of Sub-band Features for Multi-band Speech Recognition" (in Japanese). IEICE Technical Report. SP99-97. 25-30 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Kazumi Aoyama et al.: "A Study of a General-Purpose Platform for Spoken Dialogue Systems" (in Japanese). IPSJ SIG Technical Report. SLP-30. 7-12 (2000)

    • Related Report
      1999 Annual Research Report
  • [Publications] Yosuke Matsusaka et al.: "Multi-person Conversation via Multi-modal Interface" Proc. of EUROSPEECH '99. Vol.4. 1723-1726 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Shigeki Ohira: "Proposal and Evaluation of Significant Word Selection Method."Proc. of the First NTCIR Workshop on R-JTRTR. 109-116 (1999)

    • Related Report
      1999 Annual Research Report
  • [Publications] Hideaki Kikuchi, Katsuhiko Shirai: "Controlling Gaze of Humanoid in Communication with Human" Proc. of International Conference on Intelligent Robots and Systems. Vol.1. 255-260 (1998)

    • Related Report
      1998 Annual Research Report
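  • [Publications] Masao Yokoyama, Katsuhiko Shirai: "Controlling Non-verbal Information in Speaker-changing for Spoken Dialogue Interface of Humanoid Robot" (in Japanese). Transactions of IPSJ. February issue. (1999)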

    • Related Report
      1998 Annual Research Report
  • [Publications] Masao Yokoyama, Katsuhiko Shirai: "Use of Non-Verbal Information in Communication between Human and Robot" Proc. of International Conference on Spoken Language Processing. 2351-2354 (1998)

    • Related Report
      1998 Annual Research Report
  • [Publications] Hideaki Kikuchi, Katsuhiko Shirai: "Multimodal Communication Between Human and Robot" Proc. of International Wireless and Telecommunications Symposium. 322-325 (1998)

    • Related Report
      1998 Annual Research Report
  • [Publications] K.Masumitsu, Katsuhiko Shirai: "Partly-Hidden Markov Model and Its Application to Gesture Recognition" (in Japanese). IEICE Technical Report. PRMU97-203. 35-62 (1998)

    • Related Report
      1998 Annual Research Report
  • [Publications] Yukinori Takubo, Katsuhiko Shirai: "Iwanami Lectures: The Science of Language, Vol. 2: Speech" (in Japanese). Iwanami Shoten, 249 (1998)

    • Related Report
      1998 Annual Research Report

Published: 1998-04-01   Modified: 2016-04-21  