Structure Extraction and Visualization of Spontaneous Speech Communication

Research Project

Project/Area Number 19300061
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Research Field Perception information processing/Intelligent robotics
Research Institution Kyoto University

Principal Investigator

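KAWAHARA Tatsuya  Kyoto University, Academic Center for Computing and Media Studies, Professor (00234104)

Co-Investigator (Kenkyū-buntansha) NAKAMURA Yuichi  Kyoto University, Academic Center for Computing and Media Studies, Professor (40227947)
AKITA Yuya  Kyoto University, Academic Center for Computing and Media Studies, Assistant Professor (90402742)
UCHIMOTO Kiyotaka  National Institute of Information and Communications Technology, Knowledge Creating Communication Research Center, Senior Researcher (60358885)
MORI Shinsuke  Kyoto University, Academic Center for Computing and Media Studies, Associate Professor (90456773)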
Project Period (FY) 2007 – 2009
Project Status Completed (Fiscal Year 2009)
Budget Amount
¥17,940,000 (Direct Cost: ¥13,800,000, Indirect Cost: ¥4,140,000)
Fiscal Year 2009: ¥5,460,000 (Direct Cost: ¥4,200,000, Indirect Cost: ¥1,260,000)
Fiscal Year 2008: ¥5,460,000 (Direct Cost: ¥4,200,000, Indirect Cost: ¥1,260,000)
Fiscal Year 2007: ¥7,020,000 (Direct Cost: ¥5,400,000, Indirect Cost: ¥1,620,000)
Keywords Spoken language processing / Spontaneous speech / Speech recognition / Language analysis / Metadata annotation / Media retrieval / Video analysis
Research Abstract

For effective exploitation of large-scale audio archives such as lectures, conferences and meetings, we investigate automatic speech recognition of these kinds of spontaneous speech communication, as well as extraction of linguistic structures and effective presentation. Automatic transcription systems for academic lectures, classroom lectures and parliamentary meetings are implemented.

Report

(4 results)
  • 2009 Annual Research Report   Final Research Report ( PDF )
  • 2008 Annual Research Report
  • 2007 Annual Research Report
  • Research Products

    (60 results)

Journal Article (23 results) (of which Peer Reviewed: 9 results), Presentation (34 results), Book (2 results), Patent (Industrial Property Rights) (1 result)

  • [Journal Article] Online unsupervised classification with model comparison in the Variational Bayes framework for voice activity detection. 2010

    • Author(s)
      D. Cournapeau, S. Watanabe, A. Nakamura, T. Kawahara
    • Journal Title

      IEEE J. Selected Topics in Signal Processing (accepted for publication)

    • NAID

      120002598753

    • Related Report
      2009 Final Research Report
  • [Journal Article] Gaussian mixture optimization based on efficient cross-validation. 2010

    • Author(s)
      T. Shinozaki, S. Furui, T. Kawahara
    • Journal Title

      IEEE J. Selected Topics in Signal Processing (accepted for publication)

    • NAID

      110006381954

    • Related Report
      2009 Final Research Report
  • [Journal Article] Statistical transformation of language and pronunciation models for spontaneous speech recognition. 2010

    • Author(s)
      Y. Akita, T. Kawahara
    • Journal Title

      IEEE Trans. Audio, Speech & Language Process. (accepted for publication)

    • NAID

      120002511319

    • Related Report
      2009 Final Research Report
  • [Journal Article] Speech activity detection for multi-party conversation analyses based on likelihood ratio test on spatial magnitude estimation. 2010

    • Author(s)
      K. Ishizuka, S. Araki, T. Kawahara
    • Journal Title

      IEEE Trans. Audio, Speech & Language Process. Vol.18 (accepted for publication)

    • Related Report
      2009 Final Research Report
  • [Journal Article] Bayes risk-based dialogue management for document retrieval system with speech interface. 2010

    • Author(s)
      T. Misu, T. Kawahara
    • Journal Title

      Speech Communication Vol.52,No.1

      Pages: 61-71

    • Related Report
      2009 Final Research Report
  • [Journal Article] Online unsupervised classification with model comparison in the Variational Bayes framework for voice activity detection. 2010

    • Author(s)
      D. Cournapeau, S. Watanabe, A. Nakamura, T. Kawahara
    • Journal Title

      IEEE J. Selected Topics in Signal Processing (accepted for publication)

    • NAID

      120002598753

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Statistical transformation of language and pronunciation models for spontaneous speech recognition. 2010

    • Author(s)
      Y. Akita, T. Kawahara
    • Journal Title

      IEEE Trans. Audio, Speech & Language Processing Vol. 18 (accepted for publication)

    • NAID

      120002511319

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Speech activity detection for multi-party conversation analyses based on likelihood ratio test on spatial magnitude estimation. 2010

    • Author(s)
      K. Ishizuka, S. Araki, T. Kawahara
    • Journal Title

      IEEE Trans. Audio, Speech & Language Processing Vol. 18 (accepted for publication)

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Effective prediction of errors by non-native speakers using decision tree for speech recognition-based CALL system. 2009

    • Author(s)
      H. Wang, T. Kawahara
    • Journal Title

      IEICE Trans. Vol.E92-D,No.12

      Pages: 2462-2468

    • NAID

      10026812661

    • Related Report
      2009 Final Research Report
  • [Journal Article] Computer assisted language learning system based on dynamic question generation and error prediction for automatic speech recognition. 2009

    • Author(s)
      H. Wang, C.J. Waple, T. Kawahara
    • Journal Title

      Speech Communication Vol.51,No.10

      Pages: 995-1005

    • Related Report
      2009 Final Research Report
  • [Journal Article] Estimation of clause and sentence boundaries in spontaneous speech using local dependency information (in Japanese). 2009

    • Author(s)
      西光雅弘, 秋田祐哉, 高梨克也, 尾嶋憲治, 河原達也
    • Journal Title

      IPSJ Journal Vol.50,No.2

      Pages: 544-552

    • NAID

      110007970350

    • Related Report
      2009 Final Research Report
  • [Journal Article] Lecture speech recognition by language model adaptation using slide information (in Japanese). 2009

    • Author(s)
      河原達也, 根本雄介, 勝丸徳浩, 秋田祐哉
    • Journal Title

      IPSJ Journal Vol.50,No.2

      Pages: 469-476

    • NAID

      110007970343

    • Related Report
      2009 Final Research Report
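  • [Journal Article] Automatic identification of quoted and inserted clauses in spontaneous speech and its application to dependency parsing (in Japanese). 2009

    • Author(s)
      浜辺良二, 内元清貴, 河原達也, 井佐原均
    • Journal Title

      Journal of Natural Language Processing Vol.16,No.1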

      Pages: 3-23

    • NAID

      10024758516

    • Related Report
      2009 Final Research Report
  • [Journal Article] Estimation of clause and sentence boundaries in spontaneous speech using local dependency information (in Japanese). 2009

    • Author(s)
      西光雅弘, 秋田祐哉, 高梨克也, 尾嶋憲治, 河原達也
    • Journal Title

      IPSJ Journal Vol. 50, No. 2

      Pages: 544-552

    • NAID

      110007970350

    • Related Report
      2008 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Lecture speech recognition by language model adaptation using slide information (in Japanese). 2009

    • Author(s)
      河原達也, 根本雄介, 勝丸徳浩, 秋田祐哉
    • Journal Title

      IPSJ Journal Vol. 50, No. 2

      Pages: 469-476

    • NAID

      110007970343

    • Related Report
      2008 Annual Research Report
    • Peer Reviewed
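  • [Journal Article] Automatic identification of quoted and inserted clauses in spontaneous speech and its application to dependency parsing (in Japanese). 2009

    • Author(s)
      浜辺良二, 内元清貴, 河原達也, 井佐原均
    • Journal Title

      Journal of Natural Language Processing Vol. 16, No. 1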

      Pages: 3-23

    • NAID

      10024758516

    • Related Report
      2008 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Voice activity detection based on high order statistics and online EM algorithm. 2008

    • Author(s)
      D. Cournapeau, T. Kawahara
    • Journal Title

      IEICE Trans. Vol.E91-D,No.12

      Pages: 2854-2861

    • NAID

      10026806855

    • Related Report
      2009 Final Research Report
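  • [Journal Article] Speech recognition based on a minimum Bayes-risk framework oriented toward speech understanding (in Japanese). 2008

    • Author(s)
      南條浩輝, 河原達也, 七里崇
    • Journal Title

      IEICE Trans. (Japanese Edition) Vol.J91-D,No.5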

      Pages: 1314-1324

    • NAID

      110007380122

    • Related Report
      2009 Final Research Report
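  • [Journal Article] Speech recognition based on a minimum Bayes-risk framework oriented toward speech understanding (in Japanese). 2008

    • Author(s)
      南條浩輝, 河原達也, 七里崇
    • Journal Title

      IEICE Trans. (Japanese Edition) J91-D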

      Pages: 1314-1324

    • NAID

      110007380122

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Speech-based information guidance system with question-answering and information recommendation functions (in Japanese). 2007

    • Author(s)
      翠輝久, 河原達也, 正司哲朗, 美濃導彦
    • Journal Title

      IPSJ Journal Vol.48,No.12

      Pages: 3602-3611

    • NAID

      110006531940

    • Related Report
      2009 Final Research Report
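  • [Journal Article] Building language models for spoken dialogue systems by selecting Web texts in consideration of domain and style (in Japanese). 2007

    • Author(s)
      翠輝久, 河原達也
    • Journal Title

      IEICE Trans. (Japanese Edition) Vol.J90-D,No.11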

      Pages: 3024-3032

    • NAID

      110007380619

    • Related Report
      2009 Final Research Report
  • [Journal Article] Speech-based information guidance system with question-answering and information recommendation functions (in Japanese). 2007

    • Author(s)
      翠輝久, 河原達也, 正司哲朗, 美濃導彦
    • Journal Title

      IPSJ Journal 48

      Pages: 3602-3611

    • NAID

      110006531940

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
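  • [Journal Article] Building language models for spoken dialogue systems by selecting Web texts in consideration of domain and style (in Japanese). 2007

    • Author(s)
      翠輝久, 河原達也
    • Journal Title

      IEICE Trans. (Japanese Edition) J90-D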

      Pages: 3024-3032

    • NAID

      110007380619

    • Related Report
      2007 Annual Research Report
    • Peer Reviewed
  • [Presentation] Improved statistical models for SMT-based speaking style transformation. 2010

    • Author(s)
      G. Neubig, Y. Akita, S. Mori, T. Kawahara
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Dallas, USA
    • Related Report
      2009 Final Research Report
  • [Presentation] Optimizing spectral subtraction and Wiener filtering for robust speech recognition in reverberant and noisy conditions. 2010

    • Author(s)
      R. Gomez, T. Kawahara
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Dallas, USA
    • Related Report
      2009 Final Research Report
  • [Presentation] Using online model comparison in the Variational Bayes framework for online unsupervised voice activity detection. 2010

    • Author(s)
      D. Cournapeau, S. Watanabe, A. Nakamura, T. Kawahara
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Dallas, USA
    • Related Report
      2009 Final Research Report
  • [Presentation] Transcription system using automatic speech recognition for the Japanese parliament (Diet). 2009

    • Author(s)
      T. Kawahara
    • Organizer
      INTERSTENO
    • Place of Presentation
      Beijing, China (invited talk)
    • Year and Date
      2009-08-19
    • Related Report
      2009 Annual Research Report
  • [Presentation] New perspectives on spoken language understanding: Does machine need to fully understand speech? 2009

    • Author(s)
      T. Kawahara
    • Organizer
      In Proc. IEEE Workshop on Automatic Speech Recognition and Understanding
    • Place of Presentation
      Merano, Italy
    • Related Report
      2009 Final Research Report
  • [Presentation] Tight integration of dereverberation and automatic speech recognition. 2009

    • Author(s)
      R. Gomez, T. Kawahara
    • Organizer
      In Proc. APSIPA ASC
    • Place of Presentation
      Sapporo
    • Related Report
      2009 Final Research Report
  • [Presentation] Recent development of open-source speech recognition engine Julius. 2009

    • Author(s)
      A. Lee, T. Kawahara
    • Organizer
      In Proc. APSIPA ASC
    • Place of Presentation
      Sapporo
    • Related Report
      2009 Final Research Report
  • [Presentation] A WFST-based log-linear framework for speaking-style transformation. 2009

    • Author(s)
      G. Neubig, S. Mori, T. Kawahara
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brighton, UK
    • Related Report
      2009 Final Research Report
  • [Presentation] Optimization of dereverberation parameters based on likelihood of speech recognizer. 2009

    • Author(s)
      R. Gomez, T. Kawahara
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brighton, UK
    • Related Report
      2009 Final Research Report
  • [Presentation] Acoustic event detection for spotting hot spots in podcasts. 2009

    • Author(s)
      K. Sumi, T. Kawahara, J. Ogata, M. Goto
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brighton, UK
    • Related Report
      2009 Final Research Report
  • [Presentation] Automatic transcription system for meetings of the Japanese. 2009

    • Author(s)
      Y. Akita, M. Mimura, T. Kawahara
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brighton, UK
    • Related Report
      2009 Final Research Report
  • [Presentation] Language model transformation applied to lightly supervised training of acoustic model for congress meetings. 2009

    • Author(s)
      T. Kawahara, M. Mimura, Y. Akita
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Taipei
    • Related Report
      2009 Final Research Report
  • [Presentation] Automatic lecture transcription by exploiting presentation slide information for language model adaptation. 2008

    • Author(s)
      T. Kawahara, Y. Nemoto, Y. Akita
    • Organizer
      IEEE-ICASSP
    • Place of Presentation
      Las Vegas, USA
    • Year and Date
      2008-04-01
    • Related Report
      2008 Annual Research Report
  • [Presentation] Extracting word-pronunciation pairs from comparable set of text and speech. 2008

    • Author(s)
      T. Sasada, S. Mori, T. Kawahara
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brisbane, Australia
    • Related Report
      2009 Final Research Report
  • [Presentation] A Japanese CALL system based on dynamic question generation and error prediction for ASR. 2008

    • Author(s)
      H. Wang, T. Kawahara
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brisbane, Australia
    • Related Report
      2009 Final Research Report
  • [Presentation] Detection of feeling through back-channels in spoken dialogue. 2008

    • Author(s)
      T. Kawahara, M. Toyokura, T. Misu, C. Hori
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brisbane, Australia
    • Related Report
      2009 Final Research Report
  • [Presentation] Multi-modal recording, analysis and indexing of poster sessions. 2008

    • Author(s)
      T. Kawahara, H. Setoguchi, K. Takanashi, K. Ishizuka, S. Araki
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brisbane, Australia
    • Related Report
      2009 Final Research Report
  • [Presentation] Statistical speech activity detection based on spatial power distribution for analyses of poster presentations. 2008

    • Author(s)
      K. Ishizuka, S. Araki, T. Kawahara
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brisbane, Australia
    • Related Report
      2009 Final Research Report
  • [Presentation] Bayes risk-based dialogue management for document retrieval system with speech interface. 2008

    • Author(s)
      T. Misu, T. Kawahara
    • Organizer
      In Proc. COLING, Vol. Posters & Demo.
    • Place of Presentation
      Manchester, UK
    • Related Report
      2009 Final Research Report
  • [Presentation] Effective error prediction using decision tree for ASR grammar network in CALL system. 2008

    • Author(s)
      H. Wang, T. Kawahara
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Las Vegas, USA
    • Related Report
      2009 Final Research Report
  • [Presentation] Automatic lecture transcription by exploiting presentation slide information for language model adaptation. 2008

    • Author(s)
      T. Kawahara, Y. Nemoto, Y. Akita
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Las Vegas, USA
    • Related Report
      2009 Final Research Report
  • [Presentation] Using Variational Bayes Free Energy for unsupervised voice activity detection. 2008

    • Author(s)
      D. Cournapeau, T. Kawahara
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Las Vegas, USA
    • Related Report
      2009 Final Research Report
  • [Presentation] GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation. 2008

    • Author(s)
      T. Shinozaki, T. Kawahara
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Las Vegas, USA
    • Related Report
      2009 Final Research Report
  • [Presentation] Speech-based interactive information guidance system using question-answering technique. 2007

    • Author(s)
      T. Misu, T. Kawahara
    • Organizer
      IEEE-ICASSP
    • Place of Presentation
      USA
    • Year and Date
      2007-04-18
    • Related Report
      2007 Annual Research Report
  • [Presentation] HMM training based on CV-EM and CV Gaussian mixture optimization. 2007

    • Author(s)
      T. Shinozaki, T. Kawahara
    • Organizer
      In Proc. IEEE Workshop on Automatic Speech Recognition and Understanding
    • Place of Presentation
      Kyoto
    • Related Report
      2009 Final Research Report
  • [Presentation] Evaluation of real-time voice activity detection based on high order statistics. 2007

    • Author(s)
      D. Cournapeau, T. Kawahara
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brussels, Belgium
    • Related Report
      2009 Final Research Report
  • [Presentation] Bayes risk-based optimization of dialogue management for document retrieval system with speech interface. 2007

    • Author(s)
      T. Misu, T. Kawahara
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brussels, Belgium
    • Related Report
      2009 Final Research Report
  • [Presentation] Evaluating and optimizing Japanese tutor system featuring dynamic question generation and interactive guidance. 2007

    • Author(s)
      C. Waple, H. Wang, T. Kawahara, Y. Tsubota, M. Dantsuji
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brussels, Belgium
    • Related Report
      2009 Final Research Report
  • [Presentation] Gaussian mixture optimization for HMM based on efficient cross-validation. 2007

    • Author(s)
      T. Shinozaki, T. Kawahara
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brussels, Belgium
    • Related Report
      2009 Final Research Report
  • [Presentation] PLSA-based topic detection in meetings for adaptation of lexicon and language model. 2007

    • Author(s)
      Y. Akita, Y. Nemoto, T. Kawahara
    • Organizer
      In Proc. INTERSPEECH
    • Place of Presentation
      Brussels, Belgium
    • Related Report
      2009 Final Research Report
  • [Presentation] An interactive framework for document retrieval and presentation with question-answering function in restricted domain. 2007

    • Author(s)
      T. Misu, T. Kawahara
    • Organizer
      In Proc. IEA/AIE
    • Place of Presentation
      Kyoto
    • Related Report
      2009 Final Research Report
  • [Presentation] Speech-based interactive information guidance system using question-answering technique. 2007

    • Author(s)
      T. Misu, T. Kawahara
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Honolulu, USA
    • Related Report
      2009 Final Research Report
  • [Presentation] Automatic detection of sentence and clause units using local syntactic dependency. 2007

    • Author(s)
      T. Kawahara, M. Saikou, K. Takanashi
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Honolulu, USA
    • Related Report
      2009 Final Research Report
  • [Presentation] Topic-independent speaking-style transformation of language model for spontaneous speech recognition. 2007

    • Author(s)
      Y. Akita, T. Kawahara
    • Organizer
      In Proc. IEEE-ICASSP
    • Place of Presentation
      Honolulu, USA
    • Related Report
      2009 Final Research Report
  • [Book] 2008

    • Author(s)
      S. Furui, T. Kawahara
    • Publisher
      Springer
    • Related Report
      2009 Final Research Report
  • [Book] Springer Handbook of Speech Processing. 2008

    • Author(s)
      Sadaoki Furui and Tatsuya Kawahara
    • Publisher
      Springer
    • Related Report
      2007 Annual Research Report
  • [Patent (Industrial Property Rights)] Acoustic model training device, speech recognition device, and computer program for acoustic model training. 2009

    • Inventor(s)
      三村正人, 河原達也
    • Industrial Property Rights Holder
      Kyoto University
    • Industrial Property Number
      2009-094212
    • Filing Date
      2009-04-08
    • Related Report
      2009 Annual Research Report 2009 Final Research Report

Published: 2007-04-01   Modified: 2016-04-21  
