
Advanced indexing based on spoken document retrieval and its feedback

Research Project

Project/Area Number 25330128
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Multi-year Fund
Section General
Research Field Multimedia database
Research Institution Shizuoka University

Principal Investigator

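KAI Atsuhiko  Shizuoka University, Faculty of Engineering, Associate Professor (60283496)

Co-Investigator (Kenkyū-buntansha) WANG Longbiao  Nagaoka University of Technology, Institute of GIGAKU (技学研究院), Associate Professor (30510458)
Co-Investigator (Renkei-kenkyūsha) KOGURE Satoru  Shizuoka University, Faculty of Informatics, Lecturer (40359758)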
Project Period (FY) 2013-04-01 – 2016-03-31
Project Status Completed (Fiscal Year 2015)
Budget Amount
¥4,940,000 (Direct Cost: ¥3,800,000, Indirect Cost: ¥1,140,000)
Fiscal Year 2015: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
Fiscal Year 2014: ¥1,560,000 (Direct Cost: ¥1,200,000, Indirect Cost: ¥360,000)
Fiscal Year 2013: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
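Keywords spoken document retrieval / spoken term detection / STD / spoken query / DNN / speech recognition confidence / score normalization / voice activity detection / noisy and reverberant environments / dereverberation / recognition accuracy estimation / VAD / speaker recognition / confidence measure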
Outline of Final Research Achievements

We investigated and developed elemental technologies for indexing and related processes, aimed at the efficient and sustainable development of spoken document retrieval systems. To cope with changes in speech features caused by differing recording conditions and speakers, we proposed DNN-based voice activity detection (VAD) and dereverberation models as a front end for speaker diarization and speech recognition systems, and improved the accuracy of those systems. We also proposed a DNN-based feature transformation as a rescoring step in a spoken term detection (STD) system to cope with out-of-vocabulary words, which significantly improved STD performance.

Report

(4 results)
  • 2015 Annual Research Report   Final Research Report ( PDF )
  • 2014 Research-status Report
  • 2013 Research-status Report
  • Research Products

    (20 results)

Journal Article (10 results) (of which Peer Reviewed: 9 results, Open Access: 3 results, Acknowledgement Compliant: 1 result) Presentation (10 results) (of which Int'l Joint Research: 1 result)

  • [Journal Article] Combination of bottleneck feature extraction and dereverberation for distant-talking speech recognition (2015)

    • Author(s)
      Bo Ren, Longbiao Wang, Liang Lu, Yuma Ueda, Atsuhiko Kai
    • Journal Title

      Multimedia Tools and Applications

      Volume: 75 Pages: 1-16

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Environment-dependent denoising autoencoder for distant-talking speech recognition (2015)

    • Author(s)
      Y. Ueda, L. Wang, A. Kai, B. Ren
    • Journal Title

      EURASIP Journal on Advances in Signal Processing

      Volume: 2015:92 Issue: 1 Pages: 1-11

    • DOI

      10.1186/s13634-015-0278-y

    • Related Report
      2015 Annual Research Report
    • Peer Reviewed / Open Access / Acknowledgement Compliant
  • [Journal Article] Distant-talking speaker identification by generalized spectral subtraction-based dereverberation and its efficient computation (2014)

    • Author(s)
      Zhaofeng Zhang, Longbiao Wang and Atsuhiko Kai
    • Journal Title

      EURASIP Journal on Audio, Speech, and Music Processing

      Volume: 2014:15 Issue: 1 Pages: 1-12

    • DOI

      10.1186/1687-4722-2014-15

    • Related Report
      2014 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Combining Subword and State-level Dissimilarity Measures for Improved Spoken Term Detection in NTCIR-11 SpokenQuery&Doc Task (2014)

    • Author(s)
      Mitsuaki Makino and Atsuhiko Kai
    • Journal Title

      Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies

      Volume: - Pages: 413-418

    • Related Report
      2014 Research-status Report
    • Open Access
  • [Journal Article] Utilizing State-level Distance Vector Representation for Improved Spoken Term Detection by Text and Spoken Queries (2014)

    • Author(s)
      Mitsuaki Makino, Naoki Yamamoto, Atsuhiko Kai
    • Journal Title

      Proceedings of the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014)

      Volume: - Pages: 1732-1736

    • Related Report
      2014 Research-status Report
    • Peer Reviewed
  • [Journal Article] Denoising autoencoder and environment adaptation for distant-talking speech recognition with asynchronous speech recording (2014)

    • Author(s)
      Longbiao Wang, Bo Ren, Yuma Ueda, Atsuhiko Kai, Shunta Teraoka and Taku Fukushima
    • Journal Title

      Proceedings of Asia-Pacific Signal Information Processing Association Annual Summit and Conference (APSIPA ASC)

      Volume: - Pages: 1-5

    • DOI

      10.1109/apsipa.2014.7041548

    • Related Report
      2014 Research-status Report
    • Peer Reviewed
  • [Journal Article] Single-channel dereverberation for distant-talking speech recognition by combining denoising autoencoder and temporal structure normalization (2014)

    • Author(s)
      Yuma Ueda, Longbiao Wang, Atsuhiko Kai, Xiong Xiao, EngSiong Chng and Haizhou Li
    • Journal Title

      Proceedings of the 9th International Symposium on Chinese Spoken Language Processing (ISCSLP 2014)

      Volume: - Pages: 379-383

    • DOI

      10.1109/iscslp.2014.6936613

    • Related Report
      2014 Research-status Report
    • Peer Reviewed
  • [Journal Article] Single-sided Approach to Discriminative PLDA Training for Text-Independent Speaker Verification without Using Expanded I-vector (2014)

    • Author(s)
      Ikuya Hirano, Kong Aik Lee, Zhaofeng Zhang, Longbiao Wang and Atsuhiko Kai
    • Journal Title

      Proceedings of the 9th International Symposium on Chinese Spoken Language Processing (ISCSLP 2014)

      Volume: - Pages: 59-63

    • DOI

      10.1109/iscslp.2014.6936581

    • Related Report
      2014 Research-status Report
    • Peer Reviewed
  • [Journal Article] Using Acoustic Dissimilarity Measures Based on State-Level Distance Vector Representation for Improved Spoken Term Detection (2013)

    • Author(s)
      Naoki Yamamoto, Atsuhiko Kai
    • Journal Title

      Proc. of APSIPA Annual Summit and Conference 2013

      Volume: - Pages: 1-4

    • DOI

      10.1109/apsipa.2013.6694151

    • Related Report
      2013 Research-status Report
    • Peer Reviewed
  • [Journal Article] Improvement of distant-talking speaker identification using bottleneck features of DNN (2013)

    • Author(s)
      Takanori Yamada, Longbiao Wang, Atsuhiko Kai
    • Journal Title

      Proc. of INTERSPEECH 2013

      Volume: - Pages: 3661-3664

    • Related Report
      2013 Research-status Report
    • Peer Reviewed
  • [Presentation] Combining State-level and DNN-based Acoustic Matches for Efficient Spoken Term Detection in NTCIR-12 SpokenQuery&Doc-2 Task (2016)

    • Author(s)
      Shuji Oishi, Tatsuya Matsuba, Mitsuaki Makino, Atsuhiko Kai
    • Organizer
      NTCIR 12 Conference
    • Place of Presentation
      National Center of Sciences (Tokyo)
    • Year and Date
      2016-06-08
    • Related Report
      2015 Annual Research Report
    • Int'l Joint Research
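  • [Presentation] Speech recognition in noisy and reverberant conditions using a cepstral-domain denoising autoencoder and DNN-HMM (2015)

    • Author(s)
      Yuma Ueda, Longbiao Wang, Atsuhiko Kai
    • Organizer
      Acoustical Society of Japan 2015 Spring Meeting
    • Place of Presentation
      Chuo University Korakuen Campus (Bunkyo-ku, Tokyo)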
    • Year and Date
      2015-03-17
    • Related Report
      2014 Research-status Report
  • [Presentation] Speech selection and environmental adaptation for asynchronous speech recording based on deep neural network (2014)

    • Author(s)
      Bo Ren, Longbiao Wang and Atsuhiko Kai
    • Organizer
      The 16th Spoken Language Symposium (IEICE)
    • Place of Presentation
      Tokyo Institute of Technology Suzukakedai Campus (Yokohama, Kanagawa)
    • Year and Date
      2014-12-16
    • Related Report
      2014 Research-status Report
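  • [Presentation] Speaker recognition in reverberant environments using DNN-based feature transformation (2014)

    • Author(s)
      Zhaofeng Zhang, Longbiao Wang, Atsuhiko Kai, Weifeng Li, Masahiro Iwahashi
    • Organizer
      The 16th Spoken Language Symposium (IEICE)
    • Place of Presentation
      Tokyo Institute of Technology Suzukakedai Campus (Yokohama, Kanagawa)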
    • Year and Date
      2014-12-16
    • Related Report
      2014 Research-status Report
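  • [Presentation] A study of deep neural networks and cross-adaptation for voice activity detection in meeting speech (2014)

    • Author(s)
      Akihiro Nakatani, Longbiao Wang, Atsuhiko Kai
    • Organizer
      The 16th Spoken Language Symposium (IEICE)
    • Place of Presentation
      Tokyo Institute of Technology Suzukakedai Campus (Yokohama, Kanagawa)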
    • Year and Date
      2014-12-15
    • Related Report
      2014 Research-status Report
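  • [Presentation] Distant-talking speech recognition using asynchronous speech recording (2014)

    • Author(s)
      Shunta Teraoka, Yuma Ueda, Longbiao Wang, Atsuhiko Kai, Taku Fukushima
    • Organizer
      Ongaku Symposium 2014 (IEICE)
    • Place of Presentation
      Nihon University College of Humanities and Sciences Campus (Setagaya-ku, Tokyo)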
    • Year and Date
      2014-05-24
    • Related Report
      2014 Research-status Report
  • [Presentation] Spoken Term Detection Using Distance-Vector based Dissimilarity Measures and Its Evaluation on the NTCIR-10 SpokenDoc-2 Task

    • Author(s)
      Naoki Yamamoto, Atsuhiko Kai
    • Organizer
      The 10th NTCIR Conference
    • Place of Presentation
      National Center of Sciences (Tokyo)
    • Related Report
      2013 Research-status Report
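  • [Presentation] Application of deep belief networks to noise-robust voice activity detection

    • Author(s)
      Akihiro Nakatani, Longbiao Wang, Atsuhiko Kai
    • Organizer
      Acoustical Society of Japan 2013 Autumn Meeting
    • Place of Presentation
      Toyohashi University of Technology (Aichi)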
    • Related Report
      2013 Research-status Report
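  • [Presentation] Improving spoken term detection by combining acoustic similarity based on inter-distribution distance vectors with subword posterior probabilities

    • Author(s)
      Naoki Yamamoto, Atsuhiko Kai
    • Organizer
      IPSJ Special Interest Group on Spoken Language Processing (SIG-SLP)
    • Place of Presentation
      University of Tsukuba Bunkyo Campus (Tokyo)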
    • Related Report
      2013 Research-status Report
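  • [Presentation] Improving spoken term detection from text and spoken queries using acoustic similarity based on an inter-distribution distance vector representation

    • Author(s)
      Mitsuaki Makino, Naoki Yamamoto, Atsuhiko Kai
    • Organizer
      The 8th Spoken Document Processing Workshop
    • Place of Presentation
      Toyohashi Civic Center (Aichi)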
    • Related Report
      2013 Research-status Report

Published: 2014-07-25   Modified: 2019-07-29  
