Proposal of Speech Affordance Based on the Structural Invariance Theorem and Construction of a Speech Recognition System Based on It

Publicly Offered Research

Project Area: Cyber Infrastructure for the Information-explosion Era
Project/Area Number: 21013015
Research Category: Grant-in-Aid for Scientific Research on Priority Areas
Allocation Type: Single-year Grants
Review Section: Science and Engineering
Research Institution: The University of Tokyo
Principal Investigator: 峯松 信明, The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (90273333)
Project Period (FY): 2009 – 2010
Project Status: Completed (Fiscal Year 2010)
Budget Amount: ¥6,900,000 (Direct Cost: ¥6,900,000)
Fiscal Year 2010: ¥3,400,000 (Direct Cost: ¥3,400,000)
Fiscal Year 2009: ¥3,500,000 (Direct Cost: ¥3,500,000)
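Keywords: speech affordance / structural representation of speech / f-divergence / foreign-language pronunciation assessment / speech recognition / autism / Gestalt perception / structural invariance theorem / transformation invariants / pronunciation proficiency estimation / non-linguistic factors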
Research Abstract

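The timbre of a voice depends on the size and shape of the speaker's vocal organs, so utterances with identical content still differ as sounds. Because differences in timbre can be modeled as mappings of the acoustic space, deriving mapping-invariant measures and representing speech with those measures alone makes speech processing robust to arbitrary mappings, that is, to speaker differences. This project aims to study 1) the derivation of mapping invariants and the proposal of speech affordance based on them; 2) the construction of a speech-affordance-based recognition system for isolated word utterances and its extension to continuous speech recognition; 3) the construction of speech-affordance-based foreign-language pronunciation assessment technology and its integration with conventional techniques; and 4) the construction of speech-affordance-based information-processing models for understanding the behavior of people with severe autism and for infants' vocal imitation. This fiscal year focused on item 3). In foreign-language pronunciation assessment, comparing teacher and learner speech directly amounts to judging how well the learner impersonates the teacher's voice. We therefore built a technique that represents pronunciation after removing the timbre bias caused by body size and age and then compares the two representations, and we integrated it with conventional, impersonation-style techniques based on the absolute properties of speech. The resulting technique operates robustly against differences in body size and achieves higher accuracy than conventional methods even when the mismatch is small. We also put it into practice, for example in English pronunciation teaching with a demonstration system. These results were highly regarded and led to invited talks at domestic and international conferences on foreign-language learning.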
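The idea of a mapping-invariant representation can be illustrated with a small sketch. It is not the project's implementation: the choice of one-dimensional Gaussians as speech events, the Bhattacharyya distance as the invariant measure, an affine map as the speaker difference, and the helper names `bhattacharyya`, `structure`, `affine`, and `structure_distance` are all assumptions made for illustration. The sketch builds a "structure" from the pairwise distances between events and shows that it is unchanged when every event is passed through the same invertible map, while a genuinely different arrangement of events yields a different structure.

```python
import math
from itertools import combinations


def bhattacharyya(g1, g2):
    """Bhattacharyya distance between two 1-D Gaussians given as (mean, variance)."""
    (m1, v1), (m2, v2) = g1, g2
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


def structure(events):
    """Pairwise distances between all events (upper triangle of the distance matrix)."""
    return [bhattacharyya(a, b) for a, b in combinations(events, 2)]


def affine(events, a, b):
    """Hypothetical speaker difference: x -> a*x + b applied to every Gaussian."""
    return [(a * m + b, a * a * v) for m, v in events]


def structure_distance(s1, s2):
    """Root-mean-square difference between two structures of equal size."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(s1, s2)) / len(s1))


# Illustrative values only; they are not data from the project.
teacher = [(1.0, 0.5), (3.0, 0.8), (6.0, 0.4)]
other_speaker = affine(teacher, a=1.7, b=-2.0)   # same pronunciation, different voice
learner = [(1.0, 0.5), (3.4, 0.8), (4.5, 0.4)]   # different arrangement of events

same = structure_distance(structure(teacher), structure(other_speaker))
diff = structure_distance(structure(teacher), structure(learner))
print(f"same pronunciation, mapped voice: {same:.2e}")
print(f"different pronunciation:          {diff:.2e}")
```

Comparing structures rather than the events themselves is what lets an assessment ignore the voice and respond only to how the sounds are arranged relative to one another. A practical system would work with multidimensional distributions estimated from real speech.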

Report

(2 results)
  • 2010 Annual Research Report
  • 2009 Annual Research Report
Research Products

(42 results)

Journal Article (24 results, of which Peer Reviewed: 24 results) / Presentation (16 results) / Book (2 results)

  • [Journal Article] Foreign-language pronunciation assessment using structural representation of speech and multilayer multiple regression (2011)

    • Author(s)
      鈴木雅之, 峯松信明, 広瀬啓吉
    • Journal Title

      IPSJ Journal

      Volume: 52 Pages: 1899-1909

    • NAID

      110008508020

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
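  • [Journal Article] Proposal of a method for acoustically separating and extracting the linguistic information in speech from non-linguistic information: a study toward human-like speech information processing (2011)

    • Author(s)
      峯松信明, 櫻庭京子, 西村多寿子, 喬宇, 朝川智, 鈴木雅之, 齋藤大輔
    • Journal Title

      IEICE Transactions on Information and Systems (Japanese Edition)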

      Volume: J94-D Pages: 12-26

    • NAID

      110008006543

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] English pronunciation in the global era and methods for its scientific analysis (2011)

    • Author(s)
      峯松信明
    • Journal Title

      Journal of the JACET Kanto Chapter

      Volume: 7 Pages: 5-14

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Speech Structure and its Application to Robust Speech Processing (2010)

    • Author(s)
      N.Minematsu, Y.Qiao, S.Asakawa, M.Suzuki
    • Journal Title

      Journal of New Generation Computing

      Volume: 28 Pages: 299-319

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A study of invariance of f-divergence and its application to speech recognition (2010)

    • Author(s)
      Y.Qiao, N.Minematsu
    • Journal Title

      IEEE Trans. on Signal Processing

      Volume: 58 Pages: 3884-3890

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Dialect-based speaker classification using speaker-invariant dialect features (2010)

    • Author(s)
      X.Ma, R.Xu, N.Minematsu, Y.Qiao, K.Hirose, A.Li
    • Journal Title

      Proc.Int.Symposium on Chinese Spoken Language Processing

      Volume: 1 Pages: 171-176

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Human speech model based on information separation and its application to speech processing (2010)

    • Author(s)
      N.Minematsu
    • Journal Title

      Proc.Int.Symposium on Chinese Spoken Language Processing

      Volume: 1 Pages: 477-482

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Improved generation of speech from its abstract and structural representation (2010)

    • Author(s)
      N.Minematsu, D.Saito, K.Hirose
    • Journal Title

      Proc.Int.Conf.on Signal Processing

      Volume: 1 Pages: 597-600

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Integration of multilayer regression with structure-based pronunciation assessment (2010)

    • Author(s)
      M.Suzuki, Y.Qiao, N.Minematsu, K.Hirose
    • Journal Title

      Proc.INTERSPEECH

      Volume: 1 Pages: 586-589

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Pronunciation proficiency estimation based on multilayer regression analysis using speaker-independent structural features (2010)

    • Author(s)
      M.Suzuki, Y.Qiao, N.Minematsu, K.Hirose
    • Journal Title

      Proc.Int.Workshop on Second Language Studies

      Volume: 1(CD-ROM)

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Human speech model based on information separation: collection or separation, that is the question (2010)

    • Author(s)
      N.Minematsu
    • Journal Title

      Proc.Int.Conf.on Electronic Speech Signal Processing

      Volume: 1 Pages: 273-280

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A modulation-demodulation model of speech communication (2010)

    • Author(s)
      N.Minematsu
    • Journal Title

      Proc.Int.Conf.Speech Prosody

      Volume: 1(CD-ROM)

    • Related Report
      2010 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A study of Hidden Structure Model and its application to labeling sequences (2009)

    • Author(s)
      Y.Qiao, M.Suzuki, N.Minematsu
    • Journal Title

      Proc.Int.Workshop on Automatic Speech Recognition and Understanding

      Pages: 118-123

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Sub-structure-based estimation of pronunciation proficiency and classification of learners (2009)

    • Author(s)
      M.Suzuki, N.Minematsu, D.Luo, K.Hiro
    • Journal Title

      Proc.Int.Workshop on Automatic Speech Recognition and Understanding

      Pages: 574-579

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Implementation of robust speech recognition by simulating infants' speech perception based on the invariant sound shape embedded in utterances (2009)

    • Author(s)
      N.Minematsu, S.Asakawa, Y.Qiao, D.Saito, T.Nishimura
    • Journal Title

      Proc.Speech and Computer

      Pages: 35-40

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] A consideration of ASR based on animal evolution and human development - what should A of ASR stand for (2009)

    • Author(s)
      N.Minematsu
    • Journal Title

      Proc.Int.Workshop on Computational Models of Language Evolution, Acquisition and Processing (CD-ROM)

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] On invariant structural representation for speech recognition: theoretical validation and experimental improvement (2009)

    • Author(s)
      Y.Qiao, S.Asakawa, N.Minematsu, K.Hirose
    • Journal Title

      Proc.INTERSPEECH

      Pages: 3055-3058

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Structural analysis of dialects, sub-dialects, and sub-sub-dialects of Chinese (2009)

    • Author(s)
      X.Ma, A.Nemoto, N.Minematsu, Y.Qiao, K.Hirose
    • Journal Title

      Proc.INTERSPEECH

      Pages: 2219-2222

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Optimal event search using a structural cost function - improvement structure to speech conversion - (2009)

    • Author(s)
      D.Saito, Y.Qiao, N.Minematsu, K.Hirose
    • Journal Title

      Proc.INTERSPEECH

      Pages: 2047-2050

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Improved structure-based automatic estimation of pronunciation proficiency (2009)

    • Author(s)
      M.Suzuki, L.Dean, N.Minematsu, K.Hirose
    • Journal Title

      Proc.ISCA Tutorial and Research Workshop on Speech and Language Technology in Education (CD-ROM)

    • NAID

      110007990634

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Speech structure: a new framework of speech processing inspired from infants' behaviors and animals' behaviors (2009)

    • Author(s)
      N.Minematsu
    • Journal Title

      Proc.National Conference on Man-Machine Speech Communication

      Pages: 504-509

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Structural analysis of Chinese dialect speakers and their automatic classification (2009)

    • Author(s)
      X.Ma, N.Minematsu, A.Nemoto, M.Takazawa, Y.Qiao, K.Hirose
    • Journal Title

      Proc.National Conference on Man-Machine Speech Communication

      Pages: 440-445

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Improvement of structure to speech conversion using iterative optimization (2009)

    • Author(s)
      D.Saito, Y.Qiao, N.Minematsu, K.Hirose
    • Journal Title

      Proc.Speech and Computer

      Pages: 174-179

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Dialect-based speaker classification of Chinese using structural representation of pronunciation (2009)

    • Author(s)
      X.Ma, N.Minematsu, Y.Qiao, K.Hirose, A.Nemoto, F.Shi
    • Journal Title

      Proc.Speech and Computer

      Pages: 350-355

    • Related Report
      2009 Annual Research Report
    • Peer Reviewed
  • [Presentation] Information separation and acoustic modeling: human-like acoustic modeling (2011)

    • Author(s)
      峯松信明
    • Organizer
      Acoustical Society of Japan Spring Meeting
    • Place of Presentation
      Waseda University
    • Year and Date
      2011-03-10
    • Related Report
      2010 Annual Research Report
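  • [Presentation] Physical diversity of voices and their cognitive invariance: similarities between speech recognition technology and autism (2010)

    • Author(s)
      峯松信明, 西村多寿子, 櫻庭京子
    • Organizer
      "Phenomenology of Communication and Rehabilitation" Study Group
    • Place of Presentation
      The University of Tokyo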
    • Year and Date
      2010-10-29
    • Related Report
      2010 Annual Research Report
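  • [Presentation] Encounters with language and how to stage them, as considered by a brain scientist and a speech engineer (2010)

    • Author(s)
      峯松信明, 茂木健一郎
    • Organizer
      Japan Association for Language Education and Technology 50th Anniversary National Conference
    • Place of Presentation
      Yokohama Science Frontier High School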
    • Year and Date
      2010-08-03
    • Related Report
      2010 Annual Research Report
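  • [Presentation] What one notices when observing the physical phenomena of English pronunciation (2010)

    • Author(s)
      峯松信明
    • Organizer
      Japan Association for Language Education and Technology 50th Anniversary National Conference
    • Place of Presentation
      Yokohama Science Frontier High School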
    • Year and Date
      2010-08-03
    • Related Report
      2010 Annual Research Report
  • [Presentation] English pronunciation in the global era and methods for its scientific analysis (2010)

    • Author(s)
      峯松信明
    • Organizer
      JACET Kanto Chapter Convention
    • Place of Presentation
      Toyo Gakuen University
    • Year and Date
      2010-06-27
    • Related Report
      2010 Annual Research Report
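  • [Presentation] Experimental study of acoustic modeling that uses speaker-invariant relational features as acoustic units (2009)

    • Author(s)
      齋藤大輔, 松浦良, 峯松信明, 広瀬啓吉
    • Organizer
      IEICE Technical Committee on Speech
    • Place of Presentation
      The University of Tokyo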
    • Year and Date
      2009-12-21
    • Related Report
      2009 Annual Research Report
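  • [Presentation] Experimental study of speech and language conversion based on structural representation spanning two languages (2009)

    • Author(s)
      見原隆介, 齋藤大輔, 峯松信明, 広瀬啓吉
    • Organizer
      IEICE Technical Committee on Speech
    • Place of Presentation
      Shizuoka University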
    • Year and Date
      2009-11-01
    • Related Report
      2009 Annual Research Report
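  • [Presentation] Improving the accuracy of speech synthesis from structural representation using a structural cost function (2009)

    • Author(s)
      齋藤大輔, 喬宇, 峯松信明, 広瀬啓吉
    • Organizer
      IEICE Technical Committee on Speech
    • Place of Presentation
      Shizuoka University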
    • Year and Date
      2009-11-01
    • Related Report
      2009 Annual Research Report
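  • [Presentation] Experimental study of speech and language conversion based on structural representation spanning two languages (2009)

    • Author(s)
      見原隆介, 齋藤大輔, 峯松信明, 広瀬啓吉
    • Organizer
      Acoustical Society of Japan Autumn Meeting
    • Place of Presentation
      Nihon University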
    • Year and Date
      2009-09-01
    • Related Report
      2009 Annual Research Report
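  • [Presentation] Pronunciation assessment and learner classification robust to speaker differences using pronunciation structure (2009)

    • Author(s)
      鈴木雅之, 羅徳安, 峯松信明, 広瀬啓吉
    • Organizer
      Acoustical Society of Japan Autumn Meeting
    • Place of Presentation
      Nihon University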
    • Year and Date
      2009-09-01
    • Related Report
      2009 Annual Research Report
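  • [Presentation] Experimental study of acoustic models for unknown words that use relations between speech events as acoustic units (2009)

    • Author(s)
      齋藤大輔, 松浦良, 峯松信明, 広瀬啓吉
    • Organizer
      Acoustical Society of Japan Autumn Meeting
    • Place of Presentation
      Nihon University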
    • Year and Date
      2009-09-01
    • Related Report
      2009 Annual Research Report
  • [Presentation] Proposal of Hidden Structure Model (2009)

    • Author(s)
      喬宇, 鈴木雅之, 峯松信明
    • Organizer
      Acoustical Society of Japan Autumn Meeting
    • Place of Presentation
      Nihon University
    • Year and Date
      2009-09-01
    • Related Report
      2009 Annual Research Report
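  • [Presentation] Automatic classification of English pronunciation and automatic estimation of the points needing pronunciation correction, using speech information processing technology (2009)

    • Author(s)
      峯松信明, 山内豊
    • Organizer
      Japan Association for Language Education and Technology National Conference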
    • Place of Presentation
      経済流通大学
    • Year and Date
      2009-08-01
    • Related Report
      2009 Annual Research Report
  • [Presentation] An Investigation of Hidden Structure Model (2009)

    • Author(s)
      喬宇, 鈴木雅之, 峯松信明
    • Organizer
      IPSJ Special Interest Group on Spoken Language Processing
    • Place of Presentation
      Iizaka Onsen, Fukushima Prefecture
    • Year and Date
      2009-07-01
    • Related Report
      2009 Annual Research Report
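  • [Presentation] Improvement of automatic pronunciation assessment using structural representation of speech (2009)

    • Author(s)
      鈴木雅之, 羅徳安, 峯松信明, 広瀬啓吉
    • Organizer
      IPSJ Special Interest Group on Spoken Language Processing
    • Place of Presentation
      Iizaka Onsen, Fukushima Prefecture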
    • Year and Date
      2009-07-01
    • Related Report
      2009 Annual Research Report
  • [Presentation] Structural analysis of Chinese dialects and its experimental application to pronunciation assessment (2009)

    • Author(s)
      X.Ma, N.Minematsu, A.Nemoto, Y.Qiao, K.Hirose
    • Organizer
      IEICE Technical Committee on Speech
    • Place of Presentation
      Iizaka Onsen, Fukushima Prefecture
    • Year and Date
      2009-07-01
    • Related Report
      2009 Annual Research Report
  • [Book] "Development of ERJ (English Read by Japanese) database for CALL research", in Computer processing of Asian spoken languages, edited by S.Itahashi and C.Tseng (2010)

    • Author(s)
      N.Minematsu
    • Total Pages
      5
    • Publisher
      Consideration Books
    • Related Report
      2010 Annual Research Report
  • [Book] "Speech recognition that comes closer to humans" (Nikkei Science, June issue) (2009)

    • Author(s)
      峯松信明
    • Total Pages
      6
    • Publisher
      Nikkei Science
    • Related Report
      2009 Annual Research Report

Published: 2009-04-01   Modified: 2018-03-28  
