Proposal of Speech Affordance Based on the Structure Invariance Theorem and Construction of a Speech Recognition System Based on It

Publicly Offered Research

Project Area	Cyber Infrastructure for the Information-explosion Era
Project/Area Number	21013015
Research Category	Grant-in-Aid for Scientific Research on Priority Areas
Allocation Type	Single-year Grants
Review Section	Science and Engineering
Research Institution	The University of Tokyo
Principal Investigator	峯松信明 The University of Tokyo, Graduate School of Information Science and Technology, Associate Professor (90273333)
Project Period (FY)	2009 – 2010
Project Status	Completed (Fiscal Year 2010)
Budget Amount	¥6,900,000 (Direct Cost: ¥6,900,000); Fiscal Year 2010: ¥3,400,000 (Direct Cost: ¥3,400,000); Fiscal Year 2009: ¥3,500,000 (Direct Cost: ¥3,500,000)
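Keywords	speech affordance / structural representation of speech / f-divergence / foreign-language pronunciation assessment / speech recognition / autism / Gestalt perception / structure invariance theorem / transformation invariants / pronunciation proficiency estimation / non-linguistic factors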
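Research Abstract	The timbre and voice quality of speech depend on the size and shape of the speaker's vocal organs, so utterances with identical content still differ as sounds. Because differences in timbre can be treated as mappings of the acoustic space, deriving metrics that are invariant to such mappings and representing speech with those metrics alone makes speech processing robust to arbitrary mappings (that is, to differences between speakers). This research aims to investigate: 1) the derivation of mapping-invariant quantities and a proposal of speech affordance based on them; 2) the construction of a speech recognition system for isolated-word utterances based on speech affordance, and its extension to continuous speech recognition; 3) the construction of a foreign-language pronunciation assessment technique based on speech affordance, and its integration with conventional techniques; and 4) the construction of information-processing models, based on speech affordance, for understanding the behavior of people with severe autism and for infants' vocal imitation. This fiscal year focused on item 3). In foreign-language pronunciation assessment, comparing teacher speech and learner speech directly amounts to judging how well the learner impersonates the teacher's voice. We therefore built a technique that represents pronunciation after removing the voice-quality bias caused by body size and age, and then compares the two representations. We also integrated it with conventional techniques that, like voice-impersonation scoring, rely on the absolute properties of speech. The resulting technique operates robustly against differences in body size and is more accurate than the conventional method even when the mismatch is small. We also put it into practice, for example by using a demonstration system in English pronunciation instruction. These results were well received, and we were invited to give talks at domestic and international conferences on foreign-language learning.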
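The invariance argument in the abstract can be illustrated with a minimal sketch. It is not the project's implementation. It assumes that each speech event is modeled as a Gaussian distribution, that a speaker difference is an invertible affine map `x -> A x + b` of the feature space, and that the distance between events is the Bhattacharyya distance, a monotone function of the Hellinger f-divergence. The names `bhattacharyya` and `structure`, the matrix `A`, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def bhattacharyya(m1, S1, m2, S2):
    """Bhattacharyya distance between Gaussians N(m1, S1) and N(m2, S2)."""
    S = 0.5 * (S1 + S2)
    d = m1 - m2
    mean_term = 0.125 * d @ np.linalg.solve(S, d)
    _, logdet_S = np.linalg.slogdet(S)
    _, logdet_S1 = np.linalg.slogdet(S1)
    _, logdet_S2 = np.linalg.slogdet(S2)
    cov_term = 0.5 * (logdet_S - 0.5 * (logdet_S1 + logdet_S2))
    return mean_term + cov_term


def structure(means, covs):
    """Distance matrix over all pairs of events (a 'structure' of the utterance)."""
    n = len(means)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = bhattacharyya(means[i], covs[i], means[j], covs[j])
    return D


# Hypothetical data: four speech events in a 3-dimensional feature space.
rng = np.random.default_rng(0)
dim, n_events = 3, 4
means = [rng.normal(size=dim) for _ in range(n_events)]
covs = []
for _ in range(n_events):
    M = rng.normal(size=(dim, dim))
    covs.append(M @ M.T + np.eye(dim))  # symmetric positive definite

# Assumed speaker difference: an invertible affine map (det A = 2.7).
A = np.array([[2.0, 0.5, 0.0],
              [-0.3, 1.5, 0.4],
              [0.1, -0.2, 0.8]])
b = np.array([1.0, -2.0, 0.5])
means_mapped = [A @ m + b for m in means]
covs_mapped = [A @ S @ A.T for S in covs]

D_orig = structure(means, covs)
D_mapped = structure(means_mapped, covs_mapped)
print("largest change in any pairwise distance:", np.max(np.abs(D_orig - D_mapped)))
```

Under these assumptions the two distance matrices agree up to floating-point error, whereas the means and covariances themselves change with the map. That is the sense in which a representation built only from such distances is insensitive to the mapping.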

Report

(2 results)

2010 Annual Research Report
2009 Annual Research Report

Research Products
(42 results)

Journal Article (24 results) (of which Peer Reviewed: 24 results) Presentation (16 results) Book (2 results)

[Journal Article] Foreign-language pronunciation assessment using structural representation of speech and multi-stage multiple regression (2011)
- Author(s)
  鈴木雅之, 峯松信明, 広瀬啓吉
- Journal Title
  
  情報処理学会論文誌
  
  Volume: 52 Pages: 1899-1909
- NAID
  110008508020
- Related Report
  2010 Annual Research Report
- Peer Reviewed
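[Journal Article] Proposal of a method for acoustically separating and extracting the linguistic information in speech from its non-linguistic information: a study toward human-like speech information processing (2011)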
- Author(s)
  峯松信明, 櫻庭京子, 西村多寿子, 喬宇, 朝川智, 鈴木雅之, 齋藤大輔
- Journal Title
  
  電子情報通信学会論文誌
  
  Volume: J94-D Pages: 12-26
- NAID
  110008006543
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] English pronunciation in the global era and methods for its scientific analysis (2011)
- Author(s)
  峯松信明
- Journal Title
  
  大学英語教育学会関東支部学会誌
  
  Volume: 7 Pages: 5-14
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Speech Structure and its Application to Robust Speech Processing (2010)
- Author(s)
  N.Minematsu, Y.Qiao, S.Asakawa, M.Suzuki
- Journal Title
  
  Journal of New Generation Computing
  
  Volume: 28 Pages: 299-319
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] A study of invariance of f-divergence and its application to speech recognition (2010)
- Author(s)
  Y.Qiao, N.Minematsu
- Journal Title
  
  IEEE Trans.On Signal Processing
  
  Volume: 58 Pages: 3884-3890
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Dialect-based speaker classification using speaker-invariant dialect features (2010)
- Author(s)
  X.Ma, R.Xu, N.Minematsu, Y.Qiao, K.Hirose, A.Li
- Journal Title
  
  Proc.Int.Symposium on Chinese Spoken Language Processing
  
  Volume: 1 Pages: 171-176
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Human speech model based on information separation and its application to speech processing (2010)
- Author(s)
  N.Minematsu
- Journal Title
  
  Proc.Int.Symposium on Chinese Spoken Language Processing
  
  Volume: 1 Pages: 477-482
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Improved generation of speech from its abstract and structural representation (2010)
- Author(s)
  N.Minematsu, D.Saito, K.Hirose
- Journal Title
  
  Proc.Int.Conf.on Signal Processing
  
  Volume: 1 Pages: 597-600
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Integration of multilayer regression with structure-based pronunciation assessment (2010)
- Author(s)
  M.Suzuki, Y.Qiao, N.Minematsu, K.Hirose
- Journal Title
  
  Proc.INTERSPEECH
  
  Volume: 1 Pages: 586-589
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Pronunciation proficiency estimation based on multilayer regression analysis using speaker-independent structural features (2010)
- Author(s)
  M.Suzuki, Y.Qiao, N.Minematsu, K.Hirose
- Journal Title
  
  Proc.Int.Workshop on Second Language Studies
  
  Volume: 1(CD-ROM)
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] Human speech model based on information separation--collection or separation, that is the question.-- (2010)
- Author(s)
  N.Minematsu
- Journal Title
  
  Proc.Int.Conf.on Electronic Speech Signal Processing
  
  Volume: 1 Pages: 273-280
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] A modulation-demodulation model of speech communication (2010)
- Author(s)
  N.Minematsu
- Journal Title
  
  Proc.Int.Conf.Speech Prosody
  
  Volume: 1(CD-ROM)
- Related Report
  2010 Annual Research Report
- Peer Reviewed
[Journal Article] A study of Hidden Structure Model and its application to labeling sequences (2009)
- Author(s)
  Y.Qiao, M.Suzuki, N.Minematsu
- Journal Title
  
  Proc.Int.Workshop on Automatic Speech Recognition and Understanding
  
  Pages: 118-123
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Sub-structure-based estimation of pronunciation proficiency and classification of learners (2009)
- Author(s)
  M.Suzuki, N.Minematsu, D.Luo, K.Hirose
- Journal Title
  
  Proc.Int.Workshop on Automatic Speech Recognition and Understanding
  
  Pages: 574-579
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Implementation of robust speech recognition by simulating infants' speech perception based on the invariant sound shape embedded in utterances (2009)
- Author(s)
  N.Minematsu, S.Asakawa, Y.Qiao, D.Saito, T.Nishimura
- Journal Title
  
  Proc.Speech and Computer
  
  Pages: 35-40
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] A consideration of ASR based on animal evolution and human development-what should A of ASR stand for (2009)
- Author(s)
  N.Minematsu
- Journal Title
  
  Proc.Int.Workshop on Computational Models of Language Evolution, Acquisition and Processing (CD-ROM)
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] On invariant structural representation for speech recognition: theoretical validation and experimental improvement (2009)
- Author(s)
  Y.Qiao, S.Asakawa, N.Minematsu, K.Hirose
- Journal Title
  
  Proc.INTERSPEECH
  
  Pages: 3055-3058
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Structural analysis of dialects, sub-dialects, and sub-sub-dialects of Chinese (2009)
- Author(s)
  X.Ma, A.Nemoto, N.Minematsu, Y.Qiao, K.Hirose
- Journal Title
  
  Proc.INTERSPEECH
  
  Pages: 2219-2222
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Optimal event search using a structural cost function-improvement structure to speech conversion- (2009)
- Author(s)
  D.Saito, Y.Qiao, N.Minematsu, K.Hirose
- Journal Title
  
  Proc.INTERSPEECH
  
  Pages: 2047-2050
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Improved structure-based automatic estimation of pronunciation proficiency (2009)
- Author(s)
  M.Suzuki, L.Dean, N.Minematsu, K.Hirose
- Journal Title
  
  Proc.ISCA Tutorial and Research Workshop on Speech and Language Technology in Education (CD-ROM)
- NAID
  110007990634
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Speech structure: a new framework of speech processing inspired from infants' behaviors and animals' behaviors (2009)
- Author(s)
  N.Minematsu
- Journal Title
  
  Proc.National Conference on Man-Machine Speech Communication
  
  Pages: 504-509
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Structural analysis of Chinese dialect speakers and their automatic classification (2009)
- Author(s)
  X.Ma, N.Minematsu, A.Nemoto, M.Takazawa, Y.Qiao, K.Hirose
- Journal Title
  
  Proc.National Conference on Man-Machine Speech Communication
  
  Pages: 440-445
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Improvement of structure to speech conversion using iterative optimization (2009)
- Author(s)
  D.Saito, Y.Qiao, N.Minematsu, K.Hirose
- Journal Title
  
  Proc.Speech and Computer
  
  Pages: 174-179
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Journal Article] Dialect-based speaker classification of Chinese using structural representation of pronunciation (2009)
- Author(s)
  X.Ma, N.Minematsu, Y.Qiao, K.Hirose, A.Nemoto, F.Shi
- Journal Title
  
  Proc.Speech and Computer
  
  Pages: 350-355
- Related Report
  2009 Annual Research Report
- Peer Reviewed
[Presentation] Information separation and acoustic modeling: human-like acoustic modeling (2011)
- Author(s)
  峯松信明
- Organizer
  日本音響学会春季全国大会
- Place of Presentation
  早稲田大学
- Year and Date
  2011-03-10
- Related Report
  2010 Annual Research Report
[Presentation] The physical diversity of voices and their cognitive invariance: similarities between speech recognition technology and autism (2010)
- Author(s)
  峯松信明, 西村多寿子, 櫻庭京子
- Organizer
  「コミュニケーションとリハビリテーションの現象学」研究会
- Place of Presentation
  東京大学
- Year and Date
  2010-10-29
- Related Report
  2010 Annual Research Report
[Presentation] Encounters with language and how to stage them, as considered by a brain scientist and a speech engineer (2010)
- Author(s)
  峯松信明, 茂木健一郎
- Organizer
  外国語教育メディア学会50周年記念全国研究大会
- Place of Presentation
  横浜市立横浜サイエンスフロンティア高等学校
- Year and Date
  2010-08-03
- Related Report
  2010 Annual Research Report
[Presentation] What one notices when observing the physical phenomena of English pronunciation (2010)
- Author(s)
  峯松信明
- Organizer
  外国語教育メディア学会50周年記念全国研究大会
- Place of Presentation
  横浜市立横浜サイエンスフロンティア高等学校
- Year and Date
  2010-08-03
- Related Report
  2010 Annual Research Report
[Presentation] English pronunciation in the global era and methods for its scientific analysis (2010)
- Author(s)
  峯松信明
- Organizer
  JACET関東支部大会
- Place of Presentation
  東洋学園大学
- Year and Date
  2010-06-27
- Related Report
  2010 Annual Research Report
[Presentation] An experimental study of acoustic modeling that uses speaker-invariant relational features as acoustic units (2009)
- Author(s)
  齋藤大輔, 松浦良, 峯松信明, 広瀬啓吉
- Organizer
  電子情報通信学会音声研究会
- Place of Presentation
  東京大学
- Year and Date
  2009-12-21
- Related Report
  2009 Annual Research Report
[Presentation] An experimental study of speech and language conversion based on structural representations spanning two languages (2009)
- Author(s)
  見原隆介, 齋藤大輔, 峯松信明, 広瀬啓吉
- Organizer
  電子情報通信学会音声研究会
- Place of Presentation
  静岡大学
- Year and Date
  2009-11-01
- Related Report
  2009 Annual Research Report
[Presentation] Improving the accuracy of speech synthesis from structural representations using a structural evaluation function (2009)
- Author(s)
  齋藤大輔, 喬宇, 峯松信明, 広瀬啓吉
- Organizer
  電子情報通信学会音声研究会
- Place of Presentation
  静岡大学
- Year and Date
  2009-11-01
- Related Report
  2009 Annual Research Report
[Presentation] An experimental study of speech and language conversion based on structural representations spanning two languages (2009)
- Author(s)
  見原隆介, 齋藤大輔, 峯松信明, 広瀬啓吉
- Organizer
  日本音響学会秋季全国大会
- Place of Presentation
  日本大学
- Year and Date
  2009-09-01
- Related Report
  2009 Annual Research Report
[Presentation] Pronunciation assessment and learner classification robust to speaker differences using pronunciation structure (2009)
- Author(s)
  鈴木雅之, 羅徳安, 峯松信明, 広瀬啓吉
- Organizer
  日本音響学会秋季全国大会
- Place of Presentation
  日本大学
- Year and Date
  2009-09-01
- Related Report
  2009 Annual Research Report
[Presentation] An experimental study of acoustic models for unknown words that use the relations between speech events as acoustic units (2009)
- Author(s)
  齋藤大輔, 松浦良, 峯松信明, 広瀬啓吉
- Organizer
  日本音響学会秋季全国大会
- Place of Presentation
  日本大学
- Year and Date
  2009-09-01
- Related Report
  2009 Annual Research Report
[Presentation] Proposal of Hidden Structure Model (2009)
- Author(s)
  喬宇, 鈴木雅之, 峯松信明
- Organizer
  日本音響学会秋季全国大会
- Place of Presentation
  日本大学
- Year and Date
  2009-09-01
- Related Report
  2009 Annual Research Report
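[Presentation] Automatic classification of English pronunciation and automatic estimation of the parts of pronunciation that need correction, using speech information processing technology (2009)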
- Author(s)
  峯松信明, 山内豊
- Organizer
  外国語教育メディア学会全国研究大会
- Place of Presentation
  経済流通大学
- Year and Date
  2009-08-01
- Related Report
  2009 Annual Research Report
[Presentation] An Investigation of Hidden Structure Model (2009)
- Author(s)
  喬宇, 鈴木雅之, 峯松信明
- Organizer
  情報処理学会音声言語情報処理研究会
- Place of Presentation
  福島県飯坂温泉
- Year and Date
  2009-07-01
- Related Report
  2009 Annual Research Report
[Presentation] Improvement of an automatic pronunciation assessment method using structural representation of speech (2009)
- Author(s)
  鈴木雅之, 羅徳安, 峯松信明, 広瀬啓吉
- Organizer
  情報処理学会音声言語情報処理研究会
- Place of Presentation
  福島県飯坂温泉
- Year and Date
  2009-07-01
- Related Report
  2009 Annual Research Report
[Presentation] Structural analysis of Chinese dialects and its experimental application to pronunciation assessment (2009)
- Author(s)
  X.Ma, N.Minematsu, A.Nemoto, Y.Qiao, K.Hirose
- Organizer
  電子情報通信学会音声研究会
- Place of Presentation
  福島県飯坂温泉
- Year and Date
  2009-07-01
- Related Report
  2009 Annual Research Report
[Book] Development of ERJ (English Read by Japanese) database for CALL research, in Computer processing of Asian spoken languages, edited by S.Itahashi and C.Tseng (2010)
- Author(s)
  N.Minematsu
- Total Pages
  5
- Publisher
  Consideration Books
- Related Report
  2010 Annual Research Report
[Book] "Speech recognition that comes closer to humans" (日経サイエンス, June issue) (2009)
- Author(s)
  峯松信明
- Total Pages
  6
- Publisher
  日経サイエンス
- Related Report
  2009 Annual Research Report
