Visual speech recognition using ultrasound tongue and video lip/face images

Research Project

Project/Area Number	23520467
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	一般
Research Field	Linguistics
Research Institution	The University of Aizu
Principal Investigator	WILSON Ian 会津大学, コンピュータ理工学部, 教授 (50444930)
Project Period (FY)	2011 – 2013
Project Status	Completed (Fiscal Year 2013)
Budget Amount *help	¥4,810,000 (Direct Cost: ¥3,700,000、Indirect Cost: ¥1,110,000) Fiscal Year 2013: ¥1,300,000 (Direct Cost: ¥1,000,000、Indirect Cost: ¥300,000) Fiscal Year 2012: ¥1,300,000 (Direct Cost: ¥1,000,000、Indirect Cost: ¥300,000) Fiscal Year 2011: ¥2,210,000 (Direct Cost: ¥1,700,000、Indirect Cost: ¥510,000)
Keywords	ultrasound / video / tongue / articulation / jaw / 超音波 / ビデオ / 舌 / 調音 / acoustics / optical flow / pixel-wise / computer lipreading / stress / pitch
Research Abstract	There are three main results of our research: (1) Related to video data collection of jaw movement, when measuring the amount of skin stretching over the mandible for the vowel in a CVC syllable, the onset consonant (but not the coda consonant) has a significant effect. (2) Related to ultrasound data collection of tongue position when speaking English, native (L1) speakers rest their tongue in a more efficient location (closer to the median position for English speech sounds) than Japanese (L2) speakers do. (3) Related to our focus on how best to construct and interpret a feature space we call MUTIS (midsagittal ultrasound tongue image space), results indicated that higher dimensions of MUTIS are most effective for identifying people, and that primarily the lower dimensions of VSS (vocal sound space) data are most effective for identifying phonemes. Trajectories within the VSS data indicate clear differences between L1 and L2 speakers, but not within the MUTIS data alone.

Report

(4 results)

2013 Annual Research Report Final Research Report ( PDF )
2012 Research-status Report
2011 Research-status Report

Research Products
(22 results)

All 2013 2012 Other

All Journal Article (11 results) (of which Peer Reviewed: 1 results) Presentation (8 results) Remarks (3 results)

[Journal Article] Effect of syllable onset, coda, and nucleus on degree of skin stretching over the mandible.2013
- Author(s)
  Wilson, I. and Erickson, D.（連携研究者）
- Journal Title
  
  Proc. of International Congress of Acoustics.
  
  Volume: 19 Pages: 060269-060269
- DOI
  10.1121/1.4799467
- Related Report
  2013 Annual Research Report 2013 Final Research Report
[Journal Article] Normalization and matching routine for comparing first and second language tongue trajectories2013
- Author(s)
  Moriya, S., Y.Yaguchi, N.Terunuma, T.Sato, & I.Wilson
- Journal Title
  
  Journal of the Acoustical Society of America
  
  Volume: vol.134, No.5, Pt.2 Issue: 5_Supplement Pages: 4244-4244
- DOI
  10.1121/1.4831607
- NAID
  110009889298
- Related Report
  2013 Annual Research Report 2013 Final Research Report
[Journal Article] Coarticulatory effects of lateral tongue bracing in first and second language English speakers.2013
- Author(s)
  Kanada, S., Wilson, I., Gick, B., Erickson, D.（連携研究者）
- Journal Title
  
  Acoustical Society of America, Fall Meeting, San Francisco
  
  Volume: 134 (5) Issue: 5_Supplement Pages: 4244-4244
- DOI
  10.1121/1.4831608
- Related Report
  2013 Annual Research Report 2013 Final Research Report
- Peer Reviewed
[Journal Article] 舌特徴空間における言語学習者の違いを比較するための正規化とマッチング手法2013
- Author(s)
  Moriya, S., Y. Yaguchi, N. Terunuma, T. Sato, & I. Wilson
- Journal Title
  
  IEICE Technical Report
  
  Volume: vol.113, No.308, SP2013-80 Pages: 53-57
- Related Report
  2013 Final Research Report
[Journal Article] 舌特徴空間における言語学習者の違いを比較するための正規化とマッチング手法2013
- Author(s)
  S. Moriya, Y. Yaguchi, N. Terunuma, T. Sato, I. Wilson
- Journal Title
  
  IEICE Technical Report
  
  Volume: 113 Pages: 53-57
- Related Report
  2013 Annual Research Report
[Journal Article] Articulating rhythm in L1 and L2 English : Focus on jaw and F02012
- Author(s)
  Wilson, I., D.Erickson, & N.Horiguchi
- Journal Title
  
  Proceedings of the 2012 Autumn Meeting of the Acoustical Society of Japan (ASJ)
  
  Pages: 319-322
- Related Report
  2013 Final Research Report
[Journal Article] Finding phoneme trajectories in a feature space of sound and midsagittal ultrasound tongue images2012
- Author(s)
  Yaguchi, Y., N.Horiguchi, & I.Wilson
- Journal Title
  
  In IEEE Proceedings of the 4th International Conference on Awareness Science and Technology (iCAST 2012)
  
  Pages: 156-162
- DOI
  10.1109/icawst.2012.6469606
- Related Report
  2013 Final Research Report 2012 Research-status Report
[Journal Article] Video recordings of L1 and L2 jaw movement: Effect of syllable onset on jaw opening during syllable nucleus2012
- Author(s)
  Y. Abe, I. Wilson, D. Erickson
- Journal Title
  
  Journal of the Acoustical Society of America
  
  Volume: 132 Issue: 3_Supplement Pages: 2005-2005
- DOI
  10.1121/1.4755428
- Related Report
  2013 Final Research Report 2012 Research-status Report
[Journal Article] Pitch and intensity in the speech of Japanese speakers of English: Comparison with L1 speakers2012
- Author(s)
  J. Okada, I. Wilson, M. Yoshizawa
- Journal Title
  
  Journal of the Acoustical Society of America
  
  Volume: 132 Issue: 3_Supplement Pages: 2004-2004
- DOI
  10.1121/1.4755421
- Related Report
  2013 Final Research Report 2012 Research-status Report
[Journal Article] Comparing L1 and L2 phoneme trajectories in a feature space of sound and midsagittal ultrasound tongue images2012
- Author(s)
  Sano, K., Y.Yaguchi, & I.Wilson
- Journal Title
  
  Journal of the Acoustical Society of America
  
  Volume: vol.132, No.3, Pt.2 Issue: 3_Supplement Pages: 1934-1934
- DOI
  10.1121/1.4755107
- Related Report
  2013 Final Research Report 2012 Research-status Report
[Journal Article] Articulating rhythm in L1 and L2 English: Focus on jaw and F02012
- Author(s)
  I. Wilson, D. Erickson, N. Horiguchi
- Journal Title
  
  Proceedings of the 2012 Autumn Meeting of the Acoustical Society of Japan
  
  Volume: 2012 Pages: 319-322
- Related Report
  2012 Research-status Report
[Presentation] Lateral tongue bracing in Japanese and English2013
- Author(s)
  Wilson, I., J.Villegas, & T.Doi
- Organizer
  Paper presented at Ultrafest VI
- Place of Presentation
  Edinburgh, Scotland
- Year and Date
  2013-11-08
- Related Report
  2013 Final Research Report
[Presentation] Articulatory and laryngeal contributions to rhythm in English2013
- Author(s)
  Erickson, D. & I.Wilson
- Organizer
  the Joint Research Meeting of the Dept. of Linguistic Theory and Structur
- Place of Presentation
  NINJAL, Tokyo, Japan (Poster presented)
- Year and Date
  2013-03-02
- Related Report
  2013 Final Research Report
[Presentation] Lateral tongue bracing in Japanese and English2013
- Author(s)
  I. Wilson, J. Villegas, T. Doi
- Organizer
  Ultrafest VI
- Place of Presentation
  Edinburgh, Scotland
- Related Report
  2013 Annual Research Report
[Presentation] Articulatory and laryngeal contributions to rhythm in English2013
- Author(s)
  D. Erickson, I. Wilson
- Organizer
  Joint Research Meeting of the Dept. of Linguistic Theory and Structure, NINJAL
- Place of Presentation
  NINJAL (Tachikawa, Tokyo)
- Related Report
  2012 Research-status Report
[Presentation] How accurately people follow articulation instructions2012
- Author(s)
  Wilson, I. & N.Horiguchi
- Organizer
  Paper presented at the 4th Pronunciation in Second Language Learning and Teaching conference (PSLLT 2012)
- Place of Presentation
  Vancouver, Canada
- Year and Date
  2012-08-24
- Related Report
  2013 Final Research Report
[Presentation] 発音習得のための超音波舌画像に対する音素片マッピング [Mapping phonemes to midsagittal tongue images for pronunciation learning]2012
- Author(s)
  Yaguchi, Y., N.Horiguchi, & I.Wilson
- Organizer
  the joint meeting of the Technical Committees for Pattern Recognition and Media Understanding (PRMU) and Signal Processing (SP) of the Institute of Electronics, Information and Communication Engineers (IEICE)
- Place of Presentation
  Sendai, Japan (Paper presented)
- Year and Date
  2012-02-10
- Related Report
  2013 Final Research Report
[Presentation] How accurately people follow articulation instructions2012
- Author(s)
  I. Wilson, N. Horiguchi
- Organizer
  4th Pronunciation in Second Language Learning and Teaching Conference (PSLLT)
- Place of Presentation
  SFU (Vancouver, Canada)
- Related Report
  2012 Research-status Report
[Presentation] 発音習得のための超音波舌画像に対する音素片マッピング
- Author(s)
  矢口勇一、堀口尚哉、イアン・ウィルソン
- Organizer
  社団法人電子情報通信学会　信学技報
- Place of Presentation
  東北大学
- Related Report
  2011 Research-status Report
[Remarks]
- URL
  http://clrlab1.u-aizu.ac.jp/index_j.html
- Related Report
  2013 Final Research Report
[Remarks] CLR Phonetics Lab
- URL
  http://clrlab1.u-aizu.ac.jp
- Related Report
  2013 Annual Research Report
[Remarks]
- URL
  http://clrlab1.u-aizu.ac.jp/index_j.html
- Related Report
  2011 Research-status Report

Visual speech recognition using ultrasound tongue and video lip/face images

Principal Investigator

WILSON Ian 会津大学, コンピュータ理工学部, 教授 (50444930)

¥4,810,000 (Direct Cost: ¥3,700,000、Indirect Cost: ¥1,110,000)

Report

Research Products

[Journal Article] Effect of syllable onset, coda, and nucleus on degree of skin stretching over the mandible.2013

Author(s)

Journal Title

DOI

Related Report

[Journal Article] Normalization and matching routine for comparing first and second language tongue trajectories2013

Author(s)

Journal Title

DOI

NAID

Related Report

[Journal Article] Coarticulatory effects of lateral tongue bracing in first and second language English speakers.2013

Author(s)

Journal Title

DOI

Related Report

[Journal Article] 舌特徴空間における言語学習者の違いを比較するための正規化とマッチング手法2013

Author(s)

Journal Title

Related Report

[Journal Article] 舌特徴空間における言語学習者の違いを比較するための正規化とマッチング手法2013

Author(s)

Journal Title

Related Report

[Journal Article] Articulating rhythm in L1 and L2 English : Focus on jaw and F02012

Author(s)

Journal Title

Related Report

[Journal Article] Finding phoneme trajectories in a feature space of sound and midsagittal ultrasound tongue images2012

Author(s)

Journal Title

DOI

Related Report

[Journal Article] Video recordings of L1 and L2 jaw movement: Effect of syllable onset on jaw opening during syllable nucleus2012

Author(s)

Journal Title

DOI

Related Report

[Journal Article] Pitch and intensity in the speech of Japanese speakers of English: Comparison with L1 speakers2012

Author(s)

Journal Title

DOI

Related Report

[Journal Article] Comparing L1 and L2 phoneme trajectories in a feature space of sound and midsagittal ultrasound tongue images2012

Author(s)

Journal Title

DOI

Related Report

[Journal Article] Articulating rhythm in L1 and L2 English: Focus on jaw and F02012

Author(s)

Journal Title

Related Report

[Presentation] Lateral tongue bracing in Japanese and English2013

Author(s)

Organizer

Place of Presentation

Year and Date

Related Report

[Presentation] Articulatory and laryngeal contributions to rhythm in English2013

Author(s)

Organizer

Place of Presentation

Year and Date

Related Report

[Presentation] Lateral tongue bracing in Japanese and English2013

Author(s)

Organizer

Place of Presentation

Related Report

[Presentation] Articulatory and laryngeal contributions to rhythm in English2013

Author(s)

Organizer

Place of Presentation

Related Report