Research on Dynamic Visemes of Japanese Speech Utterances
Project/Area Number | 14580431 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Intelligent informatics |
Research Institution | Kanazawa Institute of Technology |
Principal Investigator | HIRAYAMA Makoto J., Kanazawa Institute of Technology, College of Informatics and Human Communication, Associate Professor (70329374) |
Project Period (FY) | 2002 – 2004 |
Project Status | Completed (Fiscal Year 2004) |
Budget Amount | ¥4,100,000 (Direct Cost: ¥4,100,000) |
Fiscal Year 2004: ¥600,000 (Direct Cost: ¥600,000)
Fiscal Year 2003: ¥1,000,000 (Direct Cost: ¥1,000,000)
Fiscal Year 2002: ¥2,500,000 (Direct Cost: ¥2,500,000)
Keywords | Japanese / viseme / visemes / computer graphics animation / articulatory movements / multimedia database / lips / MPEG-4 |
Research Abstract |
To animate realistic lip motions during Japanese speech utterances, high-speed video recordings were made while a human subject uttered short, phonemically balanced Japanese sentences. Two high-speed video cameras captured front and side views of the subject's lips at 300 or 240 frames per second, with resolutions of 200 by 200 or 256 by 256 pixels in 24-bit RGB color. Acoustic signals were recorded at 44.1 kHz in mono. In addition to natural lips, some sentences were recorded with illuminated marker points on the lips or with the lips colored blue, to make it easier to extract lip position parameters. These recordings were compiled into a Japanese viseme video database.

The video images were used to derive computer graphics model parameters for lip points according to the MPEG-4 face and body animation parameters. Fast movements, such as those of plosives like /ba/, were examined frame by frame to reproduce the lip movements precisely. Computer graphics animation was produced by placing these visemes on key frames along the time axis and interpolating the intermediate frames between key frames. Not only static visemes but also dynamic visemes, i.e., sets of lip parameters spanning multiple frames for one viseme, were used, together with nonlinear interpolation. With this method, more realistic animations of lip motions during speech were obtained than with conventional simple interpolation of a relatively small number of visemes.
|
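The report contains no source code. As a rough illustration of the key-frame scheme described above, the following Python sketch places dynamic visemes (lip parameter vectors spanning several frames) at given onset times on the time axis and fills the intermediate frames with a nonlinear ease-in/ease-out blend. The DynamicViseme layout, the smoothstep easing function, and the 30 fps output rate are illustrative assumptions, not details taken from the project.

```python
# A minimal sketch (not the project's actual code) of key-frame viseme
# animation with nonlinear interpolation of lip parameters.
# Parameter layout and the smoothstep easing are illustrative assumptions.

from dataclasses import dataclass
from typing import List
import numpy as np


@dataclass
class DynamicViseme:
    """Lip parameters for several consecutive frames of one viseme."""
    name: str
    times: np.ndarray      # frame times in seconds, relative to viseme onset
    params: np.ndarray     # shape (num_frames, num_lip_parameters)


def smoothstep(u):
    """Nonlinear (ease-in/ease-out) blend weight; one possible choice."""
    return u * u * (3.0 - 2.0 * u)


def animate(visemes: List[DynamicViseme], onsets: List[float],
            fps: float = 30.0) -> np.ndarray:
    """Place dynamic visemes on the time axis at the given onset times and
    fill the frames between key frames by nonlinear interpolation."""
    # Build the key-frame track: (absolute time, lip parameter vector).
    key_t, key_p = [], []
    for v, t0 in zip(visemes, onsets):
        key_t.extend(t0 + v.times)
        key_p.extend(v.params)
    key_t = np.asarray(key_t)
    key_p = np.asarray(key_p)
    order = np.argsort(key_t)
    key_t, key_p = key_t[order], key_p[order]

    # Sample output frames and blend between neighbouring key frames.
    out_t = np.arange(key_t[0], key_t[-1], 1.0 / fps)
    frames = []
    for t in out_t:
        i = np.searchsorted(key_t, t, side="right") - 1
        i = min(i, len(key_t) - 2)
        u = (t - key_t[i]) / (key_t[i + 1] - key_t[i])
        w = smoothstep(np.clip(u, 0.0, 1.0))
        frames.append((1.0 - w) * key_p[i] + w * key_p[i + 1])
    return np.asarray(frames)  # (num_output_frames, num_lip_parameters)
```

In this sketch, a conventional static viseme would correspond to a DynamicViseme with a single frame, whereas a dynamic viseme contributes several key frames of its own, which is what lets fast plosive movements such as /ba/ be reproduced more faithfully than by simple interpolation between single-frame visemes.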