Research on Dynamic Visemes of Japanese Speech Utterances
Project/Area Number | 14580431 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Intelligent informatics |
Research Institution | Kanazawa Institute of Technology |
Principal Investigator | HIRAYAMA Makoto J., Kanazawa Institute of Technology, College of Informatics and Human Communication, Associate Professor (70329374) |
Project Period (FY) | 2002 – 2004 |
Project Status | Completed (Fiscal Year 2004) |
Budget Amount | ¥4,100,000 (Direct Cost: ¥4,100,000) |
Fiscal Year 2004: ¥600,000 (Direct Cost: ¥600,000)
Fiscal Year 2003: ¥1,000,000 (Direct Cost: ¥1,000,000)
Fiscal Year 2002: ¥2,500,000 (Direct Cost: ¥2,500,000)
Keywords | Japanese / viseme / visemes / computer graphics animation / articulatory movements / multimedia database / lips / MPEG-4 |
Research Abstract |
To animate realistic lip motions during Japanese speech utterances, high-speed video recordings were made while a human subject uttered short, phonemically balanced Japanese sentences. Two high-speed video cameras captured front and side views of the subject's lips at 300 or 240 frames per second, with resolutions of 200 by 200 or 256 by 256 pixels in 24-bit RGB color. Acoustic signals were recorded at 44.1 kHz in mono. In addition to natural lips, some sentences were recorded with illuminated marker points on the lips or with the lips colored blue, to make it easier to extract lip position parameters. These recordings were compiled into a Japanese viseme video database.

The video images were used to derive computer graphics model parameters for lip points according to the MPEG-4 face and body animation parameters. Fast movements, such as those of plosives like /ba/, were examined frame by frame to reproduce the lip movements precisely. Computer graphics animation was produced by placing these visemes on key frames along the time axis and interpolating the intermediate frames between key frames. Not only static visemes but also dynamic visemes, i.e., sets of lip parameters spanning multiple frames for one viseme, were used, together with nonlinear interpolation. With this method, more realistic animations of lip motions during speech were obtained than with conventional simple interpolation of a relatively small number of visemes.
|
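The report contains no source code. As a rough illustration of the key-frame scheme described above, the following Python sketch places dynamic visemes (lip parameter vectors spanning several frames) at given onset times on the time axis and fills the intermediate frames with a nonlinear ease-in/ease-out blend. The DynamicViseme layout, the smoothstep easing function, and the 30 fps output rate are illustrative assumptions, not details taken from the project.

```python
# A minimal sketch (not the project's actual code) of key-frame viseme
# animation with nonlinear interpolation of lip parameters.
# Parameter layout and the smoothstep easing are illustrative assumptions.

from dataclasses import dataclass
from typing import List
import numpy as np


@dataclass
class DynamicViseme:
    """Lip parameters for several consecutive frames of one viseme."""
    name: str
    times: np.ndarray      # frame times in seconds, relative to viseme onset
    params: np.ndarray     # shape (num_frames, num_lip_parameters)


def smoothstep(u):
    """Nonlinear (ease-in/ease-out) blend weight; one possible choice."""
    return u * u * (3.0 - 2.0 * u)


def animate(visemes: List[DynamicViseme], onsets: List[float],
            fps: float = 30.0) -> np.ndarray:
    """Place dynamic visemes on the time axis at the given onset times and
    fill the frames between key frames by nonlinear interpolation."""
    # Build the key-frame track: (absolute time, lip parameter vector).
    key_t, key_p = [], []
    for v, t0 in zip(visemes, onsets):
        key_t.extend(t0 + v.times)
        key_p.extend(v.params)
    key_t = np.asarray(key_t)
    key_p = np.asarray(key_p)
    order = np.argsort(key_t)
    key_t, key_p = key_t[order], key_p[order]

    # Sample output frames and blend between neighbouring key frames.
    out_t = np.arange(key_t[0], key_t[-1], 1.0 / fps)
    frames = []
    for t in out_t:
        i = np.searchsorted(key_t, t, side="right") - 1
        i = min(i, len(key_t) - 2)
        u = (t - key_t[i]) / (key_t[i + 1] - key_t[i])
        w = smoothstep(np.clip(u, 0.0, 1.0))
        frames.append((1.0 - w) * key_p[i] + w * key_p[i + 1])
    return np.asarray(frames)  # (num_output_frames, num_lip_parameters)
```

In this sketch, a conventional static viseme would correspond to a DynamicViseme with a single frame, whereas a dynamic viseme contributes several key frames of its own, which is what lets fast plosive movements such as /ba/ be reproduced more faithfully than by simple interpolation between single-frame visemes.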