1996 Fiscal Year Final Research Report Summary

Formulation of Prosodic Features of Speech and its Application to Continuous Speech Recognition

Research Project

Project/Area Number	06452397
Research Category	Grant-in-Aid for Scientific Research (B)
Allocation Type	Single-year Grants
Section	General
Research Field	Intelligent informatics
Research Institution	University of Tokyo
Principal Investigator	HIROSE Keikichi University of Tokyo, Graduate School of Engineering, Dept. of Inf. and Commu. Eng., Professor (50111472)
Co-Investigator(Kenkyū-buntansha)	OHNO Sumio Tokyo Science University, Faculty of Industrial Science and Technology, Dept. of Applied Electronics, Assistant (80256677)
Project Period (FY)	1994 – 1996
Keywords	Prosodic Features / Continuous Speech Recognition / Fundamental Frequency Contour / Superpositional Model / Syntactic Boundary / Accent Type / Statistical Model with Moraic Transition / Bayesian Predictive Classification
Research Abstract	Prosodic feature-based methods were developed for tasks such as word identification and syntactic boundary detection, and were used to aid continuous speech recognition. The major results are as follows.
1. Prosodic rules for read speech were modified and improved. Prosodic rules for dialogue speech were also constructed, based on a comparative study with read speech. The prosodic features that convey speakers' intention, attitude, and emotion were clarified.
2. A method was developed to detect syntactic boundaries in continuous speech using fundamental frequency contours and their macroscopic features.
3. A method was developed to extract phrase component onsets from fundamental frequency contours by suppressing the local undulations due to accent components. By combining this with the analysis-by-synthesis (AbS) method based on the superpositional model, automatic extraction of fundamental frequency contour features was realized and was successfully applied to important-word detection.
4. A method was developed to estimate the feasibility of recognition candidates: a fundamental frequency contour is generated for each candidate and compared with the observed contour. The method was shown to be effective in detecting recognition errors accompanied by changes in accent type and/or syntactic boundaries.
5. A method was developed to model fundamental frequency contours statistically after representing them with several codes in moraic units. The method was shown to detect syntactic boundaries and to recognize accent types effectively.
6. A method was developed to divide the training data of a phoneme HMM into several clusters by inspecting the HMM path of each datum. Arranging a new HMM for each cluster was shown to clearly increase the recognition rate.
7. A robust speech recognition method was developed based on Viterbi Bayesian predictive classification. Its validity was shown in word recognition experiments under noisy conditions, where an improvement of more than 10% over conventional methods was observed.
The methods developed above were then incorporated into a continuous speech recognition system, and their positive effects on recognition were confirmed.
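As an illustration of result 4, the following is a minimal sketch of scoring recognition candidates by comparing generated and observed fundamental frequency contours. It assumes the widely used form of the superpositional (Fujisaki) model, in which log F0 is a baseline plus phrase and accent components. The parameter values (alpha, beta, gamma), the RMS distance, the command timings, and all function names are assumptions chosen for illustration; the report summary does not specify them.

```python
import math

def phrase_response(t, alpha=3.0):
    # Impulse response of the phrase control mechanism:
    # Gp(t) = alpha^2 * t * exp(-alpha * t) for t >= 0, else 0.
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0.0 else 0.0

def accent_response(t, beta=20.0, gamma=0.9):
    # Step response of the accent control mechanism:
    # Ga(t) = min(1 - (1 + beta * t) * exp(-beta * t), gamma) for t >= 0, else 0.
    if t < 0.0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def log_f0(t, fb, phrases, accents):
    # ln F0(t) = ln Fb + sum of phrase components + sum of accent components.
    # phrases: list of (onset_time, magnitude)
    # accents: list of (start_time, end_time, amplitude)
    value = math.log(fb)
    for onset, ap in phrases:
        value += ap * phrase_response(t - onset)
    for start, end, aa in accents:
        value += aa * (accent_response(t - start) - accent_response(t - end))
    return value

def contour(times, fb, phrases, accents):
    return [log_f0(t, fb, phrases, accents) for t in times]

def rms_distance(a, b):
    # Assumed distance measure between two log-F0 contours.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical example: two candidates that differ only in accent placement.
times = [0.01 * i for i in range(200)]
phrases = [(0.0, 0.5)]
candidates = {
    "early accent": [(0.2, 0.6, 0.4)],
    "late accent": [(0.8, 1.2, 0.4)],
}
# Stand-in for a measured contour; here it is synthesized from one candidate.
observed = contour(times, 120.0, phrases, candidates["late accent"])
scores = {
    name: rms_distance(observed, contour(times, 120.0, phrases, accents))
    for name, accents in candidates.items()
}
best = min(scores, key=scores.get)
```

In this toy setup the candidate whose accent commands match the observed contour gets the smallest distance; with real speech the observed contour would come from pitch extraction, and the scoring would need to handle unvoiced frames.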

Research Products
All Publications (47 results)

[Publications] 広瀬啓吉: "Analysis and synthesis of fundamental frequency contours for the spoken dialogue in Japanese" Proc.2nd ESCA/IEEE Workshop on Speech Synthesis. 167-170 (1994)
- Description
  From the Research Report Summary (Japanese)
[Publications] 峯松信明: "Speech recognition using HMM with decreased intra-group variation in the temporal structure" Proc.International Conference on Spoken Language Processing. 1. 187-190 (1994)
- Description
  From the Research Report Summary (Japanese)
[Publications] 胡新揮: "Recognition of Chinese tones in monosyllabic and disyllabic speech using HMM" Proc.International Conference on Spoken Language Processing. 1. 203-206 (1994)
- Description
  From the Research Report Summary (Japanese)
[Publications] 藤崎博也: "Prosodic characteristics of spoken dialogues for information query" Proc.International Conference on Spoken Language Processing. 3. 1103-1106 (1994)
- Description
  From the Research Report Summary (Japanese)
[Publications] 広瀬啓吉: "Use of prosodic features in the recognition of continuous speech" Proc.International Conference on Spoken Language Processing. 3. 1123-1126 (1994)
- Description
  From the Research Report Summary (Japanese)
[Publications] 大野澄雄: "A method for word spotting in continuous speech using both segmental and contextual likelihood scores" Proc.International Conference on Spoken Language Processing. 4. 2199-2202 (1994)
- Description
  From the Research Report Summary (Japanese)
[Publications] 峯松信明: "Duration modeling with decreased intra-group temporal variation for HMM-based phoneme recognition" IEICE Trans.Information and Systems. E78-D. 654-661 (1995)
- Description
  From the Research Report Summary (Japanese)
[Publications] 胡新揮: "Tone recognition of Chinese dissyllables using hidden Markov models" IEICE Trans.Information and Systems. E78-D. 685-691 (1995)
- Description
  From the Research Report Summary (Japanese)
[Publications] 大野澄雄: "A scheme for word detection in continuous speech using likelihood scores of segments modified by their context within a word" IEICE Trans.Information and Systems. E78-D. 725-731 (1995)
- Description
  From the Research Report Summary (Japanese)
[Publications] 広瀬啓吉: "HMM-based recognition of Chinese trisyllables using double codebooks of fundamental frequency and waveform power" Proc.European Conference on Speech Communication and Technology. 1. 31-34 (1995)
- Description
  From the Research Report Summary (Japanese)
[Publications] 阪田真弓: "Analysis and synthesis of prosodic features in spoken dialogue of Japanese" Proc.European Conference on Speech Communication and Technology. 2. 1007-1010 (1995)
- Description
  From the Research Report Summary (Japanese)
[Publications] 広瀬啓吉: "Detection of syntactic boundaries by partial analysis-by-synthesis of fundamental frequency contours" Proc.IEEE International Conference on Acoustics,Speech,& Signal Processing. 2. 809-812 (1996)
- Description
  From the Research Report Summary (Japanese)
[Publications] 広瀬啓吉: "Synthesizing dialogue speech of Japanese based on the quantitative analysis of prosodic features" Proc.International Conference on Spoken Language Processing. 1. 378-381 (1996)
- Description
  From the Research Report Summary (Japanese)
[Publications] 桜井淳宏: "Detection of phrase boundaries in Japanese by low-pass filtering of fundamental frequency contours" Proc.International Conference on Spoken Language Processing. 2. 817-820 (1996)
- Description
  From the Research Report Summary (Japanese)
[Publications] 張勁松: "Adaptive recognition method based on posterior use of distribution pattern of output probabilities" Proc.International Conference on Spoken Language Processing. 2. 1129-1132 (1996)
- Description
  From the Research Report Summary (Japanese)
[Publications] 広瀬啓吉: "Use of prosodic features in speech recognition (Invited)" Proc. IEEE Invited Workshop on Pattern Recognition for Multimedia Techniques (IEEE Taegu Section). 99-108 (1996)
- Description
  From the Research Report Summary (Japanese)
[Publications] 広瀬啓吉: "Posterior use of prosodic features to aid speech recognition (Invited)" Journal of the Acoustical Society of America. 100・4(Pt.2). 2849- (1996)
- Description
  From the Research Report Summary (Japanese)
[Publications] 広瀬啓吉: "Comparison of prosodic features in dialogue speech and read speech of Japanese" Trans. IEICE. J79-D II. 2154-2162 (1996)
- Description
  From the Research Report Summary (Japanese)
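[Publications] 川波弘道: "A quantitative analysis of prosodic features of dialogue speech and generation of prosodic rules" Record of Autumn Meeting of Acoustical Society of Japan. I (to be presented). (1997)
- Description
  From the Research Report Summary (Japanese)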
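[Publications] 岩野公司: "A method for representing prosodic patterns using HMM" Record of Autumn Meeting of Acoustical Society of Japan. I (to be presented). (1997)
- Description
  From the Research Report Summary (Japanese)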
[Publications] 江輝: "Applying Viterbi Bayesian predictive classification to robust recognition of continuous speech" Record of Autumn Meeting of Acoustical Society of Japan. I (to be presented). (1997)
- Description
  From the Research Report Summary (Japanese)
[Publications] 江輝: "Robust speech recognition based on Viterbi Bayesian predictive classification" Proc.IEEE International Conference on Acoustics,Speech,& Signal Processing. (to be presented). (1997)
- Description
  From the Research Report Summary (Japanese)
[Publications] 広瀬啓吉: "Disambiguating recognition results by prosodic features (Computing Prosody)" Springer-Verlag (eds. Sagisaka, Campbell and Higuchi), 16 (1997)
- Description
  From the Research Report Summary (Japanese)
[Publications] 広瀬啓吉: "Kansei information in speech communication (Science of Kansei)" Science Co. Ltd. (ed. Saburo Tsuji), 5 (1997)
- Description
  From the Research Report Summary (Japanese)
[Publications] Keikichi Hirose, Mayumi Sakata, Masafune Osame and Hiroya Fujisaki: "Analysis and synthesis of fundamental frequency contours for the spoken dialogue in Japanese" Proc. 2nd ESCA/IEEE Workshop on Speech Synthesis. 167-170 (1994)
- Description
  From the Research Report Summary (English)
[Publications] Nobuaki Minematsu and Keikichi Hirose: "Speech recognition using HMM with decreased intra-group variation in the temporal structure" Proc. International Conference on Spoken Language Processing, 1. 187-190 (1994)
- Description
  From the Research Report Summary (English)
[Publications] Xinhui Hu and Keikichi Hirose: "Recognition of Chinese tones in monosyllabic and disyllabic speech using HMM" Proc. International Conference on Spoken Language Processing, 1. 203-206 (1994)
- Description
  From the Research Report Summary (English)
[Publications] Hiroya Fujisaki, Sumio Ohno, Masafune Osame and Keikichi Hirose: "Prosodic characteristics of spoken dialogues for information query" Proc. International Conference on Spoken Language Processing, 3. 1103-1106 (1994)
- Description
  From the Research Report Summary (English)
[Publications] Keikichi Hirose, Atsuhiro Sakurai and Hiroyuki Konno: "Use of prosodic features in the recognition of continuous speech" Proc. International Conference on Spoken Language Processing, 3. 1123-1126 (1994)
- Description
  From the Research Report Summary (English)
[Publications] Sumio Ohno, Hiroya Fujisaki and Keikichi Hirose: "A method for word spotting in continuous speech using both segmental and contextual likelihood scores" Proc. International Conference on Spoken Language Processing, 4. 2199-2202 (1994)
- Description
  From the Research Report Summary (English)
[Publications] Nobuaki Minematsu and Keikichi Hirose: "Duration modeling with decreased intra-group temporal variation for HMM-based phoneme recognition" IEICE Trans. Information and Systems. E78-D-6. 654-661 (1995)
- Description
  From the Research Report Summary (English)
[Publications] Xinhui Hu and Keikichi Hirose: "Tone recognition of Chinese dissyllables using hidden Markov models" IEICE Trans. Information and Systems. E78-D-6. 685-691 (1995)
- Description
  From the Research Report Summary (English)
[Publications] Sumio Ohno, Keikichi Hirose and Hiroya Fujisaki: "A scheme for word detection in continuous speech using likelihood scores of segments modified by their context within a word" IEICE Trans. Information and Systems. E78-D-6. 725-731 (1995)
- Description
  From the Research Report Summary (English)
[Publications] Keikichi Hirose and Xinhui Hu: "HMM-based recognition of Chinese trisyllables using double codebooks of fundamental frequency and waveform power" Proc. European Conference on Speech Communication and Technology, 1. 31-34 (1995)
- Description
  From the Research Report Summary (English)
[Publications] Mayumi Sakata and Keikichi Hirose: "Analysis and synthesis of prosodic features in spoken dialogue of Japanese" Proc. European Conference on Speech Communication and Technology, 2. 1007-1010 (1995)
- Description
  From the Research Report Summary (English)
[Publications] Keikichi Hirose and Atsuhiro Sakurai: "Detection of syntactic boundaries by partial analysis-by-synthesis of fundamental frequency contours" Proc. IEEE International Conference on Acoustics, Speech, & Signal Processing, 2. 809-812 (1996)
- Description
  From the Research Report Summary (English)
[Publications] Keikichi Hirose, Mayumi Sakata and Hiromichi Kawanami: "Synthesizing dialogue speech of Japanese based on the quantitative analysis of prosodic features" Proc. International Conference on Spoken Language Processing, 1. 378-381 (1996)
- Description
  From the Research Report Summary (English)
[Publications] Atsuhiro Sakurai and Keikichi Hirose: "Detection of phrase boundaries in Japanese by low-pass filtering of fundamental frequency contours" Proc. International Conference on Spoken Language Processing, 2. 817-820 (1996)
- Description
  From the Research Report Summary (English)
[Publications] Jinsong Zhang, Beiqian Dai, Changfu Wang, Hingkeung Kwan, Keikichi Hirose: "Adaptive recognition method based on posterior use of distribution pattern of output probabilities" Proc. International Conference on Spoken Language Processing, 2. 1129-1132 (1996)
- Description
  From the Research Report Summary (English)
[Publications] Keikichi Hirose, Kouji Iwano and Atsuhiro Sakurai: "Use of prosodic features in speech recognition (Invited)" Proc. IEEE Invited Workshop on Pattern Recognition for Multimedia Techniques (IEEE Taegu Section). 99-108 (1996)
- Description
  From the Research Report Summary (English)
[Publications] Keikichi Hirose, Kouji Iwano and Atsuhiro Sakurai: "Posterior use of prosodic features to aid speech recognition (Invited)" Journal of the Acoustical Society of America. 100-4 (Pt. 2). 2849 (1996)
- Description
  From the Research Report Summary (English)
[Publications] Keikichi Hirose and Mayumi Sakata: "Comparison of prosodic features in dialogue speech and read speech of Japanese" Trans. IEICE. J79-D II-12. 2154-2162 (1996)
- Description
  From the Research Report Summary (English)
[Publications] Hiromichi Kawanami and Keikichi Hirose: "A quantitative analysis of prosodic features of dialogue speech and generation of prosodic rules" Record of Spring Meeting of Acoustical Society of Japan I. (to be published). (1997)
- Description
  From the Research Report Summary (English)
[Publications] Hui Jiang, Keikichi Hirose and Qiang Huo: "Applying Viterbi Bayesian predictive classification to robust recognition of continuous speech" Record of Spring Meeting of Acoustical Society of Japan I. (to be published). (1997)
- Description
  From the Research Report Summary (English)
[Publications] Hui Jiang, Keikichi Hirose and Qiang Huo: "Robust speech recognition based on Viterbi Bayesian predictive classification" Proc. IEEE International Conference on Acoustics, Speech, & Signal Processing. (to be published). (1997)
- Description
  From the Research Report Summary (English)
[Publications] Keikichi Hirose: "Disambiguating recognition results by prosodic features" Computing Prosody, Springer-Verlag. 327-342 (1997)
- Description
  From the Research Report Summary (English)
[Publications] Keikichi Hirose: "Kansei information in speech communication" Science of Kansei, Science Co. Ltd. 94-98 (1997)
- Description
  From the Research Report Summary (English)
