
Speaker-independent word recognition method in which a group of words can freely and easily be constructed

Research Project

Project/Area Number 06650424
Research Category

Grant-in-Aid for General Scientific Research (C)

Allocation Type Single-year Grants
Research Field Information and communication engineering
Research Institution Kumamoto University

Principal Investigator

WATANABE Akira  Kumamoto University, Faculty of Engineering, Dept. of Electrical Engineering & Computer Science, Professor (50040382)

Co-Investigator (Kenkyū-buntansha) IKEDA Takashi  Kurume National College of Technology, Associate Professor (80222884)
UEDA Yuichi  Kumamoto University, Faculty of Engineering, Dept. of Electrical Engineering & Computer Science, Associate Professor (00141961)
Project Period (FY) 1994 – 1995
Project Status Completed (Fiscal Year 1995)
Budget Amount
¥2,000,000 (Direct Cost: ¥2,000,000)
Fiscal Year 1995: ¥600,000 (Direct Cost: ¥600,000)
Fiscal Year 1994: ¥1,400,000 (Direct Cost: ¥1,400,000)
Keywords speaker-independent / word recognition / input parameters / statistical distance / neural network / phoneme template / word dictionary / similarity distance
Research Abstract

This research investigates how to achieve a practical speaker-independent word recognition method that is easily applicable to any word group. The system consists of a standard phoneme template, a word dictionary independent of the template, computation of a phoneme-distance matrix, and word judgment by DP matching.
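The pipeline described above can be illustrated with a minimal sketch. It assumes the phoneme-distance matrix is given as one mapping per input frame from phoneme symbol to distance, and that DP matching lets each frame either stay in the current phoneme or advance to the next one. The function names, the length normalization, and the toy data are hypothetical and are not taken from the report.

```python
import math


def dp_match(dist, phonemes):
    """Align input frames to a word's phoneme sequence by dynamic programming.

    dist[t][p] is the distance between frame t and the template of phoneme p.
    Returns the cumulative distance of the best alignment divided by the
    number of frames (the normalization is an assumption of this sketch).
    """
    n_frames, n_states = len(dist), len(phonemes)
    if n_frames < n_states:
        return math.inf  # too few frames to visit every phoneme once
    prev = [math.inf] * n_states
    prev[0] = dist[0][phonemes[0]]  # the path must start in the first phoneme
    for t in range(1, n_frames):
        cur = [math.inf] * n_states
        for s, p in enumerate(phonemes):
            best = prev[s]  # stay in the same phoneme
            if s > 0:
                best = min(best, prev[s - 1])  # advance from the previous phoneme
            cur[s] = best + dist[t][p]
        prev = cur
    return prev[-1] / n_frames  # the path must end in the last phoneme


def recognize(dist, dictionary):
    """Return the dictionary word whose phoneme sequence aligns most cheaply."""
    return min(dictionary, key=lambda word: dp_match(dist, dictionary[word]))


# Toy word dictionary: words are plain phoneme strings, independent of the templates.
dictionary = {"aka": ["a", "k", "a"], "aki": ["a", "k", "i"]}

# Toy phoneme-distance matrix for five input frames.
dist = [
    {"a": 0.1, "k": 0.9, "i": 0.8},
    {"a": 0.2, "k": 0.9, "i": 0.7},
    {"a": 0.9, "k": 0.1, "i": 0.8},
    {"a": 0.8, "k": 0.9, "i": 0.1},
    {"a": 0.9, "k": 0.8, "i": 0.2},
]

best_word = recognize(dist, dictionary)
```

Because the dictionary holds only phoneme sequences, a new word group can be set up by editing `dictionary` alone; the phoneme templates behind `dist` stay unchanged.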
To improve recognition rates in this flexible scheme, new compound speech parameters were tested. These parameters, namely mel-band filter bank outputs, normalized formant correlates, and neural network outputs on the manner of articulation and voice sources, may have complementary effects on the improvement. First, from a practical viewpoint, the consonant template was made from the speech data of only one speaker, and Euclidean distance was used. In recognition tests using 3 groups of 30 words uttered by 30 speakers, the compound parameters achieved a recognition rate of 95-96%, whereas mel-filter bank outputs alone or mel-cepstrum parameters alone gave 89-90%. Next, the standard phoneme template was collected from utterances of 20 speakers, and tests were run on 50 words uttered by 30 speakers different from those used for the template. The total distance for the compound parameters is defined as a weighted linear sum of the individual parameter distances. The weights were chosen to maximize phoneme recognition rates, which were measured on phonemes in words uttered by 4 new speakers.
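The weighted linear sum and the weight selection can be sketched as follows. The report does not say how the weights were searched, so the exhaustive grid search here is an assumption, as are the function names and the toy held-out samples.

```python
from itertools import product


def combined_distance(per_param, weights):
    """Total distance as a weighted linear sum of per-parameter distances."""
    return sum(w * d for w, d in zip(weights, per_param))


def phoneme_accuracy(samples, weights):
    """Fraction of held-out phonemes whose nearest template is the true one.

    samples is a list of (true_label, {label: [d_param1, d_param2, ...]}).
    """
    correct = 0
    for true_label, candidates in samples:
        guess = min(
            candidates,
            key=lambda label: combined_distance(candidates[label], weights),
        )
        correct += guess == true_label
    return correct / len(samples)


def pick_weights(samples, grid):
    """Grid search (an assumed procedure) for weights maximizing phoneme accuracy."""
    n_params = len(next(iter(samples[0][1].values())))
    return max(
        product(grid, repeat=n_params),
        key=lambda weights: phoneme_accuracy(samples, weights),
    )


# Toy held-out phonemes with two parameter types per candidate template.
samples = [
    ("a", {"a": [1.0, 0.2], "i": [0.5, 0.9]}),
    ("i", {"a": [0.4, 0.8], "i": [0.9, 0.1]}),
]

best_weights = pick_weights(samples, grid=[0.0, 1.0])
```

In this toy data the first parameter alone misclassifies both phonemes and the second alone classifies both correctly, so the search assigns weight to the second parameter.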
The results of the word recognition tests using two kinds of distances are as follows:
(1) For all combinations of the parameters, the recognition rates with Bayes distance are higher than with Euclidean distance.
(2) When all of the parameters are used, the recognition rate was 96.8% with Bayes distance and 94.7% with Euclidean distance.
Thus, it was concluded that the proposed method is very useful.
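The abstract does not define the Bayes distance. The sketch below assumes one common form: the Gaussian negative log-likelihood (up to a constant) with a diagonal covariance estimated per phoneme template. It shows how a variance-aware distance can rank templates differently from Euclidean distance. All names and numbers are hypothetical.

```python
import math


def euclidean_distance(x, mean):
    """Plain Euclidean distance between a feature vector and a template mean."""
    return math.sqrt(sum((a - m) ** 2 for a, m in zip(x, mean)))


def bayes_distance(x, mean, var):
    """Assumed form: -2 log Gaussian likelihood, constants dropped, diagonal covariance."""
    return sum((a - m) ** 2 / v + math.log(v) for a, m, v in zip(x, mean, var))


# Two toy templates: A varies widely across speakers, B is tightly clustered.
templates = {
    "A": {"mean": [0.0, 0.0], "var": [4.0, 4.0]},
    "B": {"mean": [3.0, 0.0], "var": [0.25, 0.25]},
}
x = [1.6, 0.0]

nearest_euclidean = min(
    templates, key=lambda k: euclidean_distance(x, templates[k]["mean"])
)
nearest_bayes = min(
    templates, key=lambda k: bayes_distance(x, templates[k]["mean"], templates[k]["var"])
)
```

Here the input lies slightly closer to the mean of B, so Euclidean distance selects B, while the variance-aware distance selects A because the input is several standard deviations away from B.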

Report

(3 results)
  • 1995 Annual Research Report   Final Research Report Summary
  • 1994 Annual Research Report


Published: 1994-04-01   Modified: 2016-04-21  
