2005 Fiscal Year Final Research Report Summary
Universal-Phonetic-Segment-Based Speech Coding and Its Applications to Speech Processing
Project/Area Number |
15300026
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Media informatics/Database
|
Research Institution | University of Tsukuba |
Principal Investigator |
TANAKA Kazuyo University of Tsukuba, Graduate School of Library, Information and Media Studies, Professor (70344207)
|
Co-Investigator(Kenkyū-buntansha) |
ITOH Yoshiaki Iwate Prefectural University, Faculty of Software and Information Science, Associate Professor (90325928)
OKAWA Shigeki Chiba Institute of Technology, Dept. of Information and Network Science, Associate Professor (40306395)
KOJIMA Hiroaki National Institute of Advanced Industrial Science and Technology, Information Technology Research Institute, Research Group Leader (80356980)
|
Project Period (FY) |
2003 – 2005
|
Keywords | speech recognition / spoken document retrieval / phonetic code / IPA / Dynamic Programming / phone model / multilingual / open vocabulary |
Research Abstract |
In this project, we present a novel speech processing framework in which all acoustic speech samples are first encoded into universal phonetic segment (UPS) sequences, and spoken document processing (SDP) systems, such as recognition, retrieval, and indexing, are constructed in this UPS domain. Under this framework, the SDP systems are decoupled from the original acoustic correlates or environments. This provides the flexibility to perform recognition-type processing simply by calculating distances between UPS sequences, and to build such systems on distributed processing schemes. Within this framework, we developed the following component techniques: 1) an original fine sub-phonetic segment (SPS) set used as the UPS set, which yielded high recognition performance and made multilingual speech easy to process; 2) effective DP (dynamic programming)-based sequence matching algorithms, called Shift CDP and Relay CDP. The effectiveness of the processing framework, the SPS set, and the DP-based algorithms was evaluated by constructing speech recognition and open-vocabulary spoken document retrieval (SDR) systems. Experimental results showed that the proposed SDP systems outperformed those based on conventional methods. Finally, we constructed a real-time open-vocabulary SDR system for demonstration, which retrieves broadcast video from a user's spoken query.
|
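The abstract's idea of doing recognition-type processing by calculating distances between symbol sequences can be illustrated with a minimal sketch. This is a generic edit-distance dynamic program, not the project's Shift CDP or Relay CDP algorithms, whose details are not given in this summary. The function names `sequence_distance` and `spot_query`, the unit substitution/insertion/deletion costs, and the placeholder symbols are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def sequence_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Plain Levenshtein distance between two symbol sequences.

    Assumption: every substitution, insertion, and deletion costs 1.
    """
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def spot_query(query: Sequence[str], document: Sequence[str]) -> Tuple[int, int]:
    """Find the best approximate occurrence of `query` inside `document`.

    Returns (distance, end), where `end` is the document position just
    past the best-matching region. The first DP row is all zeros, so a
    match may start anywhere in the document at no cost.
    """
    prev = [0] * (len(document) + 1)
    for i, q in enumerate(query, start=1):
        curr = [i]
        for j, d in enumerate(document, start=1):
            cost = 0 if q == d else 1
            curr.append(min(prev[j] + 1,
                            curr[j - 1] + 1,
                            prev[j - 1] + cost))
        prev = curr
    best_end = min(range(len(prev)), key=prev.__getitem__)
    return prev[best_end], best_end


if __name__ == "__main__":
    # Hypothetical placeholder symbols standing in for phonetic segment codes.
    doc = ["s1", "s4", "s2", "s7", "s3", "s9"]
    print(sequence_distance(["s2", "s7", "s3"], ["s2", "s8", "s3"]))
    print(spot_query(["s2", "s7", "s3"], doc))
```

In this sketch, a spoken query and the stored documents would both be reduced to symbol sequences beforehand, so the matching step never touches the acoustic signal. A practical system would likely replace the unit costs with distances between segment models.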
Research Products
(25 results)
-
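[Book] Speech Engineering (音声工学), 2005
Author(s)
ITAHASHI Shuichi (板橋 秀一)
Total Pages
244
Publisher
Morikita Publishing (森北出版)
Description
From the "Summary of Research Results Report (Japanese)"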