IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Special Section on Recent Advances in Machine Learning for Spoken Language Processing
Re-Ranking Approach of Spoken Term Detection Using Conditional Random Fields-Based Triphone Detection
Naoki SAWADA, Hiromitsu NISHIZAKI
JOURNAL FREE ACCESS

2016 Volume E99.D Issue 10 Pages 2518-2527

Abstract

This study proposes a two-pass spoken term detection (STD) method. The first pass performs STD using phoneme-based dynamic time warping (DTW), and the second pass recomputes the detection scores produced by the first pass using conditional random fields (CRF)-based triphone detectors. In the second pass, we treat STD as a sequence labeling problem. We use CRF-based triphone detection models built on features generated from multiple types of phoneme-based transcriptions. The models learn recognition error patterns, such as phoneme-to-phoneme confusions, within the CRF framework. Consequently, the models can detect each triphone that makes up a query term and assign it a detection probability. In experiments on two test collections, the CRF-based approach worked well for re-ranking the DTW-based detections: it improved the F-measure by 2.1% and 2.0% absolute on the two collections, respectively.
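The two-pass idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 0/1 phoneme mismatch cost, the linear interpolation in `rerank_score`, the `weight` parameter, and all function names are assumptions made for illustration, and the triphone probabilities in the usage note are toy placeholders standing in for the output of trained CRF detectors.

```python
from typing import List, Sequence, Tuple

Candidate = Tuple[str, float, List[float]]  # (id, first-pass score, triphone probs)


def triphones(phonemes: Sequence[str]) -> List[Tuple[str, ...]]:
    """Split a phoneme sequence into its overlapping triphones."""
    return [tuple(phonemes[i:i + 3]) for i in range(len(phonemes) - 2)]


def subsequence_dtw(query: Sequence[str],
                    transcript: Sequence[str]) -> Tuple[float, int]:
    """Simplified first pass: align the query against any span of the
    transcript using an assumed 0/1 phoneme mismatch cost.
    Returns (cost normalized by query length, end position of best match)."""
    n, m = len(query), len(transcript)
    inf = float("inf")
    # cost[i][j]: best cost of aligning query[:i] so that it ends at transcript[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0] = [0.0] * (m + 1)  # a match may start anywhere in the transcript
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = 0.0 if query[i - 1] == transcript[j - 1] else 1.0
            cost[i][j] = local + min(cost[i - 1][j - 1],
                                     cost[i - 1][j],
                                     cost[i][j - 1])
    end = min(range(1, m + 1), key=lambda j: cost[n][j])
    return cost[n][end] / n, end


def rerank_score(first_pass_score: float,
                 triphone_probs: Sequence[float],
                 weight: float = 0.5) -> float:
    """Assumed second-pass score: interpolate the first-pass score with the
    mean detection probability of the query's triphones."""
    if not triphone_probs:
        return first_pass_score
    mean_prob = sum(triphone_probs) / len(triphone_probs)
    return weight * first_pass_score + (1.0 - weight) * mean_prob


def rerank(candidates: Sequence[Candidate], weight: float = 0.5) -> List[Candidate]:
    """Sort first-pass candidates by the recomputed score, best first."""
    return sorted(candidates,
                  key=lambda c: rerank_score(c[1], c[2], weight),
                  reverse=True)
```

For example, with the toy candidates `[("A", 0.9, [0.2, 0.3]), ("B", 0.8, [0.9, 0.9])]`, candidate A leads after the first pass, but `rerank` places B first because its triphones are detected with higher probability. This is the kind of reordering the second pass is meant to achieve.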

© 2016 The Institute of Electronics, Information and Communication Engineers