Budget Amount |
¥3,770,000 (Direct Cost: ¥2,900,000, Indirect Cost: ¥870,000)
Fiscal Year 2013: ¥1,820,000 (Direct Cost: ¥1,400,000, Indirect Cost: ¥420,000)
Fiscal Year 2012: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
|
Research Abstract |
Statistical language models are a fundamental component of speech recognition systems, machine translation systems, and similar applications. At present, the n-gram language model is the most widely used approach. It focuses on sequences of adjacent words and uses the probabilities of these sequences as model parameters. Because the n-gram language model is fully lexicalized, it models local features of word sequences well. However, it cannot capture medium- or long-range features, because it treats a sentence as a flat string and ignores its structure. In this research, we proposed a generative dependency n-gram language model that integrates a generative dependency structure of a sentence into the original n-gram language model. Using an expectation-maximization (EM) algorithm, the probabilities of dependency n-grams of arbitrary order can be estimated by considering all possible dependency structures of a sentence.
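The EM estimation over all possible dependency structures can be illustrated with a minimal sketch. This is not the authors' implementation: it restricts the model to head-to-dependent bigrams (the paper handles arbitrary-order dependency n-grams), enumerates candidate trees by brute force rather than dynamic programming, and the `ROOT` symbol, smoothing constant, and function names are assumptions of the sketch.

```python
import itertools
from collections import defaultdict

ROOT = "ROOT"  # virtual root symbol (an assumption of this sketch)

def enumerate_trees(n):
    """Enumerate all single-rooted, acyclic dependency trees over words
    0..n-1; heads[i] is the index of word i's head, or -1 for the root."""
    trees = []
    choices = [[-1] + [j for j in range(n) if j != i] for i in range(n)]
    for heads in itertools.product(*choices):
        if heads.count(-1) != 1:           # exactly one word attaches to ROOT
            continue
        ok = True
        for i in range(n):                 # every word must reach ROOT (no cycle)
            seen, j = set(), i
            while j != -1:
                if j in seen:
                    ok = False
                    break
                seen.add(j)
                j = heads[j]
            if not ok:
                break
        if ok:
            trees.append(heads)
    return trees

def em_step(sentences, prob, smooth=1e-6):
    """One EM iteration for a head->dependent bigram model.
    E-step: weight each candidate tree by its posterior probability.
    M-step: renormalize the expected head->dependent counts."""
    counts = defaultdict(float)
    for words in sentences:
        trees = enumerate_trees(len(words))
        scores = []
        for heads in trees:                # P(tree) = product of arc probabilities
            s = 1.0
            for i, h in enumerate(heads):
                head = ROOT if h == -1 else words[h]
                s *= prob.get((head, words[i]), smooth)
            scores.append(s)
        z = sum(scores)
        for heads, s in zip(trees, scores):
            for i, h in enumerate(heads):  # expected counts under the posterior
                head = ROOT if h == -1 else words[h]
                counts[(head, words[i])] += s / z
    totals = defaultdict(float)
    for (h, d), c in counts.items():
        totals[h] += c
    return {(h, d): c / totals[h] for (h, d), c in counts.items()}
```

Repeated calls to `em_step` refine the arc probabilities from expected counts. A practical system would replace the brute-force enumeration with dynamic programming, since the number of single-rooted dependency trees grows as n^(n-1).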
|