Summary of Research Achievements |
The goal of this research project is to obtain the vocabulary of sign language, which requires a highly accurate sign language recognition method. In the last fiscal year, we proposed a method that achieved state-of-the-art performance. In this fiscal year, we explored an idea for improving sign language recognition performance by focusing on the temporal information in sign language videos.
Specifically, we modified the I3D network model in three design aspects. First, because the inception module of the I3D does not fully extract meaningful features from videos, we propose an improved inception module called the dilated inception module (DIM). Second, we propose an attention-based temporal attention module (TAM) that identifies the essential features of signs by focusing on the important ones. Third, we propose eliminating a loss function that degrades performance. We evaluated the proposed method on two public datasets and found that it improved top-1 accuracy by approximately 10%-15%.
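The two ideas behind the DIM and the TAM, dilation along the time axis and attention over frames, can be illustrated with a minimal sketch. This is not the implementation used in the project: the function names `dilated_temporal_conv` and `temporal_attention_pool` are hypothetical, each frame is reduced to a single scalar feature for readability, and the attention scores are assumed to be given rather than learned.

```python
import math


def dilated_temporal_conv(frames, kernel, dilation):
    """1-D convolution along time with `dilation` frames between kernel taps.

    frames: list of per-frame scalar features (an assumption for brevity).
    kernel: list of weights. No padding is applied, so only positions where
    the whole dilated kernel fits inside the sequence are computed.
    """
    span = (len(kernel) - 1) * dilation  # temporal extent covered by the kernel
    return [
        sum(w * frames[t + k * dilation] for k, w in enumerate(kernel))
        for t in range(len(frames) - span)
    ]


def temporal_attention_pool(features, scores):
    """Softmax the per-frame scores over time, then return the weighted sum
    of the features together with the attention weights."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = sum(w * f for w, f in zip(weights, features))
    return pooled, weights


if __name__ == "__main__":
    frames = [1, 2, 3, 4, 5, 6]
    # With dilation 2, a 3-tap kernel covers 5 frames instead of 3.
    print(dilated_temporal_conv(frames, [1, 0, -1], dilation=2))
    # A higher score on the last frame shifts the pooled value toward it.
    print(temporal_attention_pool([1.0, 2.0, 3.0], [0.0, 0.0, 2.0]))
```

In this sketch a larger dilation widens the temporal span seen by a kernel without adding weights, and the softmax weights let frames with higher scores dominate the pooled feature. How the actual DIM and TAM are wired into the I3D is not specified here.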
Building on the methods proposed in this research project, we plan to explore ways to obtain the vocabulary of sign language.
|