Project/Area Number | 19K12023 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Multi-year Fund |
Section | General |
Review Section | Basic Section 61010: Perceptual information processing-related |
Research Institution | Osaka Prefecture University |
Principal Investigator |
Co-Investigator (Kenkyū-buntansha) |
Iwamura Masakazu, Osaka Prefecture University, Graduate School of Engineering, Associate Professor (80361129)
Inoue Katsufumi, Osaka Prefecture University, Graduate School of Engineering, Associate Professor (50733804)
|
Project Period (FY) | 2019-04-01 – 2022-03-31 |
Project Status | Completed (Fiscal Year 2021) |
Budget Amount |
¥4,420,000 (Direct Cost: ¥3,400,000, Indirect Cost: ¥1,020,000)
Fiscal Year 2021: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2020: ¥650,000 (Direct Cost: ¥500,000, Indirect Cost: ¥150,000)
Fiscal Year 2019: ¥3,120,000 (Direct Cost: ¥2,400,000, Indirect Cost: ¥720,000)
|
Keywords | Sign Language Recognition / 3D Convolutional Neural Networks / Deep Learning / Attention Network / I3D Network / Temporal Information / Multi-stream Network / Optical Flow / Skeleton / Face / Hand / 3D Avatar Model / Machine Learning / Natural Language Processing / Synthetic Data Generation / Computer Vision / Sign Language / Data Synthesis |
Outline of Research at the Start |
Hearing-impaired people use sign language (SL) as their primary means of communication, performed through hand gestures, arm and body movements, facial expressions, and so on. To enable hearing people to understand SL, many gesture recognition approaches have been proposed. A limitation of these approaches is that they require large datasets to train the machine learning models, which in turn demand manual annotation of millions of gestures. To solve this, we propose to develop a 3D avatar model that mimics SL and to use it to generate synthetic training data, yielding a robust system for SL recognition.
|
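As a rough illustration of the data-synthesis idea above, here is a minimal NumPy sketch that perturbs a canonical joint-angle keyframe sequence to mass-produce labelled gesture clips. All names (synthesize_clip, JOINTS, the jitter parameters) and the use of raw skeleton sequences instead of rendered avatar video are illustrative assumptions, not the project's actual pipeline.

```python
# Sketch: mass-produce labelled synthetic gesture clips by randomising the
# timing and joint angles of one canonical keyframe sequence for a sign.
# Everything here is illustrative; it is not the project's code.
import numpy as np

JOINTS = 20          # joints in the hypothetical avatar skeleton
KEYFRAMES = 8        # canonical keyframes defining one sign
FRAMES = 32          # frames in one synthetic clip

def synthesize_clip(canonical, rng, angle_jitter=0.05, speed_jitter=0.2):
    """Produce one clip from a (KEYFRAMES, JOINTS, 3) keyframe array."""
    # Randomise signing speed by warping the time axis.
    speed = 1.0 + rng.uniform(-speed_jitter, speed_jitter)
    t = np.clip(np.linspace(0, KEYFRAMES - 1, FRAMES) * speed, 0, KEYFRAMES - 1)
    lo = np.floor(t).astype(int)
    hi = np.minimum(lo + 1, KEYFRAMES - 1)
    w = (t - lo)[:, None, None]
    clip = (1 - w) * canonical[lo] + w * canonical[hi]  # linear interpolation
    # Per-joint noise mimics signer-to-signer variation.
    clip += rng.normal(0.0, angle_jitter, clip.shape)
    return clip.astype(np.float32)

rng = np.random.default_rng(0)
canonical = rng.normal(size=(KEYFRAMES, JOINTS, 3))  # stand-in for a real sign
dataset = [(synthesize_clip(canonical, rng), "HELLO") for _ in range(100)]
print(dataset[0][0].shape)  # (32, 20, 3): one labelled synthetic clip
```

In a full pipeline, the sampled parameters would instead drive the 3D avatar renderer, so each synthetic sample would be a labelled video clip rather than a skeleton sequence.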
Outline of Final Research Achievements |
To improve the performance of existing word-level sign language recognition (WSLR), our first approach proposed a system with a multi-stream structure focusing on global information, local information, and skeleton information. The local information comprises handshape and facial expression, while the skeleton information captures the position of the hands relative to the body. By combining these three streams, the proposed method achieves higher recognition performance than state-of-the-art methods. In the second work, the I3D network, originally proposed for action recognition, was modified to improve WSLR performance. The modifications comprise an improved inception module, named the dilated inception module (DIM), and an attention-based temporal attention module (TAM) that identifies the features essential to a gesture. Minimal sketches of both ideas are given below.
|
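To make the multi-stream idea concrete, here is a minimal PyTorch sketch. The stand-in encoders, class count, and score-averaging fusion are assumptions; the summary only specifies that a global stream, a local (hand/face) stream, and a skeleton stream are combined.

```python
# Sketch of three-stream late fusion for WSLR; not the paper's architecture.
import torch
import torch.nn as nn

class StreamEncoder(nn.Module):
    """Stand-in for a 3D-CNN stream (e.g. an I3D backbone)."""
    def __init__(self, in_ch, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(in_ch, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),  # -> (B, 16, 1, 1, 1)
            nn.Flatten(),
            nn.Linear(16, n_classes),
        )
    def forward(self, x):             # x: (B, C, T, H, W)
        return self.net(x)

class MultiStreamWSLR(nn.Module):
    def __init__(self, n_classes=100):
        super().__init__()
        self.global_stream = StreamEncoder(3, n_classes)    # full frames
        self.local_stream = StreamEncoder(3, n_classes)     # hand/face crops
        self.skeleton_stream = StreamEncoder(3, n_classes)  # rendered keypoints
    def forward(self, frames, crops, skeleton):
        return (self.global_stream(frames)
                + self.local_stream(crops)
                + self.skeleton_stream(skeleton)) / 3.0     # late fusion

model = MultiStreamWSLR()
x = torch.randn(2, 3, 16, 64, 64)  # same dummy tensor for all three inputs
print(model(x, x, x).shape)        # torch.Size([2, 100])
```

Likewise, a hedged sketch of the two I3D modifications. The channel counts, kernel shapes, and wiring are assumptions; only the two ideas come from the summary: parallel temporal convolutions with different dilation rates (DIM) and re-weighting frames by attention (TAM).

```python
# Sketch of a dilated inception module and a temporal attention module.
import torch
import torch.nn as nn

class DilatedInceptionModule(nn.Module):
    """Parallel temporal 3D convs with growing dilation, concatenated."""
    def __init__(self, in_ch, branch_ch=8, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv3d(in_ch, branch_ch, kernel_size=(3, 1, 1),
                      dilation=(d, 1, 1), padding=(d, 0, 0))
            for d in dilations
        ])
    def forward(self, x):                  # x: (B, C, T, H, W)
        return torch.cat([b(x) for b in self.branches], dim=1)

class TemporalAttentionModule(nn.Module):
    """Scores each frame and re-weights the feature map along time."""
    def __init__(self, in_ch):
        super().__init__()
        self.score = nn.Conv3d(in_ch, 1, kernel_size=1)
    def forward(self, x):                  # x: (B, C, T, H, W)
        s = self.score(x).mean(dim=(3, 4))             # (B, 1, T)
        w = torch.softmax(s, dim=2)[..., None, None]   # (B, 1, T, 1, 1)
        return x * w

feats = torch.randn(2, 32, 16, 7, 7)  # dummy features from an I3D backbone
dim = DilatedInceptionModule(32)
tam = TemporalAttentionModule(24)     # 3 branches * 8 channels
print(tam(dim(feats)).shape)          # torch.Size([2, 24, 16, 7, 7])
```

Dilation widens the temporal receptive field without adding parameters, while the softmax over time lets the network emphasize the frames most discriminative for a sign.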
Academic Significance and Societal Importance of the Research Achievements |
Word-level sign language recognition (WSLR) systems help overcome the communication barrier between hearing-impaired people and hearing people. In our approach, we combined local information, such as handshape and facial expression, with the relative positions of body parts, and achieved higher performance than existing methods on most WSLR datasets.
|