Illustration text conversion system for visually impaired students

Research Project

Project/Area Number	23K11376
Research Category	Grant-in-Aid for Scientific Research (C)
Allocation Type	Multi-year Fund
Section	General
Review Section	Basic Section 62030: Learning support system-related
Research Institution	Kinjo University
Principal Investigator	川邊弘之 (H. Kawabe), Kinjo University, Faculty of Human and Social Sciences, Professor (60249167)
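Co-Investigator(Kenkyū-buntansha)	下村有子 (Y. Shimomura), Kanazawa University, Advanced Manufacturing Technology Institute, Research Collaborator (70171006); 瀬戸就一 (S. Seto), Kinjo University Junior College, Department of Business Practice, Professor (90196973)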
Project Period (FY)	2023-04-01 – 2026-03-31
Project Status	Granted (Fiscal Year 2023)
Budget Amount	¥4,680,000 (Direct Cost: ¥3,600,000, Indirect Cost: ¥1,080,000)
- Fiscal Year 2025: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
- Fiscal Year 2024: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
- Fiscal Year 2023: ¥1,950,000 (Direct Cost: ¥1,500,000, Indirect Cost: ¥450,000)
Keywords	Braille translation / Deep learning / Totally blind students / Student support / Text generation / Image recognition
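Outline of Research at the Start	When totally blind students enter university, they need many Braille books, including textbooks and reference books. We have built a deep-learning-based Braille translation system that student Braille-transcription volunteers can use easily, but we had not yet addressed the conversion of illustrations, particularly explanations of graphs, into text. Read-aloud functions for tables are already in practical use, but for graphs they do little more than present the title and do not explain the content shown. In this research, we introduce deep learning into the creation of explanatory text for illustrations, particularly graphs, applying it to the part that turns the content of a graph into sentences. We will then extend our Braille translation system to support graph comprehension, so that totally blind students can study on equal terms with sighted students.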
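Outline of Annual Research Achievements	We experimented with detecting and reading the text information contained in graphs, using deep learning for both detection and recognition. Using the pretrained ResNet-50 and CRNN-CTC networks available for OpenCV, we wrote a Python script for text detection and recognition and applied it to scanned figures. Some characters were not detected, and some regions were wrongly classified as text. Scale numbers on the coordinate axes adjacent to axis titles, and single-digit numbers on the scales, were not detected. False text detections occurred at tick numbers near the horizontal-axis title, at minor tick marks on the horizontal axis, and at plotted markers. The fact that tick numbers near the vertical-axis title were not detected individually and were instead interpreted as a single block of digits indicates that detection does not work well when the spacing between letters and digits is narrow. We believe these results arise because ResNet is a network model of the single-character-detection type. Switching to a network model that detects multiple characters simultaneously can be expected to improve detection accuracy. To keep plotted markers from being treated as text, it will be necessary to increase the amount of training data and to train the network with markers as negative examples. In text recognition, CRNN-CTC attempted to recognize Greek letters and unit symbols that are not in its word dictionary, but produced wrong results. When the characters making up a word were detected correctly, words were recognized correctly except for those not in the word dictionary. In that sense CRNN-CTC worked well. Extending the word dictionary it refers to can be expected to give more accurate word recognition.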
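The report does not describe how the recognition network's output is turned into text. As a minimal sketch, assuming standard greedy CTC decoding (collapse runs of repeated labels, then drop blanks), that step can be illustrated in plain Python. The character set, the choice of index 0 as the blank label, and the per-frame label sequence are illustrative assumptions, not values taken from the project or from OpenCV.

```python
from itertools import groupby

BLANK = 0  # assumed: index 0 is the CTC blank label

# Assumed character set; a symbol absent from it (e.g. a Greek letter)
# can never appear in the decoded output.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_labels(text):
    """Map characters to 1-based label indices (0 is reserved for blank)."""
    return [ALPHABET.index(ch) + 1 for ch in text]


def ctc_greedy_decode(frame_labels):
    """Collapse runs of identical labels, then remove blanks."""
    collapsed = [label for label, _ in groupby(frame_labels)]
    return "".join(ALPHABET[label - 1] for label in collapsed if label != BLANK)


# Hypothetical per-frame argmax labels for the word "time".
t, i, m, e = to_labels("time")
frames = [BLANK, t, t, BLANK, i, BLANK, m, m, e, BLANK]
print(ctc_greedy_decode(frames))
```

Because a decoder of this kind can only emit characters from its character set, a symbol outside that set is necessarily mapped to some known character. This is one possible reading of the misrecognized Greek letters and unit symbols, offered here as an assumption rather than a finding of the report.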
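Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned.
Reason	In the first year, we originally planned to collect graphs from various textbooks and websites, digitize them, and annotate them to create training data. The annotations were to cover text information such as captions, coordinate axes, axis values, and legends, as well as information on overall shape such as trends in the values and the positions of peaks and valleys. However, because pretrained ResNet-50 and CRNN-CTC networks for text detection and recognition were publicly available for OpenCV, we brought forward part of the research planned for the second year and used these networks to attempt text detection and recognition on scanned figures. We found that expanding the word dictionary can be expected to give more accurate word recognition.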
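The annotation originally planned for the first year can be pictured as one record per graph. The following is only a sketch of what such a record might look like; the class names, field names, and file name are hypothetical and do not come from the project.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TextRegion:
    role: str   # e.g. "caption", "x_axis_title", "tick_label", "legend"
    text: str
    box: tuple  # (x, y, width, height) in pixels


@dataclass
class GraphAnnotation:
    image_file: str
    regions: list = field(default_factory=list)  # TextRegion items
    trend: str = ""                              # overall tendency of the values
    peaks: list = field(default_factory=list)    # x positions of peaks
    valleys: list = field(default_factory=list)  # x positions of valleys


example = GraphAnnotation(
    image_file="graph_0001.png",  # hypothetical file name
    regions=[TextRegion("x_axis_title", "time (s)", (120, 380, 90, 18))],
    trend="increasing then decreasing",
    peaks=[2.5],
)

# Serialize to JSON so the record can be stored alongside the image.
record = json.dumps(asdict(example), ensure_ascii=False)
print(record)
```

Keeping text regions and shape information in the same record would let a single data set serve both the text-recognition experiments and the later shape-description model, though the project may organize its data differently.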
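Strategy for Future Research Activity	Based on the graph training data, we will build a network model that identifies the overall shape of a graph and gives it an appropriate textual expression. The input is the graph image, and the output is a string such as "monotonically increasing," "has a peak," or "has a long tail." We will achieve this by fine-tuning an existing image2text network model on our training data.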
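The planned model maps a graph image to a short shape description. To make the intended output concrete, here is a rule-based stand-in that works on a numeric series rather than an image. It is not the fine-tuned image2text network the project proposes; the function name and the rules are assumptions for illustration, and the long-tail case is deliberately left out.

```python
def describe_shape(ys):
    """Return coarse shape labels for a sequence of y values."""
    if len(ys) < 2:
        return []
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    if all(d > 0 for d in diffs):
        return ["monotonically increasing"]
    if all(d < 0 for d in diffs):
        return ["monotonically decreasing"]
    labels = []
    # An interior point strictly higher than both neighbours counts as a peak.
    if any(ys[k - 1] < ys[k] > ys[k + 1] for k in range(1, len(ys) - 1)):
        labels.append("has a peak")
    return labels


print(describe_shape([1, 2, 3, 5]))
print(describe_shape([1, 4, 2, 1]))
```

Rules like these apply only when the underlying data values are known. A learned model is needed when, as in this project, only the scanned figure is available.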

Report

(1 results)

2023 Research-status Report

Research Products
(3 results)

All Journal Article (1 results) (of which Peer Reviewed: 1 results) Presentation (1 results) (of which Int'l Joint Research: 1 results) Book (1 results)

[Journal Article] Recognizing graph elements using deep learning for visually impaired students (2023)
- Author(s)
  H. Kawabe, Y. Shimomura and S. Seto
- Journal Title
  Proceedings of the 23rd Asia Pacific Industrial Engineering & Management Systems Conference
  Volume: -, Pages: 302-303
- Related Report
  2023 Research-status Report
- Peer Reviewed
[Presentation] Recognizing graph elements using deep learning for visually impaired students (2023)
- Author(s)
  H. Kawabe, Y. Shimomura and S. Seto
- Organizer
  The 23rd Asia Pacific Industrial Engineering & Management Systems Conference
- Related Report
  2023 Research-status Report
- Int'l Joint Research
[Book] Proceeding 23rd Asia Pacific Industrial Engineering & Management System (2023)
- Author(s)
  Mohd Helmi Ali, Asma Qamaliah Abdul Hamid and Mazzlida Mat Deli
- Total Pages
  349
- Publisher
  UKM-Graduate School of Business Universiti Kebangsaan Malaysia
- ISBN
  9789671785614
- Related Report
  2023 Research-status Report
