2017 Fiscal Year Annual Research Report

Creation and Applied Development of Character Science in the Machine-Readable Era

Project/Area Number	17H06100
Research Institution	Kyushu University
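Principal Investigator	Seiichi Uchida, Kyushu University, Faculty of Information Science and Electrical Engineering, Professor (70315125)
Co-Investigator (Kenkyū-buntansha)	Keiji Yanai, The University of Electro-Communications, Graduate School of Informatics and Engineering, Professor (20301179); Yoshitaka Ushiku, The University of Tokyo, Graduate School of Information Science and Technology, Lecturer (10784142)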
Project Period (FY)	2017-05-31 – 2022-03-31
Keywords	Characters / Machine learning / Scene text / Fonts / Deep learning / Character science / Image captioning
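Outline of Annual Research Achievements	Characters are among the most important media supporting our cultural activities and communication. Focusing on the dual nature of characters as both language and image, this project advances "character science," a new field that comprehensively analyzes the essence of the diverse functions of characters. In particular, for four functions of characters that have received little attention so far (clarifying the surroundings, conveying knowledge and meaning, conveying atmosphere, and maintaining legibility), we are carrying out a broad and challenging set of basic and applied studies that has no parallel elsewhere in the world. In FY2017 (H29) as well, research on these four functions proceeded in parallel, as summarized below.
Clarifying the surroundings: We conducted a group of studies to clarify the complementary relationship between image information and text information. Specifically, these include image caption generation that reflects text appearing in the environment, interactive image understanding that generates questions about missing information during caption generation, and image generation from non-linguistic information contained in text (such as shape, in the case of character images).
Conveying knowledge and meaning: We realized methods for extracting and recognizing multiple pieces of scene text information using deep learning (CNN/LSTM). We also realized an image transformation that deliberately erases only the text regions and, by letting people experience "a situation without text," qualitatively verified the necessity of text information in the environment.
Conveying atmosphere: Using cover images of about 200,000 books, we analyzed the correlation between book genre and the type and color of the fonts used in titles, and quantified, for the first time in the world, relationships such as particular fonts and colors appearing frequently in particular genres. We also developed various automatic font synthesis methods using GANs (Generative Adversarial Networks) and neural style transfer.
Maintaining legibility: Using GANs, we attempted to automatically generate legible symbols (different from existing alphabets). We also analyzed how diverse characters are recognized inside deep neural networks.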
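Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned.
Reason
We gave the above rating because sufficient progress has been made in elucidating all four functions of characters (clarifying the surroundings, conveying knowledge and meaning, conveying atmosphere, and maintaining legibility).
For clarifying the surroundings, we developed a set of new techniques related to image captioning that address the interaction between image information and text information.
For conveying knowledge and meaning, we not only developed scene text detection and recognition methods but also combined them with state-of-the-art methods (EAST and CRNN), achieving automatic collection of text information with far higher accuracy than before.
For conveying atmosphere, we succeeded, for the first time in the world, in quantifying the correlation between genre and font/color on the basis of large-scale data. We have also verified that the automatically generated fonts are diverse and highly legible.
For maintaining legibility, we examined several options for the constraints and evaluations that should be imposed on the automatic generation of legible symbols. We also examined from multiple angles how diverse characters are recognized inside deep neural networks, including pooling tendencies, pattern distributions at each layer, and tendencies of the learned filters (weights).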
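Strategy for Future Research Activity	In FY2018 (H30), we will further advance the investigations carried out in FY2017 (H29) and conduct basic research toward elucidating the principles of character functions, for all four functions in parallel.
Clarifying the surroundings: Based on the FY2017 results, we will evaluate the effectiveness of text information in object recognition.
Conveying knowledge and meaning: We will extend to video the text detection and recognition experiments that were conducted in FY2017 at the level of single still images. In addition, using the image captioning technology developed in FY2017, we will analyze the semantics of (non-text) image information.
Conveying atmosphere: We will further advance the FY2017 analysis of the correlation between the font types used in book titles and book genres. In addition, for the style-controllable (GAN-based) font synthesis method realized in FY2017, we will analyze which parts of a font the learned control variables (latent variables) correspond to.
Maintaining legibility: We will further investigate the behavior of CNN-based character recognition. In particular, we will observe from multiple angles how CNNs capture the structure of characters, for example by investigating the class dependence of pooling direction. To elucidate the process of generating alphabets that combine legibility with robustness to deformation, we will investigate what kind of energy should be designed so that the generated patterns are distinguishable from existing alphabets yet still character-like.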

Research Products
(34 results)

Int'l Joint Research (4 results) / Journal Article (11 results) (of which Int'l Joint Research: 10 results, Peer Reviewed: 11 results) / Presentation (18 results) (of which Int'l Joint Research: 12 results, Invited: 5 results) / Remarks (1 result)

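[Int'l Joint Research] German Research Center for Artificial Intelligence (DFKI) / Technical University of Kaiserslautern (Germany)
- Country Name
  Germany
- Counterpart Institution
  German Research Center for Artificial Intelligence (DFKI) / Technical University of Kaiserslautern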
[Int'l Joint Research] KTH Royal Institute of Technology(Sweden)
- Country Name
  Sweden
- Counterpart Institution
  KTH Royal Institute of Technology
[Int'l Joint Research] Wuhan University of Technology(China)
- Country Name
  China
- Counterpart Institution
  Wuhan University of Technology
[Int'l Joint Research] Fujitsu R&D Center, Beijing (China)
- Country Name
  China
- Counterpart Institution
  Fujitsu R&D Center, Beijing (富士通北京研究所)
[Journal Article] How do Convolutional Neural Networks Learn Design? (2018)
- Author(s)
  Shailza Jolly, Brian Kenji Iwana, Ryohei Kuroki, Seiichi Uchida
- Journal Title
  Proceedings of the 24th International Conference on Pattern Recognition
  Volume: - Pages: in press
- Peer Reviewed / Int'l Joint Research
[Journal Article] CNN Training with Graph-Based Sample Preselection: Application to Handwritten Character Recognition (2018)
- Author(s)
  Frederic Rayar, Masanori Goto and Seiichi Uchida
- Journal Title
  Proceedings of The 13th IAPR International Workshop on Document Analysis Systems
  Volume: - Pages: in press
- Peer Reviewed / Int'l Joint Research
[Journal Article] Text Line Extraction based on Integrated K-shortest Paths Optimization (2018)
- Author(s)
  Liuan Wang, Jun Sun and Seiichi Uchida
- Journal Title
  Proceedings of The 13th IAPR International Workshop on Document Analysis Systems
  Volume: - Pages: in press
- Peer Reviewed / Int'l Joint Research
[Journal Article] Contained Neural Style Transfer for Decorated Logo Generation (2018)
- Author(s)
  Gantugs Atarsaikhan, Brian Kenji Iwana and Seiichi Uchida
- Journal Title
  Proceedings of The 13th IAPR International Workshop on Document Analysis Systems
  Volume: - Pages: in press
- Peer Reviewed / Int'l Joint Research
[Journal Article] Font Creation Using Generative Adversarial Networks with Class Discrimination (2017)
- Author(s)
  Kotaro Abe, Brian Kenji Iwana, Viktor Gosta Holmer and Seiichi Uchida
- Journal Title
  Proceedings of Asian Conference on Pattern Recognition
  Volume: - Pages: -
- Peer Reviewed / Int'l Joint Research
[Journal Article] Neural Font Style Transfer (2017)
- Author(s)
  Gantugs Atarsaikhan, Brian Kenji Iwana, Atsushi Narusawa, Keiji Yanai and Seiichi Uchida
- Journal Title
  Proceedings of ICDAR Workshop on Machine Learning
  Volume: - Pages: -
- DOI
  10.1109/ICDAR.2017.328
- Peer Reviewed / Int'l Joint Research
[Journal Article] How Does a CNN Manage Different Printing Types? (2017)
- Author(s)
  Shota Ide and Seiichi Uchida
- Journal Title
  Proceedings of The 14th International Conference on Document Analysis and Recognition
  Volume: - Pages: 1004-1009
- DOI
  10.1109/ICDAR.2017.167
- Peer Reviewed
[Journal Article] Scene Text Eraser (2017)
- Author(s)
  Toshiki Nakamura, Anna Zhu, Keiji Yanai and Seiichi Uchida
- Journal Title
  Proceedings of The 14th International Conference on Document Analysis and Recognition
  Volume: - Pages: 832-837
- DOI
  10.1109/ICDAR.2017.141
- Peer Reviewed / Int'l Joint Research
[Journal Article] Scene Text Relocation with Guidance (2017)
- Author(s)
  Anna Zhu and Seiichi Uchida
- Journal Title
  Proceedings of The 14th International Conference on Document Analysis and Recognition
  Volume: - Pages: 1289-1294
- DOI
  10.1109/ICDAR.2017.212
- Peer Reviewed / Int'l Joint Research
[Journal Article] Component Awareness in Convolutional Neural Networks (2017)
- Author(s)
  Brian Kenji Iwana, Letao Zhou, Kumiko Tanaka-Ishii and Seiichi Uchida
- Journal Title
  Proceedings of The 14th International Conference on Document Analysis and Recognition
  Volume: - Pages: 394-399
- DOI
  10.1109/ICDAR.2017.72
- Peer Reviewed / Int'l Joint Research
[Journal Article] Partial style transfer using weakly supervised semantic segmentation (2017)
- Author(s)
  Shin Matsuo, Wataru Shimoda and Keiji Yanai
- Journal Title
  Proc. of ICME Workshop on Multimedia Artworks Analysis
  Volume: - Pages: -
- DOI
  10.1109/ICMEW.2017.8026228
- Peer Reviewed / Int'l Joint Research
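[Presentation] Image Informatics and Data Science: Technical Trends and Application Examples (2018)
- Author(s)
  Seiichi Uchida
- Organizer
  2018 (Heisei 30) Annual Meeting of the Institute of Electrical Engineers of Japan (IEEJ)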
- Invited
[Presentation] How do Convolutional Neural Networks Learn Design? (2018)
- Author(s)
  Shailza Jolly, Brian Kenji Iwana, Ryohei Kuroki, Seiichi Uchida
- Organizer
  24th International Conference on Pattern Recognition
- Int'l Joint Research
[Presentation] CNN Training with Graph-Based Sample Preselection: Application to Handwritten Character Recognition (2018)
- Author(s)
  Frederic Rayar, Masanori Goto and Seiichi Uchida
- Organizer
  The 13th IAPR International Workshop on Document Analysis Systems
- Int'l Joint Research
[Presentation] Text Line Extraction based on Integrated K-shortest Paths Optimization (2018)
- Author(s)
  Liuan Wang, Jun Sun and Seiichi Uchida
- Organizer
  The 13th IAPR International Workshop on Document Analysis Systems
- Int'l Joint Research
[Presentation] Contained Neural Style Transfer for Decorated Logo Generation (2018)
- Author(s)
  Gantugs Atarsaikhan, Brian Kenji Iwana and Seiichi Uchida
- Organizer
  The 13th IAPR International Workshop on Document Analysis Systems
- Int'l Joint Research
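[Presentation] Image Caption Generation Considering Scene Text Information (2018)
- Author(s)
  川口維文, Yoshitaka Ushiku, Seiichi Uchida
- Organizer
  IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)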
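[Presentation] Style-Controllable Font Generation Using Wasserstein GAN (2018)
- Author(s)
  Kotaro Abe, Hideaki Hayashi, Seiichi Uchida
- Organizer
  IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)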
[Presentation] Creation and Applied Development of Character Science in the Machine-Readable Era (2017)
- Author(s)
  Seiichi Uchida
- Organizer
  情報系 WINTER FESTA Episode3
- Invited
[Presentation] Beyond 100% (2017)
- Author(s)
  Seiichi Uchida
- Organizer
  ICDAR Special Workshop on the Future of Document Analysis and Recognition
- Int'l Joint Research / Invited
[Presentation] ML for DAR, DAR for ML --- How machine learning and document analysis and recognition benefit each other (2017)
- Author(s)
  Seiichi Uchida
- Organizer
  ICDAR Workshop on Machine Learning
- Int'l Joint Research / Invited
[Presentation] From Character Engineering to Character Science (2017)
- Author(s)
  Seiichi Uchida
- Organizer
  The 3rd International Conference on Pre-modern Japanese Texts
- Invited
[Presentation] Font Creation Using Generative Adversarial Networks with Class Discrimination (2017)
- Author(s)
  Kotaro Abe, Brian Kenji Iwana, Viktor Gosta Holmer and Seiichi Uchida
- Organizer
  Asian Conference on Pattern Recognition
- Int'l Joint Research
[Presentation] Neural Font Style Transfer (2017)
- Author(s)
  Gantugs Atarsaikhan, Brian Kenji Iwana, Atsushi Narusawa, Keiji Yanai and Seiichi Uchida
- Organizer
  ICDAR Workshop on Machine Learning
- Int'l Joint Research
[Presentation] How Does a CNN Manage Different Printing Types? (2017)
- Author(s)
  Shota Ide and Seiichi Uchida
- Organizer
  The 14th International Conference on Document Analysis and Recognition
- Int'l Joint Research
[Presentation] Scene Text Eraser (2017)
- Author(s)
  Toshiki Nakamura, Anna Zhu, Keiji Yanai and Seiichi Uchida
- Organizer
  The 14th International Conference on Document Analysis and Recognition
- Int'l Joint Research
[Presentation] Scene Text Relocation with Guidance (2017)
- Author(s)
  Anna Zhu and Seiichi Uchida
- Organizer
  The 14th International Conference on Document Analysis and Recognition
- Int'l Joint Research
[Presentation] Component Awareness in Convolutional Neural Networks (2017)
- Author(s)
  Brian Kenji Iwana, Letao Zhou, Kumiko Tanaka-Ishii and Seiichi Uchida
- Organizer
  The 14th International Conference on Document Analysis and Recognition
- Int'l Joint Research
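[Presentation] Correlation Analysis Between Font Shape and Book Genre in Book Cover Images (2017)
- Author(s)
  Yuto Shinahara, Seiichi Uchida
- Organizer
  IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)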
[Remarks] Human Interface Laboratory
- URL
  human.ait.kyushu-u.ac.jp
