2023 Fiscal Year Final Research Report
Task Fusion Learning in Deep Learning
Project/Area Number | 22K19808 |
Research Category | Grant-in-Aid for Challenging Research (Exploratory) |
Allocation Type | Multi-year Fund |
Review Section | Medium-sized Section 61: Human informatics and related fields |
Research Institution | The University of Electro-Communications |
Principal Investigator | Yanai Keiji, The University of Electro-Communications, Graduate School of Informatics and Engineering, Professor (20301179) |
Project Period (FY) | 2022-06-30 – 2024-03-31 |
Keywords | Deep Learning / Continual Learning / Large-scale Models / Vision-Language Models |
Outline of Final Research Achievements | In this study, we began research to demonstrate that neural networks have general-purpose capabilities closer to those of the human brain: a single network learns functions for multiple tasks simultaneously, and the independently learned functions are then combined and superimposed to realize new functions that differ from the individual functions learned beforehand. The following three specific research topics were studied: (1) superimposition of image transformation tasks using conditional signals; (2) continual learning of Vision Transformers (ViT); (3) training-free region segmentation for arbitrary words with Stable Diffusion, in which regions corresponding to given words are extracted using a large-scale pre-trained image generation model without any additional training (a code sketch follows this record). |
Free Research Field | Media Informatics |
Academic Significance and Societal Importance of the Research Achievements | This study showed that neural networks are capable of handling the simultaneous learning of different functions more flexibly than expected. It also showed that text-to-image generation models trained on billion-scale image-text pair data can associate text with visual concepts at the pixel level, which greatly raises the prospect of exploiting such models for diverse tasks without additional training. Going forward, demonstrating this training-free capability on diverse tasks, and pursuing the training-free realization of composite processing that combines them, will greatly broaden the applicability of large-scale vision-language models. |
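
As an illustration of topic (3) in the outline above, the sketch below shows one way regions corresponding to a word could be extracted from a pre-trained text-to-image diffusion model such as Stable Diffusion without any additional training: the cross-attention weights between spatial locations and prompt tokens are aggregated across heads and layers into a heatmap, which is then thresholded into a mask. This is a minimal, hypothetical sketch rather than the report's actual implementation; the function `word_region_mask`, the tensor shapes, and the random inputs are assumptions, and in practice the attention maps would be collected from the model's cross-attention layers during denoising (e.g., via forward hooks).

```python
# Hypothetical sketch: training-free word-region extraction from the
# cross-attention maps of a pre-trained text-to-image diffusion model.
# The inputs here are synthetic stand-ins; in practice the maps would be
# collected from the model's cross-attention layers during denoising.

import torch
import torch.nn.functional as F

def word_region_mask(attn_maps, token_index, out_size=512, threshold=0.5):
    """Aggregate cross-attention maps into a binary mask for one token.

    attn_maps: list of tensors of shape (heads, h*w, num_tokens), one per
        cross-attention layer/timestep; h*w is that layer's latent size.
    token_index: position of the target word's token in the prompt.
    """
    heatmaps = []
    for attn in attn_maps:
        heads, hw, _ = attn.shape
        side = int(round(hw ** 0.5))  # latent feature maps are square
        # Attention from every spatial location to the target token,
        # averaged over heads and reshaped into a 2-D map.
        m = attn[:, :, token_index].mean(dim=0).reshape(1, 1, side, side)
        # Upsample each layer's map to a common resolution before averaging.
        heatmaps.append(F.interpolate(m, size=(out_size, out_size),
                                      mode="bilinear", align_corners=False))
    heat = torch.cat(heatmaps, dim=0).mean(dim=0).squeeze(0)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)  # to [0, 1]
    return heat > threshold  # binary region mask for the word

# Toy usage with random maps at two latent resolutions (77 = CLIP tokens).
maps = [torch.rand(8, 64 * 64, 77), torch.rand(8, 32 * 32, 77)]
mask = word_region_mask(maps, token_index=5)
print(mask.shape, mask.float().mean().item())
```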