
Multi-modal Deep Learning Model by Disentangling Shape and Style for Analysis of Deep 'SHITSUKAN' Analysis and Synthesis

Publicly Offered Research

Project Area: Analysis and synthesis of deep SHITSUKAN information in the real world
Project/Area Number 21H05812
Research Category

Grant-in-Aid for Transformative Research Areas (A)

Allocation Type: Single-year Grants
Review Section Transformative Research Areas, Section (IV)
Research Institution: The University of Electro-Communications

Principal Investigator

Keiji Yanai (柳井 啓司), The University of Electro-Communications, Graduate School of Informatics and Engineering, Professor (20301179)

Project Period (FY) 2021-09-10 – 2023-03-31
Project Status Completed (Fiscal Year 2022)
Budget Amount
¥7,800,000 (Direct Cost: ¥6,000,000, Indirect Cost: ¥1,800,000)
Fiscal Year 2022: ¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
Fiscal Year 2021: ¥3,900,000 (Direct Cost: ¥3,000,000, Indirect Cost: ¥900,000)
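Keywords: Deep learning / Image generation models / Foundation models / Vision-language models / SHITSUKAN (material appearance) / Feature disentanglement / Image generation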
Outline of Research at the Start

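This research (1) automatically learns, from a large amount of paired image-text data, the correspondence between the SHITSUKAN (material appearance) regions of images and the SHITSUKAN expressions in language, constructs a common SHITSUKAN embedding space shared by image SHITSUKAN features and language SHITSUKAN features, and realizes bidirectional retrieval (recognition) between images and language. (2) Furthermore, by fusing SHITSUKAN embedding vectors with image shape features, it realizes the generation of images with new SHITSUKAN. The goal of this research is to propose a deep learning model that achieves both in a unified manner. The proposed model makes it possible to (A) perform "deep" SHITSUKAN analysis of images and linguistic expressions using large-scale data, and (B) manipulate image SHITSUKAN in subtle ways through language.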
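As a rough illustration of bidirectional retrieval in a shared embedding space, the following minimal sketch ranks items of one modality by cosine similarity to a query from the other. It is not the project's model: the three-dimensional vectors, the item names, and the helpers `l2_normalize`, `cosine_similarity`, and `retrieve` are all hypothetical, and a real system would obtain the embeddings from trained image and text encoders.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def retrieve(query, gallery, top_k=1):
    """Return the top_k gallery keys most similar to the query vector."""
    ranked = sorted(
        gallery,
        key=lambda name: cosine_similarity(query, gallery[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy embeddings (assumed values) living in one shared space.
image_embeddings = {
    "img_glossy_apple": [0.9, 0.1, 0.0],
    "img_rough_stone": [0.1, 0.9, 0.1],
    "img_fluffy_towel": [0.0, 0.2, 0.9],
}
text_embeddings = {
    "glossy": [1.0, 0.0, 0.0],
    "rough": [0.0, 1.0, 0.0],
    "fluffy": [0.0, 0.0, 1.0],
}

# Text -> image and image -> text use the same function,
# because both modalities are embedded in the same space.
text_to_image = retrieve(text_embeddings["rough"], image_embeddings)
image_to_text = retrieve(image_embeddings["img_fluffy_towel"], text_embeddings)
```

Because both modalities share one space, a single similarity function serves both retrieval directions; only the roles of query and gallery are swapped.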

Outline of Annual Research Achievements

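The original aim of this research was to propose a deep learning model that achieves the following in a unified manner: (1) automatically learning, from a large amount of paired image-text data, the correspondence between the SHITSUKAN regions of images and the SHITSUKAN expressions in language, constructing a common SHITSUKAN embedding space shared by image SHITSUKAN features and language SHITSUKAN features, and realizing bidirectional retrieval (recognition) between images and language; and (2) fusing SHITSUKAN embedding vectors with image shape features to generate images with new SHITSUKAN.
Toward this aim, the two-year research period produced the following three results. (1) Using a cross-modal recipe dataset, we fused recipe vectors in a recipe information space, into which both language and images can be embedded, with food shape features, and thereby realized food image generation of arbitrary shape based on recipe information. (2) Using CLIP, a large pretrained cross-modal image-language model, we realized SHITSUKAN manipulation of images and proposed a method for freely controlling the degree of that manipulation. (3) We also applied CLIP to font generation with a differentiable renderer and proposed a method for generating font images whose style corresponds to arbitrary words.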
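The record does not say how the degree of manipulation is controlled. One generic way to make the strength of a text-guided edit adjustable, shown here purely as an assumption-laden sketch and not as the authors' method, is to shift a latent vector along the direction from a source text embedding to a target text embedding and scale that shift by a scalar. The vectors and the helpers `edit_direction` and `apply_edit` are hypothetical.

```python
def edit_direction(source_text_vec, target_text_vec):
    """Direction in embedding space pointing from source text to target text."""
    return [t - s for s, t in zip(source_text_vec, target_text_vec)]


def apply_edit(latent, direction, strength):
    """Shift a latent vector along the direction; strength sets the edit degree."""
    return [z + strength * d for z, d in zip(latent, direction)]


# Toy values (assumed): a latent code and two text embeddings.
latent = [0.5, 0.5, 0.0]
direction = edit_direction([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])

# strength = 0.0 leaves the latent unchanged; larger values edit more strongly.
edits = {s: apply_edit(latent, direction, s) for s in (0.0, 0.5, 1.0)}
```

In a real pipeline the edited latent would be passed to an image generator, and the strength would be chosen by the user or swept to produce a sequence of progressively stronger edits.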

Research Progress Status

Not entered because FY2022 (Reiwa 4) is the final fiscal year.

Strategy for Future Research Activity

Not entered because FY2022 (Reiwa 4) is the final fiscal year.

Report

(2 results)
  • 2022 Annual Research Report
  • 2021 Annual Research Report
  • Research Products

    (17 results)


Journal Article (2 results; of which Int'l Joint Research: 1, Peer Reviewed: 2, Open Access: 2) / Presentation (15 results; of which Int'l Joint Research: 12)

  • [Journal Article] Material Translation Based on Neural Style Transfer with Ideal Style Image Retrieval (2022)

    • Author(s)
      Benitez-Garcia Gibran, Takahashi Hiroki, Yanai Keiji
    • Journal Title

      Sensors

      Volume: 22 Issue: 19 Pages: 7317-7317

    • DOI

      10.3390/s22197317

    • Related Report
      2022 Annual Research Report
    • Peer Reviewed / Open Access
  • [Journal Article] FASSD-Net: Fast and Accurate Real-Time Semantic Segmentation for Embedded Systems (2021)

    • Author(s)
      Rosas-Arias Leonel, Benitez-Garcia Gibran, Portillo-Portillo Jose, Olivares-Mercado Jesus, Sanchez-Perez Gabriel, Yanai Keiji
    • Journal Title

      IEEE Transactions on Intelligent Transportation Systems

      Volume: - Issue: 9 Pages: 1-12

    • DOI

      10.1109/tits.2021.3127553

    • Related Report
      2022 Annual Research Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Presentation] Patent Image Retrieval Using Cross-entropy-based Metric Learning (2023)

    • Author(s)
      Kotaro Higuchi, Yuma Honbu, Keiji Yanai
    • Organizer
      Proc. of International Workshop on Frontiers of Computer Vision (IW-FCV)
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Virtual Try-On Considering Temporal Consistency for Videoconferencing (2023)

    • Author(s)
      Daiki Shimizu, Keiji Yanai
    • Organizer
      Proc. of the International Multimedia Modeling Conference (MMM)
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Transformer-Based Cross-Modal Recipe Embeddings with Large Batch Training (2023)

    • Author(s)
      Jing Yang, Junwen Chen, Keiji Yanai
    • Organizer
      Proc. of the International Multimedia Modeling Conference (MMM)
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Zero-shot Font Style Transfer with a Differentiable Renderer (2022)

    • Author(s)
      Kota Izumi, Keiji Yanai
    • Organizer
      Proc. of ACM Multimedia Asia
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Parallel Queries for Human-Object Interaction Detection (2022)

    • Author(s)
      Junwen Chen, Keiji Yanai
    • Organizer
      Proc. of ACM Multimedia Asia
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] SetMealAsYouLike: Sketch-based Set Meal Image Synthesis with Plate Annotations (2022)

    • Author(s)
      Yuma Honbu, Keiji Yanai
    • Organizer
      Proc. of ACMMM Workshop on Multimedia Assisted Dietary Management (MADIMA)
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] DepthGrillCam: A Mobile Application for Real-time Eating Action Recording Using RGB-D Images (2022)

    • Author(s)
      Kento Adachi, Keiji Yanai
    • Organizer
      Proc. of ACMMM Workshop on Multimedia Assisted Dietary Management (MADIMA)
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Text-based Image Editing for Food Images with CLIP (2022)

    • Author(s)
      Kohei Yamamoto, Keiji Yanai
    • Organizer
      Proc. of ACMMM Workshop on Multimedia Assisted Dietary Management (MADIMA)
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Real Scale 3D Reconstruction of a Dish and a Plate using Implicit Function and a Single RGB-D Image (2022)

    • Author(s)
      Shu Naritomi, Keiji Yanai
    • Organizer
      Proc. of ACMMM Workshop on Multimedia Assisted Dietary Management (MADIMA)
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Continual Learning in Vision Transformer (2022)

    • Author(s)
      Mana Takeda, Keiji Yanai
    • Organizer
      Proc. of IEEE International Conference on Image Processing (ICIP)
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] StyleGAN-based CLIP-guided Image Shape Manipulation (2022)

    • Author(s)
      Yuchen Qian, Kohei Yamamoto, Keiji Yanai
    • Organizer
      Proc. of International Conference on Content-based Multimedia Indexing (CBMI)
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Unseen Food Segmentation (2022)

    • Author(s)
      Yuma Honbu, Keiji Yanai
    • Organizer
      Proc. of ACM International Conference on Multimedia Retrieval (ICMR)
    • Related Report
      2022 Annual Research Report
    • Int'l Joint Research
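  • [Presentation] Mask-Based Food Image Generation Using Cross-Modal Recipe Embeddings (2022)

    • Author(s)
      陳 仲涛, Yuma Honbu, Keiji Yanai
    • Organizer
      IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)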
    • Related Report
      2021 Annual Research Report
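  • [Presentation] Cross-Modal Recipe Retrieval and Image Generation Using Transformers (2022)

    • Author(s)
      Jing Yang, Keiji Yanai
    • Organizer
      IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)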
    • Related Report
      2021 Annual Research Report
  • [Presentation] CLIP-Guided Image Shape Feature Editing with StyleGAN (2022)

    • Author(s)
      Yuchen Qian, Keiji Yanai
    • Organizer
      IEICE Technical Committee on Pattern Recognition and Media Understanding (PRMU)
    • Related Report
      2021 Annual Research Report


Published: 2021-10-22   Modified: 2023-12-25  
