
A study of server management technology for sustaining a large scale distributed neural network

Research Project

Project/Area Number 20K19791
Research Category

Grant-in-Aid for Early-Career Scientists

Allocation Type Multi-year Fund
Review Section Basic Section 60060:Information network-related
Research Institution Kindai University

Principal Investigator

Mizutani Kimihiro  Kindai University, Faculty of Informatics, Associate Professor (40845939)

Project Period (FY) 2020-04-01 – 2024-03-31
Project Status Completed (Fiscal Year 2023)
Budget Amount
¥3,120,000 (Direct Cost: ¥2,400,000, Indirect Cost: ¥720,000)
Fiscal Year 2022: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2021: ¥1,430,000 (Direct Cost: ¥1,100,000, Indirect Cost: ¥330,000)
Fiscal Year 2020: ¥910,000 (Direct Cost: ¥700,000, Indirect Cost: ¥210,000)
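Keywords Wide-area distributed computing / Distributed learning / Distributed neural network / Network management / Information network / Overlay network / Structured overlay network / P2P / Server cooperation / Deep learning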
Outline of Research at the Start

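This research aims to create a distributed server cooperation technology that improves the scalability of training while managing a large-scale neural network autonomously and persistently across a very large number of servers. Specifically, it aims to establish (1) a method that autonomously decides, according to the structure of the neural network, which server each of its computation tasks is assigned to, and (2) a method for delegating and restoring computation results among servers when servers are added or fail.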

Outline of Final Research Achievements

In this study, we aimed to construct a distributed neural network execution platform by developing its core technologies. First, we used structured overlay network technology to restore the distributed platform quickly after failures. This method's strength lies in estimating the union of failure nodes and quickly propagating failure information to them. This approach reduces unnecessary propagation of failure information and shortens the platform's Mean Time to Repair (MTTR).
Second, we integrated distributed federated learning techniques into the platform to manage learning nodes at scale. We proposed an efficient, scalable tree architecture for node management that balances learning efficiency with high fault tolerance.
Finally, we developed several schemes for estimating and controlling traffic data within the platform. By combining these technologies, we expect to sustain a robust, fault-tolerant platform for managing distributed neural networks in the future.

Academic Significance and Societal Importance of the Research Achievements

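In building an autonomous distributed execution platform for neural networks, this research proposed a server cooperation technology that supports the persistent execution of training and inference, together with a method for managing learning status. For server cooperation, we used structured overlay technology to create a method that speeds up the handling of server failures within the platform. For learning status management, we developed a technology that enables training and inference to run smoothly at the same time on a federated learning framework. We also created technologies for controlling and analyzing the data generated within the distributed execution platform. These technologies make an important contribution to the research field and are expected to serve as a foundation for further research and practical application.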

Report

(5 results)
  • 2023 Annual Research Report / Final Research Report (PDF)
  • 2022 Research-status Report
  • 2021 Research-status Report
  • 2020 Research-status Report

Research Products

(10 results)


Journal Article (5 results) (of which Peer Reviewed: 5 results, Open Access: 5 results) / Presentation (5 results) (of which Int'l Joint Research: 3 results)

  • [Journal Article] A Comprehensive Evaluation of Generating a Mobile Traffic Data Scheme without a Coarse-Grained Process Using CSR-GAN (2022)

    • Author(s)
      Tokunaga Tomoki, Mizutani Kimihiro
    • Journal Title

      Sensors

      Volume: 22 Issue: 5 Pages: 1930-1930

    • DOI

      10.3390/s22051930

    • Related Report
      2021 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Effective TCP Flow Management Based on Hierarchical Feedback Learning in Complex Data Center Network (2022)

    • Author(s)
      Mizutani Kimihiro
    • Journal Title

      Sensors

      Volume: 22 Issue: 2 Pages: 611-611

    • DOI

      10.3390/s22020611

    • Related Report
      2021 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] A novel distributed deep learning training scheme based on distributed skip mesh list (2021)

    • Author(s)
      Suzuki Masaya, Mizutani Kimihiro
    • Journal Title

      IEICE Communications Express

      Volume: 10 Issue: 8 Pages: 463-468

    • DOI

      10.1587/comex.2021ETL0023

    • NAID

      130008070802

    • ISSN
      2187-0136
    • Year and Date
      2021-08-01
    • Related Report
      2021 Research-status Report 2020 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] A scheme of estimating mobile traffic data without coarse-grained process using conditional SR-GAN (2021)

    • Author(s)
      Tokunaga Tomoki, Mizutani Kimihiro
    • Journal Title

      IEICE Communications Express

      Volume: 10 Issue: 8 Pages: 441-446

    • DOI

      10.1587/comex.2021ETL0017

    • NAID

      130008070791

    • ISSN
      2187-0136
    • Year and Date
      2021-08-01
    • Related Report
      2021 Research-status Report 2020 Research-status Report
    • Peer Reviewed / Open Access
  • [Journal Article] Stateless Node Failure Information Propagation Scheme for Stable Overlay Networks (2021)

    • Author(s)
      Mizutani Kimihiro
    • Journal Title

      IEEE Access

      Volume: 9 Pages: 88737-88745

    • DOI

      10.1109/access.2021.3090028

    • Related Report
      2021 Research-status Report
    • Peer Reviewed / Open Access
  • [Presentation] An Efficient Approach for Training Time Minimization in Distributed Split Neural Network (2023)

    • Author(s)
      Eigo Yamamoto and Kimihiro Mizutani
    • Organizer
      IEEE GCCE
    • Related Report
      2023 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Accurate Mobile Traffic Generation Scheme without Coarse-grained Data Using Conditional SR-GAN (2020)

    • Author(s)
      Tomoki Tokunaga, Kimihiro Mizutani
    • Organizer
      ICETC 2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
  • [Presentation] A Novel Distributed Deep Learning Training Scheme Based on Distributed Skip Mesh List (2020)

    • Author(s)
      Masaya Suzuki, Kimihiro Mizutani
    • Organizer
      ICETC 2020
    • Related Report
      2020 Research-status Report
    • Int'l Joint Research
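  • [Presentation] Compression and Restoration of Mobile Traffic Data Using Conditional SR-GAN (2020)

    • Author(s)
      Tomoki Tokunaga, Kimihiro Mizutani
    • Organizer
      Kansai-section Joint Convention of Institutes of Electrical Engineering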
    • Related Report
      2020 Research-status Report
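  • [Presentation] A Persistent Management Method for Large-Scale Neural Networks Using Distributed Skip Mesh List (2020)

    • Author(s)
      Masaya Suzuki, Kimihiro Mizutani
    • Organizer
      Kansai-section Joint Convention of Institutes of Electrical Engineering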
    • Related Report
      2020 Research-status Report


Published: 2020-04-28   Modified: 2025-01-30  
