Research Project/Area Number | 21K17751
Research Category | Grant-in-Aid for Early-Career Scientists
Allocation Type | Multi-year Fund
Review Section | Basic Section 60090: High performance computing-related
Research Institution | National Institute of Advanced Industrial Science and Technology (AIST)
Principal Investigator | Nguyen Truong, National Institute of Advanced Industrial Science and Technology, Department of Information Technology and Human Factors, Researcher (60835346)
Project Period (FY) | 2021-04-01 – 2024-03-31
Project Status | Granted (FY2022)
Budget Amount *Note | ¥4,680,000 (Direct Cost: ¥3,600,000; Indirect Cost: ¥1,080,000)
  FY2023: ¥1,430,000 (Direct Cost: ¥1,100,000; Indirect Cost: ¥330,000)
  FY2022: ¥1,820,000 (Direct Cost: ¥1,400,000; Indirect Cost: ¥420,000)
  FY2021: ¥1,430,000 (Direct Cost: ¥1,100,000; Indirect Cost: ¥330,000)
Keywords | Deep Learning / Large Scale / Distributed Computing / Non-IID / Hybrid parallelism

Outline of Research at the Start |
This proposal aims to find techniques that speed up the training and inference of distributed deep learning. The project includes several research topics: (1) hybrid-parallelism design: (1.1) study the limitations of different parallelism strategies and (1.2) find novel fine-grained hybrid parallelism strategies for each type of application; (2) methods to reduce communication time by (2.1) optimizing the communication mechanism for each type of supercomputer network architecture and (2.2) studying methods to reduce network contention.
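
As a rough illustration of the "fine-grained hybrid parallelism" mentioned above, the following minimal, CPU-only Python sketch (not the project's code; the sizes and two-way splits are hypothetical) shows how one linear layer's work can be partitioned along both the batch dimension (data parallelism) and the weight dimension (model parallelism) without changing the result:

    import torch

    torch.manual_seed(0)
    batch, d_in, d_out = 8, 16, 12          # hypothetical sizes
    x = torch.randn(batch, d_in)            # one mini-batch of activations
    w = torch.randn(d_in, d_out)            # weights of a single linear layer

    # Reference: the full, unpartitioned forward pass.
    y_ref = x @ w

    # Data parallelism: split the mini-batch across 2 "nodes".
    x_shards = torch.chunk(x, 2, dim=0)
    # Model (tensor) parallelism: split the weight columns across 2 "devices" per node.
    w_shards = torch.chunk(w, 2, dim=1)

    # Each (node, device) pair works only on its shard of data and its shard of
    # the layer; concatenating the partial outputs recovers the full result.
    y_hybrid = torch.cat(
        [torch.cat([xs @ ws for ws in w_shards], dim=1) for xs in x_shards],
        dim=0,
    )
    print(torch.allclose(y_ref, y_hybrid))  # True: partitioning only redistributes the work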

Outline of Annual Research Achievements |
This year, we developed new methods that reduce computing time by eliminating non-important samples during training (submitted to ICML 2023). In our previous work (IPDPS 2022), we found that local shuffling cannot achieve good accuracy in large-scale training because of non-IID data and overfitting. We address the non-IID issue by dynamically assigning an impact factor to the models from different workers, and we use knowledge distillation to deal with overfitting; this work was a Best Paper Award finalist at CCGRID 2023. We also studied how to reduce communication time through a co-design of the collective communication algorithm with the intra-node network architecture (accepted in JPDC, a Q1 journal) and with the inter-node network architecture (poster at HPCA-Asia 2023).
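
The short Python sketch below illustrates, in a generic way, what eliminating non-important samples during training can look like. It is not the submitted method; the loss-based importance score and the keep_ratio value are assumptions made only for this example:

    import torch
    import torch.nn as nn

    # Hypothetical small model and pruning level, chosen only for this sketch.
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = nn.CrossEntropyLoss(reduction="none")   # keep per-sample losses
    keep_ratio = 0.7

    for step in range(5):
        x = torch.randn(128, 32)                  # stand-in mini-batch
        y = torch.randint(0, 10, (128,))

        per_sample_loss = criterion(model(x), y)  # one forward pass, one loss per sample
        k = max(1, int(keep_ratio * x.size(0)))
        # "Importance" here is simply the current loss: low-loss (easy) samples are
        # dropped, so they contribute nothing to the gradient. A real pipeline would
        # also skip them in later passes to actually save compute.
        _, keep_idx = torch.topk(per_sample_loss.detach(), k)

        optimizer.zero_grad()
        loss = per_sample_loss[keep_idx].mean()
        loss.backward()
        optimizer.step()
        print(f"step {step}: kept {k}/128 samples, loss {loss.item():.3f}")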

Current Status of Research Progress (Section) |
1: Research has progressed more than it was originally planned.

Reason |
We have expanded our international collaborative research with Telecom SudParis (France), Hanoi University of Science and Technology (HUST, Vietnam), and the VinUni-Illinois Smart Health Center at VinUniversity (Vietnam). The CCGRID 2023 paper (with the PI as corresponding author) was selected as a Best Paper Award finalist (top 4 of 58 accepted papers, out of more than 275 submissions). In the ICML 2023 submission, empirical results on various large-scale datasets and models for image classification and segmentation show that, while the with-replacement importance sampling algorithm performs poorly on large datasets, our method reduces total training time by up to 22% while affecting accuracy by only 0.4% compared to the baseline.

Strategy for Future Research Activity |
We will continue to investigate (1) extending the I/O work to reduce the overhead of partial local shuffling at scale, and (2) extending the methods that reduce computing time by eliminating non-important samples during training. We will also study (3) methods to reduce communication time by overlapping communication with computation.
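
As a rough illustration of direction (3), the sketch below overlaps an asynchronous gradient all-reduce with independent local computation using torch.distributed. The setup (gloo backend, torchrun launch, tensor sizes) is assumed for the example only and is not the project's implementation:

    import torch
    import torch.distributed as dist

    # Launch with, e.g.:  torchrun --nproc_per_node=2 overlap_sketch.py
    dist.init_process_group(backend="gloo")      # CPU-friendly backend for the sketch

    grad_a = torch.randn(1_000_000)  # a gradient that is already computed
    grad_b = torch.randn(1_000_000)  # data that still needs local computation

    # 1) Start the all-reduce of grad_a without blocking.
    work = dist.all_reduce(grad_a, op=dist.ReduceOp.SUM, async_op=True)

    # 2) Overlap: do useful local computation while the all-reduce is in flight.
    grad_b = torch.tanh(grad_b) * 0.5

    # 3) Block only at the point where the reduced gradient is actually needed.
    work.wait()
    grad_a /= dist.get_world_size()

    if dist.get_rank() == 0:
        print("overlapped all-reduce finished:", grad_a.shape, grad_b.shape)
    dist.destroy_process_group()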