2021 Fiscal Year Annual Research Report

Acceleration of large-scale deep learning by optimizing parallel I/O

Research Project

Project/Area Number	20K19811
Research Institution	Institute of Physical and Chemical Research
Principal Investigator	Sato Kento, RIKEN, Center for Computational Science, Team Leader (50739696)
Project Period (FY)	2020-04-01 – 2022-03-31
Keywords	High-performance computing / Deep learning / Fugaku / Arm / Tuning
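Outline of Annual Research Achievements	As the compute capacity and dataset sizes available for DNN training grow, fast distributed training on supercomputers is increasingly in demand. However, on systems such as Fugaku, where the software ecosystem is not yet established, software must be ported and tuned, which requires enormous development effort. We therefore tuned a deep learning framework, using software tuning for 3D-CNN model training as a case study. Specifically, we (1) tuned the deep learning compute kernels with a JIT translator for aarch64; (2) added support for hybrid training that combines data parallelism and model parallelism, so that CPU cores are used effectively in large-scale runs; (3) tuned I/O through data staging with data compression and a data loader with caching; and (4) improved convergence through tuning specific to the training model. Applying these techniques to the CosmoFlow 3D-CNN model, we completed training of 637 instances of the model in 8 hours 16 minutes, or about 1.29 models per minute. In the CosmoFlow (weak scaling) category of MLPerf HPC v1.0, this is a processing speed about 1.77 times that of the other systems, demonstrating world-class performance in large-scale scientific computing that uses machine learning. MLPerf HPC is a machine learning benchmark for evaluating system performance when time-consuming, large-scale machine learning computations are run on supercomputers, with the aim of producing a performance list of systems that run machine learning applications. It is used on supercomputers around the world and is expected to become a new industry standard.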
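As a rough illustration of item (3) in the outline above, the following Python sketch stages a dataset as compressed files and reads it through a loader that caches decompressed samples in memory. It is a minimal sketch under stated assumptions, not the project's implementation: the names `stage_compressed`, `CachingLoader`, and `demo`, the use of gzip and pickle, and the toy samples are all hypothetical and chosen only to keep the example self-contained.

```python
import gzip
import os
import pickle
import tempfile


def stage_compressed(samples, staging_dir):
    """Write each sample as a gzip-compressed file.

    Hypothetical stand-in for staging a dataset to node-local storage.
    """
    paths = []
    for i, sample in enumerate(samples):
        path = os.path.join(staging_dir, f"sample_{i}.pkl.gz")
        with gzip.open(path, "wb") as f:
            pickle.dump(sample, f)
        paths.append(path)
    return paths


class CachingLoader:
    """Decompress a sample on first access, then serve it from memory."""

    def __init__(self, paths):
        self.paths = paths
        self.cache = {}
        self.disk_reads = 0  # counts how often a file was actually read

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        if index not in self.cache:
            with gzip.open(self.paths[index], "rb") as f:
                self.cache[index] = pickle.load(f)
            self.disk_reads += 1
        return self.cache[index]


def demo():
    """Run three epochs over four toy samples and return the file read count."""
    samples = [[float(i)] * 1000 for i in range(4)]
    with tempfile.TemporaryDirectory() as staging_dir:
        loader = CachingLoader(stage_compressed(samples, staging_dir))
        for _epoch in range(3):
            for i in range(len(loader)):
                assert loader[i] == samples[i]
        return loader.disk_reads
```

In this sketch each file is decompressed only in the first epoch, and later epochs are served from the in-memory cache. Whether the cache lives in memory or on node-local storage in the actual work is not stated in the report.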

Research Products
(8 results)

All 2022 2021 Other

All Int'l Joint Research (2 results) Journal Article (4 results) (of which Int'l Joint Research: 3 results, Peer Reviewed: 4 results) Presentation (1 results) (of which Int'l Joint Research: 1 results) Remarks (1 results)

[Int'l Joint Research] Sun Yat-Sen University/Xi'an Univ. of Finance and Economics (China)
- Country Name
  CHINA
- Counterpart Institution
  Sun Yat-Sen University/Xi'an Univ. of Finance and Economics
[Int'l Joint Research] Lawrence Berkeley National Laboratory/Argonne National Laboratory/Oak Ridge National Laboratory (U.S.A.)
- Country Name
  U.S.A.
- Counterpart Institution
  Lawrence Berkeley National Laboratory/Argonne National Laboratory/Oak Ridge National Laboratory
[Journal Article] Social Media Driven Big Data Analysis for Disaster Situation Awareness: A Tutorial (2022)
- Author(s)
  Pal Amitangshu, Wang Junbo, Wu Yilang, Kant Krishna, Liu Zhi, Sato Kento
- Journal Title
  
  IEEE Transactions on Big Data
  
  Volume: - Pages: 1～1
- DOI
  10.1109/TBDATA.2022.3158431
- Peer Reviewed / Int'l Joint Research
[Journal Article] Semi-Synchronous Federated Learning Protocol with Dynamic Aggregation in Internet of Vehicles (2022)
- Author(s)
  Liang Feiyuan, Yang Qinglin, Liu Ruiqi, Wang Junbo, Sato Kento, Guo Jian
- Journal Title
  
  IEEE Transactions on Vehicular Technology
  
  Volume: - Pages: 1～1
- DOI
  10.1109/TVT.2022.3148872
- Peer Reviewed / Int'l Joint Research
[Journal Article] The 16,384-node Parallelism of 3D-CNN Training on An Arm CPU based Supercomputer (2021)
- Author(s)
  Tabuchi Akihiro, Shirahata Koichi, Yamazaki Masafumi, Kasagi Akihiko, Honda Takumi, Kurihara Kouji, Kawakami Kentaro, Tabaru Tsuguchika, Fukumoto Naoto, Kuroda Akiyoshi, Fukai Takaaki, Sato Kento
- Journal Title
  
  2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC)
  
  Volume: - Pages: 152-161
- DOI
  10.1109/HiPC53243.2021.00029
- Peer Reviewed
[Journal Article] MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems (2021)
- Author(s)
  Farrell Steven, Emani Murali, Balma Jacob, Drescher Lukas, Drozd Aleksandr, Fink Andreas, Fox Geoffrey, Kanter David, Kurth Thorsten, Mattson Peter, Mu Dawei, Ruhela Amit, Sato Kento, Shirahata Koichi, Tabaru Tsuguchika, et al.
- Journal Title
  
  2021 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC)
  
  Volume: - Pages: 33-45
- DOI
  10.1109/MLHPC54614.2021.00009
- Peer Reviewed / Int'l Joint Research
[Presentation] Measurement of I/O Performance on a Hierarchical File System for Distributed Deep Neural Network (2022)
- Author(s)
  Takaaki Fukai, Kento Sato
- Organizer
  The 4th R-CCS International Symposium (RCCS-IS4)
- Int'l Joint Research
[Remarks] High Performance Big Data Research Team
- URL
  https://www.hpbd.r-ccs.riken.jp
