Development of accurate and reproducible matrix computation library for massively parallel environments
Project/Area Number | 19K20286 |
Research Category | Grant-in-Aid for Early-Career Scientists |
Allocation Type | Multi-year Fund |
Review Section | Basic Section 60100: Computational science-related |
Research Institution | Institute of Physical and Chemical Research |
Principal Investigator | Mukunoki Daichi, RIKEN, Center for Computational Science, Researcher (90742289) |
Project Period (FY) | 2019-04-01 – 2023-03-31 |
Project Status | Completed (Fiscal Year 2022) |
Budget Amount |
¥2,340,000 (Direct Cost: ¥1,800,000, Indirect Cost: ¥540,000)
Fiscal Year 2021: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2020: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
Fiscal Year 2019: ¥780,000 (Direct Cost: ¥600,000, Indirect Cost: ¥180,000)
|
Keywords | High accuracy / Reproducibility / Matrix computation / Sparse iterative methods / BLAS / Massively parallel / Floating-point arithmetic |
Outline of Research at the Start |
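Floating-point arithmetic, the main form of arithmetic used in scientific and engineering computation, works with a finite number of digits, so results can carry rounding errors relative to the true value. Floating-point addition is also not associative: when the order of operations changes with the computing environment, results can differ at the level of rounding error, and the same result cannot always be reproduced. In large-scale, complex numerical computations such as those run on supercomputers, these properties can become obstacles to ensuring reliability and to developing and maintaining software. This research develops software for matrix computation, the basic operation of scientific computing, that provides high accuracy and reproducibility while achieving high performance on the latest supercomputers.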
|
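The effect described in the outline above can be reproduced in a few lines. The following is a minimal sketch, not code from the project: the values `a`, `b`, `c` are chosen only for illustration, and Python's standard `math.fsum` is used as a stand-in for an accurate, order-independent summation.

```python
import math

# Illustrative binary64 values (assumptions, not taken from the project).
a, b, c = 1.0e16, -1.0e16, 1.0

# Same three operands, two evaluation orders.
left = (a + b) + c    # a + b cancels exactly, then c is added
right = a + (b + c)   # c is lost when b + c is rounded

print(left, right, left == right)

# math.fsum returns the correctly rounded sum of its inputs,
# so the result does not depend on the order of the operands.
orders = [[a, b, c], [c, b, a], [b, c, a]]
print([math.fsum(order) for order in orders])
```

On a parallel machine the evaluation order is typically decided by the number of threads or processes and by the reduction tree, which is why the same program can return results that differ in the last bits from one environment to another.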
Outline of Final Research Achievements |
In this study, we developed Basic Linear Algebra Subprograms (BLAS) routines for massively parallel architectures that are accurate and ensure reproducible results across different computing environments. Focusing mainly on the Ozaki scheme, we developed a high-performance implementation of accurate and reproducible BLAS routines and demonstrated its application to sparse iterative solvers on CPUs and GPUs. As further applications, we proposed an implementation of single- and double-precision matrix multiplication using low-precision arithmetic units (Tensor Cores), and an implementation of binary128 matrix multiplication using single- and double-precision matrix multiplication.
|
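The core idea behind the Ozaki scheme named above is error-free splitting: each input is cut into slices with so few significant bits that products and sums of slices incur no rounding error, whatever the summation order. The following is a simplified sketch of that idea for a dot product in binary64, not the project's BLAS implementation. The function names `split_error_free` and `sliced_dot` are hypothetical, overflow and underflow are assumed not to occur, and `math.fsum` is used for the final reduction as an assumption of this sketch.

```python
import math

def split_error_free(x, rho):
    """Split x into slices that sum exactly to x.

    Each slice element keeps roughly (53 - rho) significant bits,
    aligned to a common unit within the slice.
    """
    slices = []
    rest = list(x)
    while any(v != 0.0 for v in rest):
        mu = max(abs(v) for v in rest)
        _, e = math.frexp(mu)              # mu < 2**e
        sigma = math.ldexp(1.0, rho + e)   # sigma = 2**(rho + e)
        # Adding and subtracting sigma extracts the high-order bits.
        hi = [(v + sigma) - sigma for v in rest]
        # The remainder is computed without rounding error.
        rest = [v - h for v, h in zip(rest, hi)]
        slices.append(hi)
    return slices

def sliced_dot(x, y):
    """Dot product that is independent of the element order."""
    n = len(x)
    # rho = ceil((53 + ceil(log2(n + 1))) / 2), so that a sum of n
    # slice products still fits in the 53-bit significand.
    rho = (53 + n.bit_length() + 1) // 2
    xs = split_error_free(x, rho)
    ys = split_error_free(y, rho)
    # Every slice-by-slice dot product is exact in binary64.
    partials = [sum(a * b for a, b in zip(xp, yq)) for xp in xs for yq in ys]
    # Correctly rounded sum of the exact partial results.
    return math.fsum(partials)
```

Because every slice-by-slice product is exact, it can be computed by an ordinary, highly optimized routine in any order; only the final reduction over the small set of partial results needs care. In a matrix setting the same role would be played by standard matrix multiplication applied to slice matrices.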
Academic Significance and Societal Importance of the Research Achievements |
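We implemented BLAS routines for CPUs and GPUs that are highly accurate and produce reproducible results, and demonstrated their application to sparse solvers. The approach has advantages over existing methods in performance and ease of implementation, and it is expected to find use in applied mathematics. We also showed that low-precision arithmetic units designed for AI can be applied to single- and double-precision matrix computation, which may influence future hardware design.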
|
Report
(5 results)
Research Products
(38 results)
[Journal Article] Matrix Engines for High Performance Computing: A Paragon of Performance or Grasping at Straws? (2021)
Author(s)
Jens Domke, Emil Vatai, Aleksandr Drozd, Peng Chen, Yosuke Oyama, Lingqi Zhang, Shweta Salaria, Daichi Mukunoki, Artur Podobas, Mohamed Wahib, Satoshi Matsuoka
Journal Title
Proc. 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021)
Volume: -
Related Report
Peer Reviewed / Int'l Joint Research
[Presentation] DGEMM using Tensor Cores (2021)
Author(s)
Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura
Organizer
SIAM Conference on Computational Science and Engineering (CSE21)
Related Report
Int'l Joint Research
[Presentation] Optimizing Precision for High-Performance, Robust, and Energy-Efficient Computations (2020)
Author(s)
Roman Iakymchuk, Fabienne Jezequel, Stef Graillat, Daichi Mukunoki, Toshiyuki Imamura, Yiyu Tan, Atsushi Koshiba, Jens Huthmann, Kentaro Sano, Norihisa Fujita, Taisuke Boku
Organizer
HPC Asia 2020 (poster session)
Related Report
Int'l Joint Research
[Presentation] Minimal-Precision Computing for High-Performance, Energy-Efficient, and Reliable Computations (2019)
Author(s)
Daichi Mukunoki, Toshiyuki Imamura, Yiyu Tan, Atsushi Koshiba, Jens Huthmann, Kentaro Sano, Fabienne Jezequel, Stef Graillat, Roman Iakymchuk, Norihisa Fujita, Taisuke Boku
Organizer
SC19 (research poster session)
Related Report
Int'l Joint Research
[Presentation] Minimal-Precision Computing for High-Performance, Energy-Efficient, and Reliable Computations (2019)
Author(s)
Daichi Mukunoki, Toshiyuki Imamura, Yiyu Tan, Atsushi Koshiba, Jens Huthmann, Kentaro Sano, Fabienne Jezequel, Stef Graillat, Roman Iakymchuk, Norihisa Fujita, Taisuke Boku
Organizer
France-Japan-Germany trilateral workshop: Convergence of HPC and Data Science for Future Extreme Scale Intelligent Applications (poster presentation)
Related Report
Int'l Joint Research