Linear Solvers for Machine Learning Hardware

Research Project

Project/Area Number 18H03248
Research Category

Grant-in-Aid for Scientific Research (B)

Allocation Type Single-year Grants
Section General
Review Section Basic Section 60090: High performance computing-related
Research Institution Tokyo Institute of Technology

Principal Investigator

Yokota Rio  Tokyo Institute of Technology, Global Scientific Information and Computing Center, Associate Professor (20760573)

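Co-Investigator (Kenkyū-buntansha) Ohshima Satoshi  Nagoya University, Information Technology Center, Associate Professor (40570081)
Ida Akihiro  The University of Tokyo, Information Technology Center, Project Associate Professor (80742121)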
Project Period (FY) 2018-04-01 – 2021-03-31
Project Status Completed (Fiscal Year 2020)
Budget Amount
¥16,900,000 (Direct Cost: ¥13,000,000, Indirect Cost: ¥3,900,000)
Fiscal Year 2020: ¥4,030,000 (Direct Cost: ¥3,100,000, Indirect Cost: ¥930,000)
Fiscal Year 2019: ¥4,160,000 (Direct Cost: ¥3,200,000, Indirect Cost: ¥960,000)
Fiscal Year 2018: ¥8,710,000 (Direct Cost: ¥6,700,000, Indirect Cost: ¥2,010,000)
Keywords machine learning processors / hierarchical low-rank approximation / TensorCore / high-performance computing / H-matrix / low-precision arithmetic / Tensor Core / FPGA / machine learning hardware
Outline of Final Research Achievements

The trend in computer architecture has shifted from general-purpose accelerators to hardware specialized for machine learning. This work focuses on the affinity between hierarchical low-rank approximation methods and the low-precision arithmetic units and tensor product accelerators found in machine learning processors, with the aim of developing a linear algebra library suited to future architectures. In FY2018, we ported our H-matrix library to batched MAGMA operations in order to take advantage of the tensor product accelerators. In FY2019, we optimized the inner kernels of the H-matrix library using TensorCores. In FY2020, we extended this work to recover the accuracy lost when using TensorCores and measured the energy efficiency.
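
The outline does not say how the FY2020 work recovers accuracy. Purely as an illustration, one widely used idea is to split each operand into a half-precision value plus a half-precision residual, then add the cross terms back as a correction. The sketch below emulates that idea in pure Python. All helper names (`to_fp16`, `split_fp16`, `matmul_fp16_corrected`, and so on) are hypothetical, and accumulation in ordinary Python floats is an assumption standing in for the wider accumulator of a real matrix unit.

```python
import random
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def split_fp16(mat):
    """Split each entry x into hi = fp16(x) and lo = fp16(x - hi)."""
    hi = [[to_fp16(x) for x in row] for row in mat]
    lo = [[to_fp16(x - h) for x, h in zip(row, hrow)]
          for row, hrow in zip(mat, hi)]
    return hi, lo


def matmul(a, b):
    """Dense product; sums are accumulated in Python floats (assumption:
    this stands in for a higher-precision hardware accumulator)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]


def matmul_fp16(a, b):
    """Product with both operands rounded to half precision, no correction."""
    a_hi, _ = split_fp16(a)
    b_hi, _ = split_fp16(b)
    return matmul(a_hi, b_hi)


def matmul_fp16_corrected(a, b):
    """Half-precision product plus first-order residual cross terms."""
    a_hi, a_lo = split_fp16(a)
    b_hi, b_lo = split_fp16(b)
    main = matmul(a_hi, b_hi)
    corr1 = matmul(a_hi, b_lo)
    corr2 = matmul(a_lo, b_hi)
    # The lo*lo term is dropped because it is second order in the rounding error.
    return [[m + (c1 + c2) for m, c1, c2 in zip(rm, r1, r2)]
            for rm, r1, r2 in zip(main, corr1, corr2)]


def max_abs_error(c, ref):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(x - y) for rc, rr in zip(c, ref) for x, y in zip(rc, rr))


random.seed(0)
n = 8
a = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
b = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
ref = matmul(a, b)  # reference product on the unrounded inputs

err_plain = max_abs_error(matmul_fp16(a, b), ref)
err_corrected = max_abs_error(matmul_fp16_corrected(a, b), ref)
print(f"plain half-precision inputs: {err_plain:.2e}")
print(f"with residual correction:    {err_corrected:.2e}")
```

The corrected product costs three multiplications instead of one, so a scheme of this kind trades some throughput for accuracy. Whether that trade pays off on a given processor is the sort of question the energy-efficiency measurements mentioned above would address.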

Academic Significance and Societal Importance of the Research Achievements

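Recent computers are specialized to run artificial intelligence workloads at high speed, but running the scientific computations used in priority fields such as the environment, medicine, quantum science, and materials efficiently on such machines remains a major challenge. The method proposed in this research makes it possible to run computations from many fields besides artificial intelligence at high speed on next-generation computers. If the high-performance AI-specific machines that are about to be mass-produced can be put to general-purpose use, the fields of environment, medicine, quantum science, and materials are expected to advance further.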

Report

(4 results)
  • 2020 Annual Research Report, Final Research Report (PDF)
  • 2019 Annual Research Report
  • 2018 Annual Research Report
  • Research Products

    (34 results)

Int'l Joint Research (6 results) Journal Article (6 results) (of which Int'l Joint Research: 3 results, Peer Reviewed: 6 results, Open Access: 1 result) Presentation (21 results) (of which Int'l Joint Research: 20 results, Invited: 2 results) Remarks (1 result)

  • [Int'l Joint Research] Sandia National Laboratories/University of Tennessee (USA)

    • Related Report
      2020 Annual Research Report
  • [Int'l Joint Research] KAUST (Saudi Arabia)

    • Related Report
      2020 Annual Research Report
  • [Int'l Joint Research] Sandia National Laboratories (USA)

    • Related Report
      2019 Annual Research Report
  • [Int'l Joint Research] KAUST (Saudi Arabia)

    • Related Report
      2019 Annual Research Report
  • [Int'l Joint Research] University of Tennessee/Sandia National Laboratories (USA)

    • Related Report
      2018 Annual Research Report
  • [Int'l Joint Research] KAUST (Saudi Arabia)

    • Related Report
      2018 Annual Research Report
  • [Journal Article] Lattice H-matrices for Massively Parallel Micromagnetic Simulations of Current-induced Domain Wall Motion (2020)

    • Author(s)
      Akihiro Ida, Tadashi Ataka, Atsushi Furuya
    • Journal Title

      IEEE Transactions on Magnetics

      Volume: 56(4) Issue: 4 Pages: 1-4

    • DOI

      10.1109/tmag.2019.2959349

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed
  • [Journal Article] QR Factorization of Block Low-rank Matrices with Weak Admissibility Condition (2019)

    • Author(s)
      Akihiro Ida, Hiroshi Nakashima, Tasuku Hiraishi, Ichitaro Yamazaki, Rio Yokota, Takeshi Iwashita
    • Journal Title

      Journal of Information Processing

      Volume: 27 Issue: 0 Pages: 831-839

    • DOI

      10.2197/ipsjjip.27.831

    • NAID

      130007762323

    • ISSN
      1882-6652
    • Related Report
      2019 Annual Research Report
    • Peer Reviewed / Open Access / Int'l Joint Research
  • [Journal Article] Distributed Memory Lattice H-matrix Factorization (2019)

    • Author(s)
      Ichitaro Yamazaki, Akihiro Ida, Rio Yokota and Jack Dongarra
    • Journal Title

      The International Journal of High Performance Computing Applications

      Volume: 33(5) Issue: 5 Pages: 1046-1063

    • DOI

      10.1177/1094342019861139

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Extreme Scale FMM-Accelerated Boundary Integral Equation Solver for Wave Scattering (2019)

    • Author(s)
      Mustafa AbdulJabbar, Mohammed Al Farhan, Noha Al-Harthi, Rui Chen, Rio Yokota, Hakan Bagci, David Keyes
    • Journal Title

      SIAM Journal on Scientific Computing

      Volume: 41(3) Issue: 3 Pages: C245-C268

    • DOI

      10.1137/18m1173599

    • Related Report
      2019 Annual Research Report
    • Peer Reviewed / Int'l Joint Research
  • [Journal Article] Highly Productive, High-Performance Application Frameworks for Post-Petascale Computing (2018)

    • Author(s)
      N. Maruyama, T. Aoki, K. Taura, R. Yokota, M. Wahib, M. Matsuda, K. Fukuda, T. Shimokawabe, N. Onodera, M. Muller, S. Iwasaki
    • Journal Title

      Advanced Software Technologies for Post-Peta Scale Computing

      Volume: none Pages: 77-98

    • DOI

      10.1007/978-981-13-1924-2_5

    • ISBN
      9789811319235, 9789811319242
    • Related Report
      2018 Annual Research Report
    • Peer Reviewed
  • [Journal Article] Application of Hierarchical Matrices to Large-Scale Electromagnetic Field Analyses of Coils Wound With Coated Conductors (2018)

    • Author(s)
      Tominaga Naoki, Mifune Takeshi, Ida Akihiro, Sogabe Yusuke, Iwashita Takeshi, Amemiya Naoyuki
    • Journal Title

      IEEE Transactions on Applied Superconductivity

      Volume: 28 Issue: 3 Pages: 1-5

    • DOI

      10.1109/tasc.2017.2780821

    • Related Report
      2018 Annual Research Report
    • Peer Reviewed
  • [Presentation] Distributed Memory Task-Based Block Low Rank Direct Solver (2020)

    • Author(s)
      Sameer Deshmukh, Rio Yokota
    • Organizer
      ISC High Performance 2020 (Research Poster)
    • Related Report
      2020 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Randomized SVD on TensorCores (2020)

    • Author(s)
      Hiroyuki Ootomo, Rio Yokota
    • Organizer
      ISC High Performance 2020 (Research Poster)
    • Related Report
      2020 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis (2020)

    • Author(s)
      Rise Ooi, Takeshi Iwashita, Takeshi Fukaya, Akihiro Ida, Rio Yokota
    • Organizer
      HPC Asia 2020
    • Related Report
      2020 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Distributed Memory Task-Based Block Low Rank Direct Solver (2020)

    • Author(s)
      Sameer Deshmukh, Rio Yokota
    • Organizer
      HPC Asia 2020 (poster)
    • Related Report
      2020 Annual Research Report
    • Int'l Joint Research
  • [Presentation] QR Decomposition of Block Low-Rank Matrices (2020)

    • Author(s)
      Muhammad Ridwan Apriansyah, Rio Yokota
    • Organizer
      HPC Asia 2020 (poster)
    • Related Report
      2020 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Numerical Linear Algebra Based on Lattice H-Matrices (2020)

    • Author(s)
      Akihiro Ida, Ichitaro Yamazaki, Rio Yokota, Satoshi Ohshima, Tasuku Hiraishi, Takeshi Iwashita, Tetsuya Hoshino, and Toshihiro Hanawa
    • Organizer
      International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2020)
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Optimization of Numerous Small Dense-Matrix Vector Multiplications in H-matrix Arithmetic on GPU (2019)

    • Author(s)
      Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota
    • Organizer
      Auto-Tuning for Multicore and GPU (ATMG) In conjunction with the IEEE MCSoC-19
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Distributed Memory Task-Based Block Low Rank Direct Solver (2019)

    • Author(s)
      Sameer Deshmukh, Rio Yokota
    • Organizer
      HPC Asia 2020 (poster)
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] QR Decomposition of Block Low-Rank Matrices (2019)

    • Author(s)
      Muhammad Ridwan Apriansyah, Rio Yokota
    • Organizer
      HPC Asia 2020 (poster)
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Runtime System for GPU-based Hierarchical LU factorization (2019)

    • Author(s)
      Qianxing Ma, Rio Yokota
    • Organizer
      The International Conference for High Performance Computing, Networking, Storage, and Analysis (poster)
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] TSQR on TensorCores (2019)

    • Author(s)
      Hiroyuki Ootomo, Rio Yokota
    • Organizer
      The International Conference for High Performance Computing, Networking, Storage, and Analysis (best poster candidate)
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] TSQR Using Tensor Cores (2019)

    • Author(s)
      Hiroyuki Ootomo, Rio Yokota
    • Organizer
      Annual Meeting of the Japan Society for Industrial and Applied Mathematics (JSIAM)
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Application of the Fast Micromagnetic Simulation to Thin Spintronic Devices (2019)

    • Author(s)
      Tadashi Ataka, Akihiro Ida, Atsushi Furuya, Koichi Shimizu, Jun Fujisaki, Tomohiro Tanaka and Hirotaka Oshima
    • Organizer
      22nd International Conference on the Computation of Electromagnetic Fields
    • Related Report
      2019 Annual Research Report
    • Int'l Joint Research
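  • [Presentation] Batched QR Decomposition Using Tensor Cores (2019)

    • Author(s)
      Hiroyuki Ootomo, Rio Yokota
    • Organizer
      The 81st National Convention of the Information Processing Society of Japan (IPSJ)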
    • Related Report
      2018 Annual Research Report
  • [Presentation] Analyzing Performance of BiCGStab with Hierarchical Matrix on GPU clusters (2018)

    • Author(s)
      Ichitaro Yamazaki, Ahmad Abdelfattah, Akihiro Ida, Satoshi Ohshima, Stanimire Tomov, Rio Yokota, Jack Dongarra
    • Organizer
      32nd IEEE International Parallel & Distributed Processing Symposium
    • Related Report
      2018 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Optimization of Hierarchical Matrix Computation on GPU (2018)

    • Author(s)
      Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota
    • Organizer
      SC Asia
    • Related Report
      2018 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Accelerating Convolutional Neural Networks Using Low Precision Arithmetic (2018)

    • Author(s)
      Hiroki Naganuma, Rio Yokota
    • Organizer
      HPC Asia
    • Related Report
      2018 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Energy Conserving Fast Multipole Methods for the Calculation of Long-range Interactions (2018)

    • Author(s)
      Rio Yokota
    • Organizer
      Mathematics in Action: Modeling and analysis in molecular biology and electrophysiology
    • Related Report
      2018 Annual Research Report
    • Int'l Joint Research / Invited
  • [Presentation] Can we use Hierarchical Low-Rank Approximation for Deep Learning? (2018)

    • Author(s)
      Rio Yokota
    • Organizer
      HPC Saudi
    • Related Report
      2018 Annual Research Report
    • Int'l Joint Research / Invited
  • [Presentation] Design of Parallel BEM Analyses Framework for SIMD Processors (2018)

    • Author(s)
      Tetsuya Hoshino, Akihiro Ida, Toshihiro Hanawa, Kengo Nakajima
    • Organizer
      The International Conference on Computational Science
    • Related Report
      2018 Annual Research Report
    • Int'l Joint Research
  • [Presentation] Lattice H-Matrices on Distributed-Memory Systems (2018)

    • Author(s)
      Akihiro Ida
    • Organizer
      32nd IEEE International Parallel & Distributed Processing Symposium
    • Related Report
      2018 Annual Research Report
    • Int'l Joint Research
  • [Remarks] Yokota Laboratory webpage

    • URL

      https://www.rio.gsic.titech.ac.jp/jp/index.html

    • Related Report
      2019 Annual Research Report

Published: 2018-04-23   Modified: 2022-01-27  