2017 Fiscal Year Annual Research Report

Overcoming Tribrid Programming with Latency Hiding Oriented Description Model

Research Project

Project/Area Number	15K12008
Research Institution	Osaka University
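Principal Investigator	伊野文彦 (Fumihiko Ino), Osaka University, Graduate School of Information Science and Technology, Professor (90346172)
Co-Investigator (Kenkyū-buntansha)	水谷泰治, Osaka Institute of Technology, Faculty of Information Science and Technology, Associate Professor (10411414)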
Project Period (FY)	2015-04-01 – 2018-03-31
Keywords	High-performance computing / CUDA / MPI / GPU / Parallel processing
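Outline of Annual Research Achievements	The goal of this research is to extend the latency-hiding-oriented description model of the GPU (Graphics Processing Unit), a class of graphics hardware, to distributed-memory parallel computers and thereby reduce the programming effort they require. To this end, we aim to extend CUDA (Compute Unified Device Architecture), the most widely used integrated development environment for GPUs, to distributed-memory parallel computers such as clusters, so that massively parallel processing involving inter-node communication can be realized with CUDA descriptions alone. By the previous fiscal year, we had shown the advantage of a reuse approach, which runs kernel functions written for GPUs on a cluster, as a method for realizing distributed CUDA. In the final fiscal year, to evaluate the performance of the reuse approach, we developed a performance model based on the LogGPS model and the roofline model. The developed model predicts the performance of the whole cluster and mainly targets GPU-initiated communication, that is, inter-node communication invoked from a CUDA program running on the GPU. We evaluated a stencil computation that uses GPU-initiated communication on a four-machine GPU cluster interconnected by an InfiniBand network and confirmed that it can achieve a high execution efficiency of about 93% of peak performance. Achieving this efficiency, however, required overlapping communication with computation by means of a low-latency communication protocol such as InfiniBand Verbs. Using the performance model, we also clarified the task granularity, communication bandwidth, and communication latency needed for efficient overlap. In summary, realizing massively parallel processing with inter-node communication from latency-hiding-oriented program descriptions for GPUs requires an appropriate task granularity to be set both across nodes and within each GPU, and we confirmed that the performance model is useful for supporting that choice.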
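
As a rough illustration of the kind of reasoning such a performance model supports, the following Python sketch combines a roofline bound with a simple latency-plus-bandwidth message cost and compares execution with and without overlap of communication and computation. It is not the model developed in this project: the function names, the single-message cost (a strong simplification of LogGPS), and the perfect-overlap assumption are all hypothetical choices made for illustration.

```python
def roofline_gflops(peak_gflops, mem_bw_gbs, arithmetic_intensity):
    """Attainable GFLOP/s under the roofline model.

    arithmetic_intensity is in FLOP per byte of memory traffic.
    """
    return min(peak_gflops, mem_bw_gbs * arithmetic_intensity)


def message_time_s(nbytes, latency_s, bandwidth_bytes_per_s):
    """Cost of one message as latency + size / bandwidth.

    Assumption: this ignores the per-message overhead, gap, and
    synchronization terms that the full LogGPS model includes.
    """
    return latency_s + nbytes / bandwidth_bytes_per_s


def step_time_s(compute_s, comm_s, overlap):
    """Time of one step.

    With (idealized, perfect) overlap the longer of the two phases
    dominates; without overlap the two phases are serialized.
    """
    return max(compute_s, comm_s) if overlap else compute_s + comm_s


def efficiency(compute_s, comm_s, overlap):
    """Fraction of the step time spent on useful computation."""
    return compute_s / step_time_s(compute_s, comm_s, overlap)
```

In this simplified picture, efficiency stays at 1.0 under overlap as long as the communication time does not exceed the computation time, which is why the task granularity (which sets the computation time per step) and the network latency and bandwidth (which set the communication time) have to be chosen together.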

Research Products
(12 results)


Journal Article (2 results) (of which Peer Reviewed: 2 results) / Presentation (9 results) (of which Int'l Joint Research: 6 results) / Remarks (1 result)

[Journal Article] PACC: A Directive-based Programming Framework for Out-of-Core Stencil Computation on Accelerators (2018)
- Author(s)
  Nobuhiro Miki, Fumihiko Ino, and Kenichi Hagihara
- Journal Title
  International Journal of High Performance Computing and Networking
  Volume: in press Pages: in press
- Peer Reviewed
[Journal Article] Parallelizing Exact and Approximate String Matching via Inclusive Scan on a GPU (2017)
- Author(s)
  Yasuaki Mitani, Fumihiko Ino, and Kenichi Hagihara
- Journal Title
  IEEE Transactions on Parallel and Distributed Systems
  Volume: 28 Pages: 1989-2002
- DOI
  10.1109/TPDS.2016.2645222
- Peer Reviewed
[Presentation] An Automated Method for Generating Training Sets for Deep Learning Based Image Registration (2018)
- Author(s)
  Masato Ito and Fumihiko Ino
- Organizer
  11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018)
- Int'l Joint Research
[Presentation] Transparent Avoidance of Redundant Data Transfer on GPU-enabled Apache Spark (2018)
- Author(s)
  Ryo Asai, Masao Okita, Fumihiko Ino, and Kenichi Hagihara
- Organizer
  11th Workshop on General Purpose Processing Using GPU (GPGPU 2018)
- Int'l Joint Research
[Presentation] A Performance Model for Evaluating GPU-Driven Inter-node Communication (2018)
- Author(s)
  酒井亮太郎, 伊野文彦
- Organizer
  IEICE General Conference 2018 (電子情報通信学会2018総合大会)
[Presentation] A Comprehension Support Tool Using Step Execution for an Educational Parallel Programming Language (2018)
- Author(s)
  岸本優斗, 近藤一輝, 中川翔太, 水谷泰治
- Organizer
  JSiSE Student Research Presentation Meeting, Kansai Area (教育システム情報学会学生研究発表会(関西地区))
[Presentation] Applying an Educational Parallel Programming Language to the Learning of Parallel Programming (2018)
- Author(s)
  水谷泰治, 西口敏司, 橋本渉
- Organizer
  80th National Convention of IPSJ (情報処理学会第80回全国大会)
[Presentation] RLAGPU: High-performance Out-of-Core Randomized Singular Value Decomposition on GPU (2017)
- Author(s)
  Yuechao Lu, Fumihiko Ino, Yasuyuki Matsushita, and Kenichi Hagihara
- Organizer
  8th GPU Technology Conference (GTC 2017)
- Int'l Joint Research
[Presentation] cuShiftOr: String Matching with Prefix Summing on a GPU (2017)
- Author(s)
  Fumihiko Ino, Yasuaki Mitani, and Kenichi Hagihara
- Organizer
  8th GPU Technology Conference (GTC 2017)
- Int'l Joint Research
[Presentation] An Out-of-Core Branch and Bound for Solving the 0-1 Knapsack Problem on a GPU (2017)
- Author(s)
  Jingcheng Shen, Kentaro Shigeoka, Fumihiko Ino, and Kenichi Hagihara
- Organizer
  17th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2017)
- Int'l Joint Research
[Presentation] Accelerating Scoring Computation of Smith-Waterman Algorithm with Mixed Word Length (2017)
- Author(s)
  Kazuki Yasui and Fumihiko Ino
- Organizer
  4th International Workshop on High Performance Computing on Bioinformatics (HPCB 2017)
- Int'l Joint Research
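[Remarks] Parallel Processing Engineering Laboratory, Graduate School of Information Science and Technology, Osaka University (大阪大学大学院情報科学研究科並列処理工学講座)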
- URL
  http://www-ppl.ist.osaka-u.ac.jp/
