2017 Fiscal Year Annual Research Report

A Study on Methods for Fast Simulation of Irregular and Complex Objects Represented as Large-Scale Graphs

Research Project

Project/Area Number	15H01687
Research Institution	Osaka University
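Principal Investigator	Kenichi Hagihara (萩原兼一), Osaka University, Graduate School of Information Science and Technology, Specially Appointed Professor (full-time) (00133140)
Co-Investigator(Kenkyū-buntansha)	Fumihiko Ino (伊野文彦), Osaka University, Graduate School of Information Science and Technology, Professor (90346172); Masao Okita (置田真生), Osaka University, Graduate School of Information Science and Technology, Assistant Professor (50563988)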
Project Period (FY)	2015-04-01 – 2020-03-31
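Keywords	Ultra-high-speed information processing / Algorithms / Physiological function simulator / Automatic parallelization / Automatic program generation / Load balancing / Vector processing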
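Outline of Annual Research Achievements	For the executable code that the physiological function simulator Flint generates for the vector supercomputer SX-ACE (hereafter, SX-ACE code), we improved memory reference efficiency from an SX-ACE-specific perspective. To maximize vectorization efficiency for programs that contain irregular reference patterns, Flint generates SX-ACE code that makes heavy use of indirect array references. Indirect references can be vectorized by making the data dependencies explicit, but their memory reference cost is higher than that of vectorized direct references. We therefore proposed a method that analyzes the reference patterns and automatically eliminates unnecessary indirect references. We further optimized the data layout and the reference order within vectorized loops so that fewer indirect references are needed. For SX-ACE code generated from a membrane potential model of cardiac myocytes, the proposed method raised the achieved fraction of peak memory bandwidth on SX-ACE from 15% to 20% and reduced execution time by up to 25%. In addition, we extended the simulation acceleration method proposed in the previous fiscal year, which reorders the execution of computational expressions, and proposed a method that generates code with up to 1.25 times higher performance. Specifically, cache utilization was improved by adjusting the execution order at a finer granularity based on the results of memory reference pattern analysis. This method is effective regardless of the execution architecture and improved performance in both CPU and GPU execution. On an Intel Xeon CPU, it reduced the number of cache misses to 66-85% of the original count and achieved a speedup of up to 1.17x. In parallel execution on an NVIDIA Tesla V100 (640 Tensor cores), the latest GPU at the time, it reduced the number of cache misses to roughly 90% of the original count and achieved a speedup of up to 1.25x.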
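The idea of removing unnecessary indirect references can be illustrated with a minimal sketch. This is not Flint's code generator or its actual analysis; the function names (`contiguous_base`, `gather`, `load`) and the sample data are hypothetical and chosen only for illustration. The sketch assumes the simplest possible reference pattern analysis: if an index array turns out to be a contiguous run, the indirect reference `x[idx[i]]` can be replaced by a direct, unit-stride reference.

```python
from typing import List, Optional, Sequence


def contiguous_base(idx: Sequence[int]) -> Optional[int]:
    """Return b if idx == [b, b+1, ..., b+n-1], otherwise None."""
    if not idx:
        return None
    base = idx[0]
    for offset, value in enumerate(idx):
        if value != base + offset:
            return None
    return base


def gather(x: Sequence[float], idx: Sequence[int]) -> List[float]:
    """Indirect reference: one index lookup per element (x[idx[i]])."""
    return [x[i] for i in idx]


def load(x: Sequence[float], idx: Sequence[int]) -> List[float]:
    """Use a direct reference when the index pattern allows it."""
    base = contiguous_base(idx)
    if base is not None and base >= 0 and base + len(idx) <= len(x):
        # Contiguous pattern: the index array is not needed at all.
        return list(x[base:base + len(idx)])
    # Irregular pattern: the indirect reference has to stay.
    return gather(x, idx)


# Hypothetical sample data.
x = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
regular = [2, 3, 4]    # contiguous, equivalent to x[2:5]
irregular = [5, 0, 3]  # irregular, must remain a gather
```

In this sketch, `load(x, regular)` takes the direct path and `load(x, irregular)` falls back to the gather; both return the same values a plain gather would. Reordering the data so that more index arrays become contiguous is one way to read the data layout optimization described above, although the report does not specify the procedure.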
Current Status of Research Progress	2: Research has progressed on the whole more than it was originally planned. Reason: The work described in the research implementation plan of the FY2017 (Heisei 29) grant application was carried out as reported in the Outline of Annual Research Achievements.
Strategy for Future Research Activity	We expect to be able to carry out the research as described in the research plan and methods section of the research proposal.

Research Products
(9 results)


Journal Article (2 results) (of which Peer Reviewed: 2 results) Presentation (6 results) (of which Int'l Joint Research: 6 results) Remarks (1 result)

[Journal Article] PACC: A Directive-based Programming Framework for Out-of-Core Stencil Computation on Accelerators (2018)
- Author(s)
  Nobuhiro Miki, Fumihiko Ino, and Kenichi Hagihara
- Journal Title
  
  International Journal of High Performance Computing and Networking
  
  Volume: in press Pages: in press
- Peer Reviewed
[Journal Article] Parallelizing Exact and Approximate String Matching via Inclusive Scan on a GPU (2017)
- Author(s)
  Yasuaki Mitani, Fumihiko Ino, and Kenichi Hagihara
- Journal Title
  
  IEEE Transactions on Parallel and Distributed Systems
  
  Volume: 28 Pages: 1989-2002
- DOI
  10.1109/TPDS.2016.2645222
- Peer Reviewed
[Presentation] An Automated Method for Generating Training Sets for Deep Learning Based Image Registration (2018)
- Author(s)
  Masato Ito and Fumihiko Ino
- Organizer
  the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018)
- Int'l Joint Research
[Presentation] Transparent Avoidance of Redundant Data Transfer on GPU-enabled Apache Spark (2018)
- Author(s)
  Ryo Asai, Masao Okita, Fumihiko Ino, and Kenichi Hagihara
- Organizer
  the 11th Workshop on General Purpose Processing Using GPU (GPGPU 2018)
- Int'l Joint Research
[Presentation] RLAGPU: High-performance Out-of-Core Randomized Singular Value Decomposition on GPU (2017)
- Author(s)
  Yuechao Lu, Fumihiko Ino, Yasuyuki Matsushita, and Kenichi Hagihara
- Organizer
  the 8th GPU Technology Conference (GTC 2017)
- Int'l Joint Research
[Presentation] cuShiftOr: String Matching with Prefix Summing on a GPU (2017)
- Author(s)
  Fumihiko Ino, Yasuaki Mitani, and Kenichi Hagihara
- Organizer
  the 8th GPU Technology Conference (GTC 2017)
- Int'l Joint Research
[Presentation] An Out-of-Core Branch and Bound for Solving the 0-1 Knapsack Problem on a GPU (2017)
- Author(s)
  Jingcheng Shen, Kentaro Shigeoka, Fumihiko Ino, and Kenichi Hagihara
- Organizer
  the 17th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2017)
- Int'l Joint Research
[Presentation] Accelerating Scoring Computation of Smith-Waterman Algorithm with Mixed Word Length (2017)
- Author(s)
  Kazuki Yasui and Fumihiko Ino
- Organizer
  the 4th International Workshop on High Performance Computing on Bioinformatics (HPCB 2017)
- Int'l Joint Research
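[Remarks] Osaka University, Graduate School of Information Science and Technology, Parallel Processing Engineering Laboratory (並列処理工学講座)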
- URL
  http://www-ppl.ist.osaka-u.ac.jp/
