2018 Fiscal Year Final Research Report
Research on Integration of Communication and Computation by Tightly Coupled Accelerators
Project/Area Number |
15K00166
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Multi-year Fund |
Section | 一般 |
Research Field |
High performance computing
|
Research Institution | The University of Tokyo |
Principal Investigator |
|
Project Period (FY) |
2015-04-01 – 2019-03-31
|
Keywords | FPGA / 演算加速装置 / PCI Express / OpenCL / OpenACC |
Outline of Final Research Achievements |
Tightly Coupled Accelerators (TCA) architecture, which realizes direct communication among accelerators such as GPUs, is effective for improving strong-scaling performance thanks to low-latency of TCA. In the present study, the feasibility study was performed for the purpose of realization of highly efficient computation by fusion of fast communication using TCA and FPGA computation. Several kernels including numerical algorithms were described for accelerator by OpenCL, and higher performance could be achieved by further modification toward highly pipelined manner. Automatic conversion from OpenACC to OpenCL was also investigated. However, since drastic modification is required from traditional description manner, it is considered that automatic optimization is too complicated.
|
Free Research Field |
高性能計算
|
Academic Significance and Societal Importance of the Research Achievements |
演算加速器向けのプログラミング言語であるOpenCLを用いたFPGA実装がFeasibleであることを示した。しかしGPUのようにデータ並列の記述では性能が得られず、FPGAのアーキテクチャを考慮し記述する必要がある。OpenCLのカーネルを分割し、パイプライン方式での制御に変更することで行列積については高い性能が得られた。 また、通常のソフトウェア最適化技術と逆行する、冗長な記述や、ループ中での分岐などがFPGAで有効である。 今後に向けた最新インタフェースとして、CPUとキャッシュ一貫性を持つFPGA接続、3次元積層メモリに関して性能確認を行い、現状の各5倍、30倍程度のバンド幅が期待できる。
|