2019 Fiscal Year Annual Research Report
ExaPath: Hierarchical Routing for Next-Gen Supercomputers and Beyond
Project/Area Number | 19H04119 |
Research Institution | Institute of Physical and Chemical Research |
Principal Investigator | Jens Domke, RIKEN (National Research and Development Agency), Center for Computational Science, Special Researcher (70815480) |
Project Period (FY) | 2019-04-01 – 2024-03-31 |
Keywords | HPC interconnects |
Outline of Annual Research Achievements |
We achieved a milestone in HPC interconnects by building a large-scale proof-of-concept of the HyperX interconnection network. The HyperX topology was designed to reduce the point-to-point latency and cost of state-of-the-art fat-tree networks, and our real-life HyperX prototype demonstrated that HyperX topologies are a viable alternative to fat-trees even without adaptive routing. Our novel Pattern-Aware Routing for HyperX (PARX) circumvents the bottleneck that arises when shortest-path, static routing is applied to a HyperX. We submitted two publications, a short version to the 26th Symposium on High-Performance Interconnects (HOTI 26) and a full paper to the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '19), which introduce this routing algorithm; it is tailored to, but not limited to, the HyperX topology. Furthermore, we collaborated with researchers at ETH Zurich to develop a routing for Slim Fly networks. This resulted in a Bachelor thesis titled "Design and Implementation of Multipath Switching in InfiniBand Slimfly Networks", which developed a multipath routing algorithm that builds on the principles underlying PARX, is adapted to Slim Fly, and can switch between a minimal and a non-minimal path depending on the message size. Finally, we disseminated our research findings through invited talks, internally via the RIKEN R-CCS Cafe and externally via the High Performance Consortium for Advanced Scientific and Technical Computing (HP-CAST 32).
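The ideas summarized above can be illustrated with a small sketch. It is not the PARX algorithm or the thesis implementation, only a toy model under stated assumptions: switches of a regular HyperX are modeled as coordinate tuples that are directly linked whenever they differ in exactly one coordinate, a minimal route corrects one coordinate per hop, and a non-minimal route detours through an intermediate switch. The function names, the `via` switch, the 4096-byte `threshold`, and the choice to send large messages over the detour are all hypothetical and chosen for illustration.

```python
from itertools import product


def hyperx_switches(shape):
    """Return all switch coordinates for the given per-dimension sizes."""
    return list(product(*(range(size) for size in shape)))


def are_linked(a, b):
    """Two switches share a link iff they differ in exactly one coordinate."""
    return sum(x != y for x, y in zip(a, b)) == 1


def minimal_route(src, dst):
    """Dimension-order minimal route: correct one differing coordinate per hop."""
    path, cur = [tuple(src)], list(src)
    for dim, target in enumerate(dst):
        if cur[dim] != target:
            cur[dim] = target
            path.append(tuple(cur))
    return path


def detour_route(src, dst, via):
    """Non-minimal route that passes through the intermediate switch `via`."""
    return minimal_route(src, via) + minimal_route(via, dst)[1:]


def choose_route(src, dst, via, msg_bytes, threshold=4096):
    """Hypothetical policy: minimal path below the threshold, detour otherwise."""
    if msg_bytes < threshold:
        return minimal_route(src, dst)
    return detour_route(src, dst, via)


if __name__ == "__main__":
    # Toy 3x3 HyperX; sizes, endpoints and message sizes are illustrative only.
    small = choose_route((0, 0), (2, 1), via=(1, 2), msg_bytes=512)
    large = choose_route((0, 0), (2, 1), via=(1, 2), msg_bytes=1 << 20)
    print(len(small) - 1, "hops:", small)
    print(len(large) - 1, "hops:", large)
```

In this toy 3x3 network the small message takes 2 hops and the large one takes 4. The static minimal route always uses the same links for a given source and destination, which is the kind of concentration that a pattern-aware or multipath scheme tries to avoid.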
|
Current Status of Research Progress |
3: Progress in research has been slightly delayed.
Reason
Minor administrative barriers prevented us from hiring a qualified undergraduate student from a domestic university to assist with the research, and hence the initial research plan was slightly delayed.
|
Strategy for Future Research Activity |
Future research will largely follow the plan outlined in the project proposal. We will seek further international and domestic collaborations to develop a suitable HPC routing library that can, ideally, be interfaced with the OpenFabrics Management Framework (OFMF) and other interconnect management frameworks. We also plan to develop new routing algorithms for current and future HPC installations, both on our own and through collaborations.
|
Research Products
(7 results)