Outline of Annual Research Achievements |
In FY2020, the second year of the ExaPath project, we conducted two distinct studies on routing in HPC interconnects. The first paper published this FY is a survey of data center and supercomputer networks. It investigates how multipathing is implemented in those systems, what type of routing they deploy, and how effectively they utilize the available paths under heavy communication loads. The survey, titled "High-Performance Routing with Multipathing and Path Diversity in Supercomputers and Data Centers", was published in the journal IEEE Transactions on Parallel and Distributed Systems. The second published work, a peer-reviewed poster presented at the 3rd R-CCS International Symposium, is based on the Bachelor's thesis of our intern from Tokyo Tech. The thesis and poster address the fault resiliency of lossless interconnects and how to reroute the network while preserving certain properties, such as deadlock-freedom. Furthermore, we collaborated with researchers at ETH Zurich to build a physical Slim Fly testbed and deploy the routing we developed in the previous FY. In parallel, we co-supervised, with a colleague from ETH, a second Bachelor's thesis on routing in low-diameter topologies. Lastly, we disseminated our research findings through invited talks at the ISC High Performance conference (ISC'20) in a focus session on 'Photonics & Interconnects', and discussed our work and related routing and network topics with colleagues from academia and industry at various meetings and conferences.
|
Current Status of Research Progress (Category) |
3: Slightly delayed
Reason
The project is slightly behind the original plan because the COVID-19 pandemic caused major disruptions in the research community and to conference schedules. As a result, there were fewer opportunities than expected to seek new collaborators and to discuss and disseminate our research findings.
|
Strategy for Future Research Activity |
The future direction of the research will largely follow the plan outlined in the project proposal. We will try to establish more international and domestic collaborations to develop a suitable HPC routing library, which we hope can be interfaced with the OpenFabrics Management Framework (OFMF) and other interconnect management frameworks. We also plan to develop new routing algorithms for current and future HPC installations, both on our own and by assisting collaborators in their development.
|