2016 Fiscal Year Final Research Report

Fault tolerant computing based on a multi-SPMD programming/execution environment

Research Project

Project/Area Number 26730064
Research Category

Grant-in-Aid for Young Scientists (B)

Allocation Type Multi-year Fund
Research Field High performance computing
Research Institution Institute of Physical and Chemical Research

Principal Investigator

Tsuji Miwako  RIKEN (National Research and Development Agency), Advanced Institute for Computational Science, Research Scientist (80466466)

Project Period (FY) 2014-04-01 – 2017-03-31
Keywords fault tolerance / programming model
Outline of Final Research Achievements

In this research, we added fault tolerance features to a multi-SPMD programming/execution environment, in which the tasks of a workflow are executed in parallel on distributed resources. To achieve scalability, the programming environment combines multiple programming methodologies across multiple architectural levels, such as NUMA core groups within a node, nodes within a cluster, and clusters of clusters. To provide fault tolerance and resilience without any modification of the application's source code, we developed middleware that detects errors in remote programs and extended the workflow scheduler to make workflow execution resilient to faults.
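
The idea of handling faults in the scheduler rather than in the application can be illustrated with a minimal sketch. This summary does not describe the project's actual middleware or scheduler, so everything below is an assumption made for illustration: `RemoteTaskError` stands in for an error reported by the detection middleware, `run_workflow` is a hypothetical scheduler that simply re-executes a failed task, and `make_flaky` simulates a remote program that fails a fixed number of times. The task functions themselves contain no fault-handling code.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class RemoteTaskError(Exception):
    """Hypothetical signal that the middleware detected an error in a remote task."""


def run_workflow(tasks, deps, max_retries=2):
    """Run tasks in dependency order, re-executing a task when it fails.

    tasks: dict mapping task name -> callable taking its dependencies' results
    deps:  dict mapping task name -> list of task names it depends on
    Returns (results, attempts), both keyed by task name.
    """
    graph = {name: deps.get(name, ()) for name in tasks}
    results, attempts = {}, {}
    # static_order() yields every task after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        inputs = [results[d] for d in graph[name]]
        for attempt in range(1, max_retries + 2):
            attempts[name] = attempt
            try:
                results[name] = tasks[name](*inputs)
                break  # task succeeded, move on to the next one
            except RemoteTaskError:
                if attempt == max_retries + 1:
                    raise  # retries exhausted, give up on the workflow


    return results, attempts


def make_flaky(fn, failures):
    """Wrap fn so that its first `failures` calls raise RemoteTaskError (simulation only)."""
    state = {"left": failures}

    def wrapped(*args):
        if state["left"] > 0:
            state["left"] -= 1
            raise RemoteTaskError("simulated remote failure")
        return fn(*args)

    return wrapped


# A three-task workflow in which the middle task fails once before succeeding.
tasks = {
    "load": lambda: [1, 2, 3],
    "square": make_flaky(lambda xs: [x * x for x in xs], failures=1),
    "total": lambda xs: sum(xs),
}
deps = {"square": ["load"], "total": ["square"]}

results, attempts = run_workflow(tasks, deps)
print(results["total"], attempts)
```

In this sketch the failure of `square` is absorbed by the scheduler: the task is run a second time and `total` still receives its input. Re-execution is only one possible recovery policy and is used here because it is the simplest to show.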

Free Research Field

High performance computing

Published: 2018-03-22  
