1998 Fiscal Year Final Research Report Summary

Fault Tolerance of Loosely Coupled Parallel Computer by Program Translation

Research Project

Project/Area Number 09680332
Research Category

Grant-in-Aid for Scientific Research (C)

Allocation Type Single-year Grants
Section General
Research Field Computer Science
Research Institution TOKYO INSTITUTE OF TECHNOLOGY (1998)
Japan Advanced Institute of Science and Technology (1997)

Principal Investigator

YOKOTA Haruo  Tokyo Institute of Technology, Graduate School of Information Science and Engineering, Department of Computer Science, Associate Professor (10242570)

Co-Investigator (Kenkyū-buntansha) SUGINO Eiji  Iwate Prefecture University, Faculty of Software and Information Science, Lecturer (10293391)
Project Period (FY) 1997 – 1998
Keywords Massively Parallel System / Fault Tolerant Software / Program Translation / Primary-backup Method / State-machine Method / Replication
Research Abstract

As massively parallel systems come into practical use, the need for fault tolerance in such systems is growing considerably. We proposed a method in which parallel software masks a fault of a component processor in a massively parallel system, without assuming dedicated hardware or operating systems.
When a component processor fails, the fault-tolerant parallel software detects the fault and continues the job without the failed processor by combining the primary-backup method and the state-machine method. Because writing parallel programs with fault tolerance in mind places a heavy burden on programmers, we provide a mechanism that automatically converts an original parallel program into fault-tolerant parallel software that masks a fault of a component processor, using a parallel logic programming language.
Because components of a fault-tolerant parallel system are generally used as redundancy to implement fault tolerance, enhancing fault tolerance reduces system performance. Moreover, the overhead of software fault tolerance reduces performance further. It is therefore not enough to show an improvement in reliability; the balance between reliability and performance must also be shown. We therefore introduce a criterion that evaluates both system reliability and performance, and consider the execution environments in which the fault-tolerant parallel software is effective.
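
The combination of the primary-backup and state-machine methods mentioned in the abstract can be pictured with a small sketch. The Python below is only an illustration under stated assumptions, not the project's translator or its parallel logic programming code: the names Replica, run_job and fail_primary_at are hypothetical, the job is a toy running sum, and fault detection is simulated by an exception raised when a failed replica is asked to work.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """A deterministic state machine: the same commands, applied in the
    same order, always produce the same state (state-machine method)."""
    name: str
    state: int = 0
    applied: int = 0   # number of commands applied so far
    alive: bool = True

    def apply(self, command: int) -> None:
        if not self.alive:
            # Stand-in for fault detection (assumption: fail-stop faults).
            raise RuntimeError(f"{self.name} has failed")
        self.state += command
        self.applied += 1


def run_job(commands, primary, backup, fail_primary_at=None):
    """Run every command, continuing on the backup if the primary fails.

    fail_primary_at is a hypothetical fault-injection hook: the primary is
    marked as failed just before it would apply the command at that index.
    """
    active, standby = primary, backup
    for index, command in enumerate(commands):
        if index == fail_primary_at:
            primary.alive = False
        try:
            active.apply(command)
        except RuntimeError:
            # Fault detected: the backup already holds the same state,
            # so it takes over and applies the command that was lost.
            active, standby = standby, None
            active.apply(command)
            continue
        if standby is not None:
            # Primary-backup method: keep the backup in step with the primary.
            standby.apply(command)
    return active
```

With these assumptions, run_job([1, 2, 3, 4], Replica("primary"), Replica("backup"), fail_primary_at=2) returns the backup replica with state 10, the same result as a fault-free run. The duplicated apply call on every command is also a direct picture of the performance cost of redundancy that the abstract weighs against reliability.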

  • Research Products

All Publications (14 results)

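  • [Publications] Eiji Sugino: "Execution Overhead of Fault-Tolerant Parallel Programs for Loosely Coupled Parallel Systems" Proceedings of the IPSJ Joint Symposium on Parallel Processing JSPP'97. 361-368 (1997)

    • Description
      From the "Summary of Research Results Report (Japanese)"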
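  • [Publications] Eiji Sugino: "A Study on the Execution Performance of Fault-Tolerant Parallel Programs on Loosely Coupled Parallel Computers" IEICE Technical Report. FTS97-26. 71-78 (1997)

    • Description
      From the "Summary of Research Results Report (Japanese)"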
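  • [Publications] Yasuyuki Mimatsu: "The Influence of Communication Contention in Fault-Tolerant Parallel Disk Systems" IEICE Technical Report. FTS97-31. (1997)

    • Description
      From the "Summary of Research Results Report (Japanese)"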
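  • [Publications] Mitunori Karino: "Dynamic Placement Control of Replicated Data in Distributed Databases" Proceedings of the 9th IEICE Data Engineering Workshop. 414-419 (1998)

    • Description
      From the "Summary of Research Results Report (Japanese)"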
  • [Publications] Haruo Yokota: "Techniques for Implementing Sophisticated Parallel Information Servers" Advanced Database Systems for Integration of Media and User Environments'98. 167-172 (1998)

    • Description
      From the "Summary of Research Results Report (Japanese)"
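  • [Publications] Eiji Sugino: "Analysis of the Performance and Reliability of Fault-Tolerant Parallel Software" IEICE Transactions (D-I). J81-D-I, 11. 1219-1227 (1998)

    • Description
      From the "Summary of Research Results Report (Japanese)"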
  • [Publications] Haruo Yokota: "Fat-Btree : An Update-Conscious Parallel Directory Structure" IEEE International Conference on Data Engineering'99. (1999)

    • Description
      From the "Summary of Research Results Report (Japanese)"
  • [Publications] Eiji Sugino, Haruo Yokota: "Execution Overhead of Fault Tolerant Parallel Software for MPPs" Proc.of Joint Symposium on Parallel Processing 97. 361-368 (1997)

    • Description
      From the "Summary of Research Results Report (English)"
  • [Publications] Eiji Sugino, Haruo Yokota: "Consideration for Performance of Fault Tolerant Parallel Software for MPPs" Technical Report of IEICE. FTS97-26. (1997)

    • Description
      From the "Summary of Research Results Report (English)"
  • [Publications] Yasuyuki Mimatsu, Haruo Yokota: "The Influence of Communication Contention in Fault Tolerant Parallel Disk Systems" Technical Report of IEICE. FTS97-31. (1997)

    • Description
      From the "Summary of Research Results Report (English)"
  • [Publications] Mitunori Karino, Haruo Yokota: "Improvement on the Lazy Replication Method for Distributed Databases by Migrating the Master Site" Proc.of IEICE Data Engineering Workshop 98. 414-419 (1998)

    • Description
      From the "Summary of Research Results Report (English)"
  • [Publications] Haruo Yokota: "Techniques for Implementing Sophisticated Parallel Information Servers" Advanced Database Systems for Integration of Media and User Environments '98, World Scientific. 167-172 (1998)

    • Description
      From the "Summary of Research Results Report (English)"
  • [Publications] Eiji Sugino, Haruo Yokota: "Analysis for Performance and Reliability of Fault Tolerant Parallel Software" Transactions of IEICE D-I,Vol.J81-D-I,No.11, IEICE. 1219-1227 (1998)

    • Description
      From the "Summary of Research Results Report (English)"
  • [Publications] Haruo Yokota, Yasuhiko Kanemasa, Jun Miyazaki: "Fat-Btree : An Update-Conscious Parallel Directory Structure" Proc.of 15th International Conference on Data Engineering. (to appear). (1999)

    • Description
      From the "Summary of Research Results Report (English)"

Published: 1999-12-08  
