Project/Area Number |
23220003
|
Research Category |
Grant-in-Aid for Scientific Research (S)
|
Allocation Type | Single-year Grants |
Research Field |
Computer system/Network
|
Research Institution | Tokyo Institute of Technology |
Principal Investigator |
Matsuoka Satoshi Tokyo Institute of Technology, Global Scientific Information and Computing Center, Professor (20221583)
|
Co-Investigator (Kenkyū-buntansha) |
Hideyuki Jitsumoto Tokyo Institute of Technology, Global Scientific Information and Computing Center, Assistant Professor (00545311)
|
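Co-Investigator (Renkei-kenkyūsha) |
Toshio Endo Tokyo Institute of Technology, Global Scientific Information and Computing Center, Associate Professor (80396788)
Hitoshi Sato Tokyo Institute of Technology, Global Scientific Information and Computing Center, Specially Appointed Assistant Professor (00550633)
Naoya Maruyama RIKEN, Advanced Institute for Computational Science, Team Leader (60532801)
Shinichiro Takizawa RIKEN, Advanced Institute for Computational Science, Researcher (80550483)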
Kento Sato Lawrence Livermore National Laboratory, Postdoctoral Research Staff (50739696)
|
Research Collaborator |
Leonardo Bautista Gomez Barcelona Supercomputing Center, Senior Researcher
Jens Domke Technische Universität Dresden, ZIH, Research Associate
|
Project Period (FY) |
2011-04-01 – 2016-03-31
|
Project Status |
Completed (Fiscal Year 2015)
|
Budget Amount |
¥213,720,000 (Direct Cost: ¥164,400,000, Indirect Cost: ¥49,320,000)
Fiscal Year 2015: ¥22,490,000 (Direct Cost: ¥17,300,000, Indirect Cost: ¥5,190,000)
Fiscal Year 2014: ¥66,560,000 (Direct Cost: ¥51,200,000, Indirect Cost: ¥15,360,000)
Fiscal Year 2013: ¥22,100,000 (Direct Cost: ¥17,000,000, Indirect Cost: ¥5,100,000)
Fiscal Year 2012: ¥65,780,000 (Direct Cost: ¥50,600,000, Indirect Cost: ¥15,180,000)
Fiscal Year 2011: ¥36,790,000 (Direct Cost: ¥28,300,000, Indirect Cost: ¥8,490,000)
|
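Keywords | High Performance Computing / Exascale Computing / Fault-Tolerance Technology / Data Compression / Checkpoint/Restart / Burst Buffer / Fault-Tolerant Techniques / International Researcher Exchange (Germany, USA) / International Information Exchange (Germany, USA) / Heterogeneous Architecture / Fault Resilience / International Researcher Exchange (USA) |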
Outline of Final Research Achievements |
Fault tolerance is recognized as an indispensable technique for exascale computing as supercomputers grow toward billion-way parallelism. For future exascale supercomputers, we proposed advanced fault-tolerant infrastructures, including a scalable checkpoint/restart library, a fault-tolerant messaging interface, and a highly resilient burst buffer architecture. We validated their effectiveness using mathematical statistics. We also released the software, which has had an impact on the community.
|
Assessment Rating |
Verification Result (Rating)
A
|
Assessment Rating |
Result (Rating)
A: The research is progressing steadily toward the initial goal. The expected research results are anticipated.
|