2007 Fiscal Year Final Research Report Summary
High-Performance Parallel Simulation Technology for Advanced Information System Development
Project/Area Number |
17300015
|
Research Category |
Grant-in-Aid for Scientific Research (B)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Computer system/Network
|
Research Institution | Kyoto University (2006-2007) Toyohashi University of Technology (2005) |
Principal Investigator |
NAKASHIMA Hiroshi Kyoto University, Academic Center for Computing and Media Studies, Professor (10243057)
|
Co-Investigator (Kenkyū-buntansha) |
TSUMURA Tomoaki Nagoya Institute of Technology, Graduate School of Engineering, Associate Professor (00335233)
NAKADA Takashi Nara Institute of Science and Technology, Graduate School of Information Science, Assistant Professor (00452524)
|
Project Period (FY) |
2005 – 2007
|
Keywords | simulation engineering / computer systems / system-on-chip / high-performance computing |
Research Abstract |
This research aimed to parallelize and accelerate cycle-accurate simulation (CAS) of advanced IT systems whose key components are microprocessors. Its major achievements are the following simulation techniques.
1. Our parallelized CAS is combined with an in-order execution simulator that produces approximate partial results for the parallel simulation. To keep this sequential in-order simulator from becoming a bottleneck, we devised an acceleration technique that automatically generates a simulator specialized for each workload. This technique improves the performance of instruction-level simulation by up to 34-fold.
2. We devised a new parallel simulation method in which a simulation is divided into intervals along the time axis so that the intervals are executed in parallel. The problem arising from dependencies between intervals is solved by speculatively executing each interval with approximate partial results of the preceding intervals, obtained by in-order simulation. As a result, parallel execution on an eight-node PC cluster achieves up to a 5.8-fold speed-up.
3. To estimate the worst-case delay caused by interrupts, which is required in the design of real-time systems, we devised O(FN) algorithms that analyze the worst-case miss increments of caches and branch predictors for a workload with N instructions and F interrupts. Furthermore, we devised an O(N log N) algorithm that calculates the delay itself accurately using CAS; this calculation can be accelerated by parallelization, by up to 9-fold on an eight-node PC cluster.
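The interval-based speculative scheme in item 2 can be illustrated with a minimal sketch. It is not the project's implementation: the toy direct-mapped cache, the latencies, the `approximate` fast model (a stand-in for the in-order simulator), and all function names are assumptions made for illustration, and threads stand in for cluster nodes. A fast sequential pass guesses the start state of every interval, the detailed model runs all intervals in parallel from those guesses, and any interval whose guess turns out wrong is re-run from the verified state.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SETS = 4        # toy direct-mapped cache size (assumption)
HIT, MISS = 1, 10   # toy latencies in cycles (assumption)
EMPTY = (None,) * NUM_SETS

def detailed(trace, cache):
    """Toy 'cycle-accurate' model: returns (cycles, final cache state)."""
    cache = list(cache)
    cycles = 0
    for addr in trace:
        s = addr % NUM_SETS
        if cache[s] == addr:
            cycles += HIT
        else:
            cycles += MISS
            cache[s] = addr
    return cycles, tuple(cache)

def approximate(trace, cache, window=2):
    """Hypothetical fast model: replays only the last `window` accesses,
    so the state it predicts is cheap to obtain but may be wrong."""
    cache = list(cache)
    for addr in trace[-window:]:
        cache[addr % NUM_SETS] = addr
    return tuple(cache)

def simulate_parallel(trace, n_intervals, workers=4):
    """Returns (total cycles, number of intervals that had to be re-run)."""
    size = max(1, -(-len(trace) // n_intervals))  # ceiling division
    chunks = [trace[i:i + size] for i in range(0, len(trace), size)]

    # 1. Fast sequential pass: guess the start state of each interval.
    starts = [EMPTY]
    for chunk in chunks[:-1]:
        starts.append(approximate(chunk, starts[-1]))

    # 2. Speculative phase: detailed simulation of all intervals in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(detailed, chunks, starts))

    # 3. Validation: accept an interval only if its guessed start state
    #    matches the verified end state of its predecessor; otherwise re-run.
    total, state, reruns = 0, EMPTY, 0
    for chunk, guess, (cycles, end) in zip(chunks, starts, results):
        if guess != state:
            cycles, end = detailed(chunk, state)
            reruns += 1
        total += cycles
        state = end
    return total, reruns
```

Because every mis-speculated interval is repaired from the verified state, the total always equals that of a purely sequential detailed run; the better the fast model's guesses, the fewer intervals need repair and the closer the run gets to full parallel speed-up.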
|
Research Products
(29 results)