Research on Software Controlled Integrated Memory Architecture for Large Scientific Computing
Project/Area Number | 10680335 |
Research Category | Grant-in-Aid for Scientific Research (C) |
Allocation Type | Single-year Grants |
Section | General |
Research Field | Computer Science |
Research Institution | The University of Tokyo |
Principal Investigator | NAKAMURA Hiroshi, The University of Tokyo, Research Center for Advanced Science and Technology, Associate Professor (20212102) |
Co-Investigator (Kenkyū-buntansha) | NANYA Takashi, The University of Tokyo, Research Center for Advanced Science and Technology, Professor (80143684) |
Project Period (FY) | 1998 – 1999 |
Project Status | Completed (Fiscal Year 1999) |
Budget Amount |
¥3,400,000 (Direct Cost: ¥3,400,000)
Fiscal Year 1999: ¥1,300,000 (Direct Cost: ¥1,300,000)
Fiscal Year 1998: ¥2,100,000 (Direct Cost: ¥2,100,000)
|
Keywords | Processor Architecture / Memory Hierarchy / Scientific Computing / High Performance Computing |
Research Abstract |
Sufficient memory throughput is indispensable for high-performance computing in large-scale scientific applications. In this research, we propose a new architecture that provides sufficient memory throughput by integrating software-controllable memory into the processor chip. Because the integrated On-Chip Memory is explicitly addressed by software, only the required data is transferred into it in advance, and other required data is not flushed out by the conflicts that occur frequently in a conventional cache. In this respect, On-Chip Memory exploits temporal locality better than a cache. We developed a clock-level simulator for the proposed architecture and evaluated it using practical scientific applications. The simulator can explore various configurations, including the number of functional units, the latency and issue rate of each operation, the structure of the data cache and On-Chip Memory, and the throughput and latency of off-chip memory. Evaluation results for a quantum chromodynamics (QCD) computation reveal that the proposed architecture reduces off-chip memory traffic compared with a conventional cache-only architecture. When the latency of off-chip memory is 40 CPU cycles, the architecture achieves a 2.7-fold performance improvement, and the improvement grows as off-chip memory latency becomes longer. Off-chip memory latency is expected to increase and off-chip memory bandwidth is expected to decrease. The results therefore indicate that the effectiveness of the proposed architecture will continue to grow in the future.
|
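The conflict effect described in the abstract can be illustrated with a toy model. The sketch below is not the project's clock-level simulator; it is a minimal illustration that assumes a direct-mapped cache, a reused vector `a`, and a streamed array `b`, and it only counts words fetched from off-chip memory. All names and parameters (`LINE_WORDS`, `NUM_LINES`, `N`, `REPS`, `make_trace`, and so on) are hypothetical.

```python
LINE_WORDS = 4   # words per cache line (assumed)
NUM_LINES = 64   # direct-mapped cache of 64 lines = 256 words (assumed)
N = 128          # length of the reused vector a (assumed)
REPS = 10        # sweeps over a; each sweep streams a fresh block of b


def make_trace(b_offset):
    """Word addresses for: in each sweep, for each i, read a[i] then b[i].

    b_offset shifts where each block of b lands relative to a cache-size
    boundary, which decides whether b maps onto the same cache lines as a.
    """
    cache_words = LINE_WORDS * NUM_LINES
    trace = []
    for r in range(REPS):
        b_base = cache_words * (r + 1) + b_offset
        for i in range(N):
            trace.append(i)           # a[i], reused in every sweep
            trace.append(b_base + i)  # element of b, touched only once
    return trace


def cache_offchip_words(trace):
    """Words fetched from off-chip memory by a direct-mapped cache."""
    tags = [None] * NUM_LINES
    fills = 0
    for addr in trace:
        line = addr // LINE_WORDS
        index = line % NUM_LINES
        if tags[index] != line:   # miss: fetch the whole line
            tags[index] = line
            fills += 1
    return fills * LINE_WORDS


def onchip_offchip_words(trace, pinned):
    """Words fetched when software keeps the `pinned` addresses on chip.

    Pinned words are transferred once; every other word is streamed from
    off-chip memory each time it is read and never evicts pinned data.
    """
    resident = set()
    words = 0
    for addr in trace:
        if addr in pinned:
            if addr not in resident:
                resident.add(addr)
                words += 1
        else:
            words += 1
    return words


conflicting = cache_offchip_words(make_trace(b_offset=0))
non_conflicting = cache_offchip_words(make_trace(b_offset=N))
software_controlled = onchip_offchip_words(make_trace(b_offset=0), set(range(N)))

print("cache, a and b conflict:     ", conflicting)
print("cache, a and b do not clash: ", non_conflicting)
print("software-controlled on-chip: ", software_controlled)
```

With these assumed parameters, the conflicting layout moves 10240 words from off-chip memory, while the non-conflicting cache layout and the software-controlled model each move 1408. The cache matches explicit placement only when the data layout happens to avoid conflicts, whereas the software-controlled model does not depend on the layout.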
Report
(3 results)
Research Products
(21 results)