Project/Area Number |
10680348
|
Research Category |
Grant-in-Aid for Scientific Research (C)
|
Allocation Type | Single-year Grants |
Section | General |
Research Field |
Computer Science
|
Research Institution | Nagoya University |
Principal Investigator |
ANDO Hideki Graduate School of Engineering, Nagoya University, Associate Professor (40293667)
|
Project Period (FY) |
1998 – 2000
|
Project Status |
Completed (Fiscal Year 2000)
|
Budget Amount |
¥3,300,000 (Direct Cost: ¥3,300,000)
Fiscal Year 2000: ¥600,000 (Direct Cost: ¥600,000)
Fiscal Year 1999: ¥600,000 (Direct Cost: ¥600,000)
Fiscal Year 1998: ¥2,100,000 (Direct Cost: ¥2,100,000)
|
Keywords | microprocessor / multiprocessor / instruction-level parallelism / thread-level parallelism |
Research Abstract |
The purposes of this study are to propose a multiprocessor architecture that can efficiently exploit instruction-level parallelism in integer programs, and to develop compiler technologies that make the best use of that architecture. The achievements of this study are as follows. First, we have proposed a single-chip architecture that exploits globally distributed instruction-level parallelism from multiple threads. In particular, we have found a mechanism that reduces the overhead of communication and synchronization among threads. Furthermore, we have improved branch predictors, which are key components of a single processor, and have also improved both-path execution mechanisms so that the processor does not rely excessively on branch prediction. Second, we have developed a compiler for our architecture so that computers based on the proposed architecture can be widely used. Our compiler parallelizes a sequential program and optimizes it. Its distinguishing feature is that it extracts parallelism at the basic-block level, unlike conventional compilers, which extract parallelism at the loop level. We have also investigated the limit of parallelism in a program. From this investigation, we have confirmed that the amount of parallelism that can be extracted at the basic-block level is much larger than that extracted at the conventional loop level. We have also found that the performance achieved by our compiler is still far below this limit, leaving considerable room for improvement.
|