#### Grants-in-Aid for Scientific Research (KAKENHI)

Research Results Report

As of July 6, 2020

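Institution number: 17104

Research category: Grant-in-Aid for Early-Career Scientists

Research period: 2018-2019

Project number: 18K18026

Project title (Japanese): Interactive Logic Diagnosis of Unpredicted Defects in Logic Circuits

Project title (English): Interactive Logic Diagnosis of Unpredicted Defects in Logic Circuits

Principal investigator: Holst Stefan (Holst, Stefan)

Kyushu Institute of Technology, Faculty of Computer Science and Systems Engineering, Assistant Professor

Researcher number: 40710322

Total grant amount (entire research period): (direct costs) 2,900,000 yen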

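Summary of research results (translated from Japanese): This research achieved significant results in collecting highly reliable diagnosis data from chips and in diagnosing defects with complex timing behavior. First, two new methods were proposed that reliably collect test responses while accounting for potential IR-drop and clock skew. One is based on static structural circuit analysis, the other on accurate GPU-accelerated timing simulation. In addition, a soft-error-tolerant latch was proposed whose internal defects can be detected and diagnosed. Furthermore, a new small delay fault diagnosis method was proposed that can analyze highly compressed production test responses containing the combined effects of actual defects and various delay variations, together with a diagnosis method that can identify hidden delay defects and learn from early-life failures before they occur.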

#### Academic and Social Significance of the Research Results

Finding the root causes of failing chips through logic diagnosis is essential to ensure and improve the reliability and safety of electronic systems. This research enabled the diagnosis of complex timing defects that previous methods were unable to find, and thus contributes to more reliable and safer systems.

Summary of research results (English): This project made significant progress in collecting reliable diagnosis data from chips and enabling diagnosis of defects with complex timing behavior. Two new methods for reliably gathering test responses in the face of potential IR-drop and clock skew issues were developed. One method is based on static structural circuit analysis and the other on accurate GPU-accelerated timing simulation. Furthermore, a new soft-error-tolerant latch was published that enables testing and diagnosing of latch-internal production defects for the first time.

Two new diagnosis algorithms were developed. The first is a small delay fault diagnosis approach that can analyze highly compressed production test responses containing the combined effects of the actual defect and omnipresent, unknown delay variations. The second is a diagnosis approach that can, for the first time, identify hidden delay defects, which makes it possible to learn from early-life failures even before they occur.

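Research field: Accelerated simulation, VLSI test and diagnosis

Keywords: VLSI Logic Diagnosis, Small-Delay Defects, IR-Drop, Response Compression, Process Variations, Soft-Error Tolerance, GPU Computing

Research funded by KAKENHI is carried out under the awareness and responsibility of the researcher. The conduct of the research and the publication of its results are therefore not based on any request from the government, and the views on and responsibility for the research results rest with the individual researcher.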

# 1. Background at the Start of the Research

Logic diagnosis is the process of identifying defective transistors or wires within a VLSI chip based on its erroneous behavior [1]. It is an essential part of failure analysis, which is required for improving chip production yield and meeting the stringent reliability and safety levels demanded in the medical, automotive and aerospace domains [2]. Automated diagnosis methods rely on a-priori assumptions about defect mechanisms in order to produce a list of fault candidates that best explains the observed errors [3].

As VLSI technology has grown in complexity and diversity, defect mechanisms have become much more diverse and more complex in their behavior [4]. Diagnosis algorithms with inaccurate or oversimplifying a-priori defect assumptions may mislead engineers in their quest to understand the true issues by producing wrong fault candidates or no candidates at all.

With the rise of FinFET technologies, small delay defects and hidden delay defects have become the most prominent examples of defects with rather complex behavior [5]. Current diagnosis approaches provide only very limited support for such defects and cannot be used during high-volume production. In addition, if a circuit under test has timing issues, collecting diagnostic test responses can itself be unreliable due to process variations or power supply noise.

# 2. Purpose of the Research

The purpose of this research is to boost the success rate of logic diagnosis for defects with complex behavior. The key challenge is to use diagnosis algorithms with exactly the right a-priori assumptions. As shown in Fig. 1, if too many assumptions restrict the diagnosis process, misleading results are generated that lead engineers in the wrong direction. If too few assumptions are made, diagnosis results can be too unspecific to be actionable by engineers.



Fig. 1: Impact of assumptions on logic diagnosis results.

To overcome the challenge of mismatching assumptions, this project proposed an Interactive Logic Diagnosis approach in which automatic diagnosis is combined directly with human reasoning of expert engineers. This was to enable world-leading basic research on two fronts: (1) Achieve fast runtime performance of automated logic diagnosis to enable interactivity, and (2) a new way of interacting with logic diagnosis systems.

# 3. Research Methods

The key research work in this project is the development of methods and algorithms for test response acquisition and logic diagnosis. The algorithms are developed based on the targeted diagnosis problems, predominant defect types and well-known restrictions and best practices for potential industrial application. Within this project, the algorithms have been prototyped in software. Since actual diagnosis data from real chips are usually not accessible to academic researchers, experimental evaluations and performance measurements were made using common and widely-accepted benchmark circuits and simulation models.

Most of the algorithms developed in this project require significant compute power. To ensure feasibility for real-world applications, the most compute-intensive parts of the algorithms (e.g. gate-level timing and power simulations) were developed with GP-GPU acceleration in mind. Funds of this project were used to purchase and install the necessary GPU-based acceleration hardware to enable this development and prove the viability of the new approaches to simulation and logic diagnosis.

Research was conducted in international collaboration with Prof. Wunderlich from the Institute of Computer Architecture and Computer Engineering (ITI), University of Stuttgart, Germany and Prof. Wen at Kyushu Institute of Technology (Kyutech), Iizuka, Japan. The research at ITI focuses mainly on the accelerated timing simulation engine that was used in this project. The research at Kyutech deals primarily with power supply noise during testing, which is an important source of timing defects in circuits and informed the diagnosis work in this project.

During this project, **a new collaboration was established** with the research group of Prof. Hellebrand, University of Paderborn, Germany. Her research focuses on hidden delay defects, an important indicator of reliability problems in safety-relevant systems.

# 4. Research Results

This project resulted in five key research findings that are presented below. Four of these findings have already been formally published at international conferences, and the latest result is currently under review for an international conference later this year. The wide dissemination of the research results at the leading test conferences in the USA and Europe greatly strengthened the visibility of Japanese research in the field of VLSI test and diagnosis. In 2019, two papers funded by this project were the only academic contributions from Japan accepted for full talks at the International Test Conference (ITC) and the European Test Symposium (ETS), respectively. Furthermore, related research has been presented and discussed in 7 talks at domestic and international workshops and one poster at an international conference to foster and strengthen collaborations with other research groups in Japan and Europe.

During the in-depth studies of assumptions made by current diagnosis algorithms, it turned out that one important type of assumption was not adequately represented in the original project proposal: the robustness of the test infrastructure used to collect diagnostic data. Marginal chips in particular, which are the ones relevant for in-depth diagnosis, can also be more prone to shift timing errors that change test responses in ways completely unrelated to the defect of interest. To improve the relevance of the conducted research and the publication output of this project, it was decided to put more emphasis on test infrastructure in place of the more esoteric aspects of interactivity. Based on the strong foundation developed in this project, we plan to cover more advanced topics of interactive logic diagnosis in future projects.

The key research findings are as follows:

#### (1) Clock-Skew-Aware Scan Chain Grouping for Mitigating Shift Timing Failures in Low-Power Scan Testing

High scan shift power often leads to excessive heat as well as shift timing failures. Partial shift (shifting a subset of scan chains at a time as shown in Fig. 2(b)) is a widely-adopted approach for avoiding excessive heat by reducing global switching activity. We have shown for the first time that it may actually cause excessive IR-drop on some clock buffers and worsen shift clock skews, thus increasing the risk of shift timing failures. This paper addresses this problem with an innovative method, namely Clock-Skew-Aware Scan Chain Grouping (CSA-SCG). CSA-SCG properly groups scan chains to be shifted simultaneously so as to reduce the imbalance of switching activity around the clock paths for neighboring scan flip-flops in scan chains. Experiments on large ITC'99 benchmark circuits demonstrated the effectiveness of CSA-SCG for reducing scan shift clock skews to lower the risk of shift timing failures in partial shift.

Fig. 2: Partial shift leading to clock skew that can corrupt test patterns and responses.
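The grouping idea can be illustrated with a small sketch. This is not the published CSA-SCG algorithm: it assumes a hypothetical per-chain estimate of the switching activity each chain causes in each clock-tree region (`activity`) and uses a simple greedy heuristic that places each chain in the group where the regional activity stays most balanced. The names `imbalance` and `group_scan_chains` are made up for illustration.

```python
from typing import Dict, List


def imbalance(region_load: List[float]) -> float:
    """Spread between the most and the least active clock region."""
    return max(region_load) - min(region_load)


def group_scan_chains(activity: Dict[str, List[float]],
                      num_groups: int) -> List[List[str]]:
    """Greedily assign scan chains to groups that are shifted together.

    activity[chain][r] is an assumed estimate of the switching activity
    that shifting `chain` causes in clock-tree region r.
    """
    num_regions = len(next(iter(activity.values())))
    groups: List[List[str]] = [[] for _ in range(num_groups)]
    load = [[0.0] * num_regions for _ in range(num_groups)]

    # Place the most active chains first; they are the hardest to balance.
    for chain in sorted(activity, key=lambda c: -sum(activity[c])):
        def cost(g: int):
            merged = [l + a for l, a in zip(load[g], activity[chain])]
            # Prefer low regional imbalance, then the smaller group.
            return (imbalance(merged), len(groups[g]))

        best = min(range(num_groups), key=cost)
        groups[best].append(chain)
        load[best] = [l + a for l, a in zip(load[best], activity[chain])]
    return groups


example = {"A": [4.0, 0.0], "B": [0.0, 4.0], "C": [2.0, 2.0], "D": [2.0, 2.0]}
print(group_scan_chains(example, 2))  # [['A', 'B'], ['C', 'D']]
```

In the example, chains A and B load opposite regions, so shifting them together evens out the activity seen by the two regions. The sketch ignores any cap on total shift power per group, which a practical flow would also have to respect.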

This research was conducted in international collaboration with University of Stuttgart in Germany and with Advanced Micro Devices (AMD) in Sunnyvale, USA. The results were presented at the IEEE Asian Test Symposium (ATS) 2018 in Hefei, China.

#### (2) STAHL: A Novel Scan-Test-Aware Hardened Latch Design

As modern technology nodes become more susceptible to soft errors, many radiation hardened latch designs have been proposed. We have shown for the first time that redundant circuitry used to tolerate soft errors in such hardened latches also reduces the test coverage of cell-internal manufacturing defects. One example is shown in Fig. 3. A production defect was injected into a hardened latch (Fig. 3(a)), yet its behavior does not change (Fig. 3(b)) since the effect of this defect is masked by the cell-internal redundancy used to tolerate soft-errors. This defect would escape the test and cannot be diagnosed, which in turn results in unexpected soft error vulnerability and reliability issues. This paper proposed a novel Scan-Test-Aware Hardened Latch (STAHL) that enables test and diagnosis of cell-internal defects in hardened latches for the first time. Simulation results showed that STAHL has superior defect coverage compared to previous hardened latches while maintaining full radiation hardening in function mode.
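The masking effect can be reproduced with a toy model. The sketch below is neither the STAHL design nor the latch of Fig. 3; it assumes a generic triple-redundant storage cell with a majority voter, and all function names are hypothetical. It shows that a single stuck copy never changes the output (so no test can detect it), while the same defect combined with one soft error produces a wrong value.

```python
from typing import Optional


def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority vote of three stored copies."""
    return (a & b) | (a & c) | (b & c)


def redundant_latch(d: int,
                    stuck_copy: Optional[int] = None,
                    stuck_value: int = 0,
                    upset_copy: Optional[int] = None) -> int:
    """Store d in three copies and return the voted output.

    stuck_copy models a production defect that forces one copy to
    stuck_value; upset_copy models a soft error flipping one copy.
    """
    copies = [d, d, d]
    if upset_copy is not None:
        copies[upset_copy] ^= 1
    if stuck_copy is not None:
        copies[stuck_copy] = stuck_value
    return majority(*copies)


def defect_detectable(stuck_copy: int, stuck_value: int) -> bool:
    """A test detects the defect only if some input changes the output."""
    return any(
        redundant_latch(d) != redundant_latch(d, stuck_copy, stuck_value)
        for d in (0, 1)
    )


# No single stuck copy is visible at the output ...
print(any(defect_detectable(c, v) for c in range(3) for v in (0, 1)))
# ... but the defective cell no longer tolerates a soft error.
print(redundant_latch(1, upset_copy=1))                # fault-free cell
print(redundant_latch(1, stuck_copy=0, upset_copy=1))  # defective cell
```

The last two calls differ only in the presence of the defect, which is why an untested cell-internal defect translates into unexpected soft-error vulnerability.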



Fig. 3: Inherent redundancy of hardened latches prevents test and diagnosis of cell-internal defects. Panel (c): scan chain structure with hardened latch (HL).

This research was conducted in international collaboration with Anhui University, China. The results were presented at the IEEE European Test Symposium (ETS) 2019 in Baden-Baden, Germany. <u>This paper was the only Japanese contribution that was accepted for a full talk at ETS 2019</u>.

#### (3) Variation-Aware Small Delay Fault Diagnosis on Compressed Test Responses

With today's tight timing margins, increasing manufacturing variations, and new defect behaviors in FinFETs, effective yield learning requires detailed information on the population of small delay defects in fabricated chips. Small delay fault diagnosis for yield learning faces two main challenges: (1) production test responses are usually highly compressed, reducing the amount of available failure data, and (2) failure signatures not only depend on the actual defect but also on omnipresent and unknown delay variations. Fig. 4 shows some of these challenges that prevent common diagnosis algorithms from identifying the culprit. This work presented the very first diagnosis algorithm specifically designed to diagnose timing issues on compressed test responses and under process variations. An innovative combination of variation-invariant structural analysis, GPU-accelerated time-simulation, and variation-tolerant syndrome matching for compressed test responses allows the proposed algorithm to cope with both challenges. Experiments on large benchmark circuits clearly demonstrated the scalability and superior accuracy of the new diagnosis approach. Fig. 5 shows that the proposed diagnosis algorithm has a higher success rate and maintains this high diagnosis quality much better for higher response compression ratios than previous diagnosis methods.
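To make the first challenge concrete, the following sketch shows how XOR-based space compaction loses failure information and how candidate faults can be ranked against a compacted fail log. It is only an illustration under simplifying assumptions: the compactor maps scan cell `i` to output `i % num_outputs`, and the Jaccard similarity used for ranking is a stand-in, not the variation-tolerant syndrome matching of the published algorithm. All names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Fail = Tuple[int, int]  # (pattern index, bit index)


def compact_failures(cell_fails: Set[Fail], num_outputs: int) -> Set[Fail]:
    """Map failing scan cells to failing compactor outputs.

    An XOR compactor output fails only if an odd number of failing
    cells feed it; an even number cancels out (aliasing).
    """
    parity = Counter((p, cell % num_outputs) for p, cell in cell_fails)
    return {pos for pos, count in parity.items() if count % 2 == 1}


def rank_candidates(observed: Set[Fail],
                    predicted: Dict[str, Set[Fail]],
                    num_outputs: int) -> List[Tuple[str, float]]:
    """Rank fault candidates by similarity of compacted syndromes."""
    scores = []
    for name, cell_fails in predicted.items():
        syndrome = compact_failures(cell_fails, num_outputs)
        union = observed | syndrome
        score = len(observed & syndrome) / len(union) if union else 1.0
        scores.append((name, score))
    return sorted(scores, key=lambda item: -item[1])


observed = compact_failures({(0, 0), (1, 3)}, num_outputs=2)
candidates = {
    "f1": {(0, 0), (1, 3)},  # matches the observed fail log
    "f2": {(0, 0), (0, 2)},  # both errors cancel in the compactor
    "f3": {(0, 0)},          # explains only part of the fail log
}
print(rank_candidates(observed, candidates, num_outputs=2))
# [('f1', 1.0), ('f3', 0.5), ('f2', 0.0)]
```

Candidate `f2` illustrates the aliasing problem: its two erroneous bits cancel inside the compactor, so it leaves no trace in the compressed response at all.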

This research was conducted in international collaboration with University of Stuttgart in Germany. Preliminary results were presented at the Design Automation Conference (DAC) 2019 in Las Vegas, USA as a work-in-progress poster and the final results were presented at the IEEE International Test Conference (ITC) 2019 in Washington DC, USA. ITC is the premier conference for VLSI test and diagnosis. <u>This paper was the only academic contribution from Japan that was accepted for a full talk at ITC 2019</u>.

Fig. 4: Various effects diagnosis has to consider when trying to identify a small-delay fault from a compressed signature.

Fig. 5: Superior success rate of the proposed diagnosis algorithm.

#### (4) Targeted Partial-Shift For Mitigating Shift Switching Activity Hot-Spots During Scan Test

Shifting scan chains during testing causes high switching activity in the combinational logic. Excessive shift switching activity can give rise to severe, localized IR-drop that may invalidate the test and the diagnostic test responses by corrupting the contents of scan flip-flops or inducing excessive shift clock skew. In this work, we proposed new methods to (1) quickly analyze all shift cycles of a given scan design and a test set for potential shift switching activity hot-spots and to (2) avoid them by targeted partial shifting of the scan chains (see Fig. 6). The results on ITC'99 benchmark circuits show the computational feasibility of the analysis and demonstrate the effectiveness of targeted partial-shift for mitigating test data corruption risk with minimal impact on test time.
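A rough sketch of the two steps follows. It is not the published analysis: it assumes that the switching activity of a chain in one shift cycle can be approximated by the number of adjacent scan cells holding different values, that all listed chains sit in the same layout region, and that a hot-spot is present when the summed toggles exceed a hypothetical `limit`. The partial-shift plan uses a first-fit-decreasing heuristic; function names are made up for illustration.

```python
from typing import Dict, List


def shift_toggles(content: str) -> int:
    """Scan cells that change value when the chain shifts by one position."""
    return sum(a != b for a, b in zip(content, content[1:]))


def plan_partial_shift(chains: Dict[str, str], limit: int) -> List[List[str]]:
    """Split the chains of one region into groups shifted separately.

    Each group keeps its summed toggle count within `limit` where
    possible (a single chain above the limit still gets its own group).
    One group means that no partial shifting is needed.
    """
    groups: List[List[str]] = []
    loads: List[int] = []
    for name in sorted(chains, key=lambda n: -shift_toggles(chains[n])):
        toggles = shift_toggles(chains[name])
        for i, load in enumerate(loads):
            if load + toggles <= limit:
                groups[i].append(name)
                loads[i] += toggles
                break
        else:
            groups.append([name])
            loads.append(toggles)
    return groups


region = {"c0": "0101", "c1": "0011", "c2": "1010", "c3": "0000"}
print(sum(shift_toggles(c) for c in region.values()))  # 7 toggles in total
print(plan_partial_shift(region, limit=4))
# [['c0', 'c1', 'c3'], ['c2']]
```

Only shift cycles flagged as hot-spots need the extra shift clock for the second group, which is why the impact on test time can stay small.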

These results were presented at the Pacific Rim International Symposium on Dependable Computing (PRDC) 2019 in Kyoto, Japan.

Fig. 6: IR-drop hotspot analysis and targeted partial shifting to mitigate test response corruption.

#### (5) Logic Fault Diagnosis of Hidden Delay Defects

Hidden delay defects (HDDs) are small delay defects that pass all at-speed tests at nominal capture time. They are an important indicator of latent defects that lead to early-life failures and aging problems, which are especially serious in autonomous and medical applications (see Fig. 7(a)). An effective way to screen out HDDs is to use Faster-than-at-Speed Testing (FAST) to observe outputs of sensitized non-critical paths that are expected to be stable earlier than the nominal capture time. To improve the reliability of current and future designs, it is important to learn about the population of HDDs using logic diagnosis. We proposed the very first logic fault diagnosis technique (FAST Diagnosis, Fig. 7(b)) that is able to identify HDDs by analyzing fail-logs produced by FAST. Even with aggressive FAST testing, HDDs generate only very few failing test response bits. To overcome this severe challenge, we proposed new backtracing and response matching methods that yield high diagnostic success rates even with a very limited amount of failure data. The performance and scalability of our HDD diagnosis method were validated using fault injection campaigns with large benchmark circuits. This diagnosis technique enables rapid reliability improvements and at the same time avoids costly and possibly dangerous early-life failures in the field.

Fig. 7: Faster-than-At-Speed testing flow compared to conventional at-speed testing and how diagnosis of hidden delay defects enables quick reliability improvement.
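The definition of a hidden delay defect and the role of FAST can be written down in a few lines. The sketch below is a simplified single-path model, not the diagnosis method itself: it assumes one sensitized path with a fixed fault-free delay and a defect that adds a fixed extra delay, with all times given as integers in picoseconds. The function name and parameters are hypothetical.

```python
from typing import List, Tuple


def fast_detection(path_delay_ps: int,
                   defect_ps: int,
                   nominal_period_ps: int,
                   fast_periods_ps: List[int]) -> Tuple[bool, List[int]]:
    """Classify a delay defect and list the capture times that detect it.

    The defect is hidden if the slowed-down path still meets the nominal
    capture time. A FAST capture time t detects it if the fault-free path
    is already stable at t while the defective path is not.
    """
    faulty_delay = path_delay_ps + defect_ps
    hidden = faulty_delay <= nominal_period_ps
    detecting = [t for t in fast_periods_ps
                 if path_delay_ps <= t < faulty_delay]
    return hidden, detecting


# 600 ps path, 200 ps defect, 1000 ps nominal clock period
print(fast_detection(600, 200, 1000, [900, 750, 700, 500]))
# (True, [750, 700])
```

In this example the defect passes the at-speed test (800 ps is below 1000 ps), the 900 ps capture is still too slow to expose it, and the 500 ps capture is unusable for this path because even the fault-free circuit would fail. Only a narrow window of capture times produces failing bits, which is one reason why HDD fail-logs are so sparse.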

This research was conducted in international collaboration with University of Paderborn and University of Stuttgart in Germany. A proposal based on this research has been submitted and is currently under review for an international conference that is scheduled to take place in 2020.

#### References

- [1] L.-T. Wang, C.-W. Wu, and X. Wen, "VLSI Test Principles and Architectures: Design for Testability (Systems on Silicon)." Morgan Kaufmann Publishers, 2006.
- [2] L.C. Wagner, "Handbook of Semiconductor Manufacturing Technology." CRC Press, London, 2008, ch. 29: Failure Analysis.
- [3] L.M. Huisman, "Diagnosing Arbitrary Defects in Logic Designs Using Single Location at a Time (SLAT)," IEEE Trans. on Computer Aided Design of Integrated Circuits and Systems, vol. 23, no. 1, pp. 91–101, Jan. 2004.
- [4] W. Arden, M. Brillout, P. Cogez, M. Graef, B. Huizing, and R. Mahnkopf, "More-than-Moore White Paper," International Technology Roadmap for Semiconductors, Annual Report, Tech. Rep., 2011. [Online]. Available: http://www.itrs2.net/itrs-models-and-papers.html
- [5] H. Yan and A. D. Singh, "Experiments at Detecting Delay Faults using Multiple Higher Frequency Clocks and Results from Neighboring Die," in Proc. IEEE Int. Test Conf. (ITC), Oct. 2003, pp. 105–111.

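# 5. Main Publications and Presentations

#### [Journal articles] Total: 0

#### [Conference presentations] Total: 12 (of which invited talks: 1, international conferences: 5)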

1. Presenter: Stefan Holst
2. Title: Interactive Logic Diagnosis of Unpredicted Defects in Logic Circuits
3. Conference: FTC Workshop Jul. 2018, Sakura-shi, Tochigi-ken, Japan
4. Year: 2018

1. Presenter: Yucong Zhang
2. Title: Clock-Skew-Aware Scan Chain Grouping for Mitigating Shift Timing Failures in Low-Power Scan Testing
3. Conference: IEEE Asian Test Symposium (ATS) 2018 (international conference)
4. Year: 2018

1. Presenter: Stefan Holst
2. Title: Small Delay Fault Diagnosis with Compacted Responses
3. Conference: FTC Workshop Jan. 2019, Kitakyushu-shi, Fukuoka-ken, Japan
4. Year: 2019

1. Presenter: Stefan Holst
2. Title: Small Delay Fault Diagnosis with Compacted Responses
3. Conference: South European Test Seminar 2019, St. Leonhard, Pitztal, Austria
4. Year: 2019

1. Presenter: Ruijun Ma
2. Title: STAHL: A Novel Scan-Test-Aware Hardened Latch Design
3. Conference: IEEE European Test Symposium (ETS) 2019 (international conference)
4. Year: 2019

1. Presenter: Stefan Holst
2. Title: Small Delay Fault Diagnosis with Compacted Responses
3. Conference: ACM Design Automation Conference (DAC) 2019 Work-In-Progress Poster (international conference)
4. Year: 2019

1. Presenter: Stefan Holst
2. Title: Logic Fault Diagnosis of Hidden Delay Defects
3. Conference: FTC Workshop Jul. 2019, Daigo-machi, Kuji-gun, Ibaraki-ken, Japan
4. Year: 2019

1. Presenter: Stefan Holst
2. Title: Accelerated Timing Simulation and Its Applications
3. Conference: Dagstuhl Workshop "Intelligent Methods for Test and Reliability" Sep. 2019, Dagstuhl, Germany (invited talk)
4. Year: 2019

1. Presenter: Stefan Holst
2. Title: Variation-Aware Small Delay Fault Diagnosis on Compressed Test Responses
3. Conference: IEEE International Test Conference (ITC) 2019, Washington DC, USA (international conference)
4. Year: 2019

1. Presenter: Shiling Shi
2. Title: Targeted Partial-Shift For Mitigating Shift Switching Activity Hot-Spots During Scan Test
3. Conference: IEEE Pacific Rim International Symposium on Dependable Computing (PRDC) Dec. 2019, Kyoto, Japan (international conference)
4. Year: 2019

1. Presenter: Stefan Holst
2. Title: Diagnosing Hidden Delay Defects from Faster-Than-At-Speed Test Responses
3. Conference: FTC Workshop Jan. 2020, Yurihama, Tottori-ken, Japan
4. Year: 2020

1. Presenter: Stefan Holst
2. Title: Diagnosing Hidden Delay Defects from Faster-Than-At-Speed Test Responses
3. Conference: South European Test Seminar 2020, Obergurgl, Austria
4. Year: 2020

#### [Books] Total: 0

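#### [Industrial property rights]

#### [Other]

# 6. Research Organization

|   | Name<br>(Romanized name)<br>(Researcher number) | Affiliated institution, department, position<br>(Institution number) | Remarks |
|---|---------------------------|-----------------------|----|
|   |                           |                       |    |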