# Assigning Ranges of Post-Silicon Tunable Buffers and Efficient Test for Yield Improvement

## INTERNSHIP REPORT 2013-2014

- Author's name: Kayed Ghattas
- Supervisor: Prof. Dr.-Ing. Ulf Schlichtmann
- Program: Master of Science in Communication Engineering
- Department: Institute for Electronic Design Automation
- University: Technische Universität München

## Abstract

This report briefly describes my five-month internship at the Institute for Electronic Design Automation (Technische Universität München) in 2013. The project concerns assigning ranges of post-silicon tunable buffers and efficient testing for yield improvement. Clock skew uncertainty is an increasing concern in the semiconductor industry. Our work focuses on a clock scheduling methodology that improves timing and yield through post-silicon clock tuning. Programmable delay elements are used to balance the clock skew caused by process variations, and linear programming is used to achieve this goal.

## Acknowledgements

I would like to express my appreciation to Prof. Dr.-Ing. Ulf Schlichtmann for giving me the opportunity to do an internship at the Institute for Electronic Design Automation. I am particularly grateful to my research supervisor, Dr.-Ing. Bing Li, for his patient guidance, encouragement, and useful advice. I would like to thank my parents for their support throughout my studies. For me, it was a unique experience that gave me a clearer view of my future career.

## Table of Contents

- Introduction
- Methodology
  - 1- Measure combinational delays
  - 2- DFF-based circuits with PST clock buffers
  - 3- Assign PST ranges for low test cost
  - 4- Total profit optimization
  - 5- Summary
- Discussion
- Conclusion
- References

"Tomorrow is often the busiest day of the week."
- Spanish proverb

## Introduction

The propagation delay of the combinational logic between two consecutive flip-flops determines the maximum allowable clock frequency of a digital circuit. With aggressive technology scaling, the timing behavior of integrated circuits (ICs) is increasingly sensitive to process, voltage, and temperature (PVT) variations. Carefully assigning intentional clock arrival times (clock skew) to the flip-flops of a synchronous sequential circuit can improve timing performance and yield. This approach is known as clock skew optimization. Recently, various post-silicon clock skew tuning techniques have been presented to improve circuit timing performance. They target a good clock schedule that optimizes the timing slack of all paths. To accomplish this goal, several challenges must be addressed:

- 1- Measure combinational delays
- 2- Assign post-silicon tunable (PST) clock buffer ranges for low test cost
- 3- Configure post-silicon tunable (PST) clock buffer values after manufacturing
- 4- Total profit optimization

## Methodology

## 1- Measure combinational delays

As fabrication technology shrinks, chip complexity increases, gate counts grow, and the number of timing-related defects rises. Because of process variations and voltage and temperature fluctuations, it is difficult to estimate the real delay values accurately before manufacturing. For instance, some paths that are not considered critical in pre-silicon estimation become critical after fabrication due to process variations. Measuring combinational delays is therefore important to achieve our goal. Speed binning is often used to measure the maximum operating frequency of a particular chip. The chips are then sold in different markets and at different prices.
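The speed-binning step described above can be sketched as follows. This is only a minimal illustration, not the procedure used in the project: the bin edges and labels are hypothetical, `passes_at` stands in for a real at-speed test on the tester, and the chip is assumed to pass at every frequency up to some limit and fail above it.

```python
from bisect import bisect_right

# Hypothetical bin edges in MHz (ascending) and one label per interval.
BIN_EDGES_MHZ = [800, 1000, 1200]
BIN_LABELS = ["reject", "low", "mid", "high"]


def max_passing_frequency(passes_at, lo_mhz, hi_mhz, step_mhz=10):
    """Binary-search the highest grid frequency at which the chip passes.

    passes_at(f) is a hypothetical test oracle returning True or False.
    Returns None if the chip fails even at lo_mhz.
    """
    if not passes_at(lo_mhz):
        return None
    lo, hi = 0, (hi_mhz - lo_mhz) // step_mhz  # indices on the frequency grid
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if passes_at(lo_mhz + mid * step_mhz):
            lo = mid          # chip still passes, search higher
        else:
            hi = mid - 1      # chip fails, search lower
    return lo_mhz + lo * step_mhz


def speed_bin(fmax_mhz):
    """Map a measured maximum frequency to a (hypothetical) bin label."""
    if fmax_mhz is None:
        return BIN_LABELS[0]
    return BIN_LABELS[bisect_right(BIN_EDGES_MHZ, fmax_mhz)]


# Hypothetical chip whose true limit is 1137 MHz.
chip = lambda f_mhz: f_mhz <= 1137
fmax = max_passing_frequency(chip, 500, 1500)
print(fmax, speed_bin(fmax))  # prints: 1130 mid
```

The binary search keeps the number of at-speed test runs logarithmic in the number of candidate frequencies, which is why the cost of such a search matters once it has to be repeated for many paths.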
The main idea is to employ scan-based delay testing with scan chains, where vectors are shifted at low speed and only the sample cycle (the launch-capture vector pair) runs at functional clock speed. One way of applying the launch and capture events is the launch-on-shift (LOS) approach, which uses the last shift as the launch event before capture. The figure below shows an example of the LOS approach. The scan enable (SE) signal is high during shift mode and low during functional mode.

LOS results in a shorter automatic test pattern generation (ATPG) runtime. Its disadvantage is that the scan enable signal becomes timing-critical, as shown in the figure above.

An alternative way of applying the launch and capture events is the launch-on-capture (LOC) approach. With LOC, the scan enable transition is no longer critical, because extra dead cycles are added between the shift and the launch-capture application, as shown in the figure below. The trade-off is a longer ATPG runtime.

Scan vectors are generally shifted at lower frequencies because shifting at high speed is a known power problem. In addition, shifting at high speed would impose high-speed design requirements on the scan chain.

Measuring combinational delays after manufacturing is still a problem because a binary search over all combinational delays is too expensive. However, for a given clock period, not all delays need to be measured, or measured accurately. Therefore, the measurement effort can be reduced.

## 2- DFF-based circuits with PST clock buffers

Post-silicon tunable clock buffers are widely used to counter process variations in high-performance designs. PST buffers can effectively reduce the yield loss of digital circuits caused by timing faults by allowing path delays to compensate each other across consecutive flip-flop stages.

In a digital circuit, the clock signal is routed to the flip-flops (FFs) through a clock distribution network. Placing PST buffers on the clock routing to the FFs shifts the clock edges:

$$clk_i = clk + x_i \tag{1}$$

$$r_{l,i} \le x_i \le r_{r,i} \tag{2}$$

Here $x_i$ is the delay of the PST buffer at flip-flop $i$, and $r_{l,i}$ and $r_{r,i}$ are the left and right bounds of its range.
The figure below shows an example.

$x_i$, $x_j$: delays of the PST buffers; $s_i$: setup time; $T$: clock period

For circuit optimization, the timing analysis tool reports the paths that are critical to circuit performance. The presence of the PST buffers allows delay compensation between flip-flop stages, so the set of critical paths may change across the flip-flop stages during optimization. The figure below shows how such a set of critical paths might change. The inverters represent the combinational delays between flip-flop stages, and the number above each gate is its combinational delay in time units. In the case of "No PST buffers", the critical path is between FF2 and FF3. The presence of PST buffers, however, allows a minimum clock period of 6, because the PST buffers have a range of up to 3 time units. As a result, the critical paths, which constrain the minimum clock period, become the paths between FF4 and FF6.

- No PST buffers: clock period $T \ge 8$
- With PST buffers: clock period $T \ge 6$

This optimization problem can be solved using linear programming:

$$\begin{aligned} \text{minimize} \quad & T \\ \text{subject to} \quad & x_i + d_{ij} \le T + x_j - s_j \quad \text{for all DFF pairs } (i,j), \\ & r_{l,i} \le x_i \le r_{r,i}, \quad r_{l,j} \le x_j \le r_{r,j} \end{aligned}$$

where:

- $x_i$, $x_j$: delays of the PST buffers
- $T$: clock period
- $d_{ij}$: maximum combinational delay
- $s_i$, $s_j$: setup times
- $r_{l,i}$, $r_{l,j}$: left-side bounds of the PST ranges
- $r_{r,i}$, $r_{r,j}$: right-side bounds of the PST ranges

However, statistical static timing analysis (SSTA) calculates only the perfect yield, because each chip is assumed to be tuned with the perfect PST configuration, which may be too expensive to find. **Fast** configuration of PST values after manufacturing therefore remains a challenge.

## 3- Assign PST ranges for low test cost

The ranges of the PST buffers inserted in the design phase should be selected carefully to reduce the number of adjustments needed after manufacturing.
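To see why the ranges matter, the sketch below computes the smallest clock period that satisfies the setup constraint $x_i + d_{ij} \le T + x_j - s_j$ for several range widths. It is only an illustration under stated assumptions: the three-flip-flop ring, its delays, and its setup times are hypothetical, buffer values are restricted to integer steps, and an exhaustive search over all buffer settings replaces the linear-programming solver. The function name `min_clock_period` is likewise made up for this example.

```python
from itertools import product


def min_clock_period(paths, setup, ranges):
    """Return (T, x): the smallest clock period and one setting achieving it.

    paths:  {(i, j): d_ij}  maximum combinational delay from FF i to FF j
    setup:  {j: s_j}        setup time of FF j
    ranges: {i: (r_l, r_r)} inclusive integer range of PST buffer i
    """
    ffs = sorted(ranges)
    best = None
    # Exhaustive search over all integer buffer settings (not an LP solver).
    for values in product(*(range(ranges[f][0], ranges[f][1] + 1) for f in ffs)):
        x = dict(zip(ffs, values))
        # Setup constraint x_i + d_ij <= T + x_j - s_j, solved for the smallest T.
        t = max(x[i] + d + setup[j] - x[j] for (i, j), d in paths.items())
        if best is None or t < best[0]:
            best = (t, x)
    return best


# Hypothetical ring FF0 -> FF1 -> FF2 -> FF0 (delays and setup in time units).
paths = {(0, 1): 9, (1, 2): 5, (2, 0): 4}
setup = {0: 1, 1: 1, 2: 1}

for width in (0, 1, 3):
    ranges = {f: (0, width) for f in setup}
    t, x = min_clock_period(paths, setup, ranges)
    print(f"range 0..{width}: T >= {t}")
# prints:
# range 0..0: T >= 10
# range 0..1: T >= 9
# range 0..3: T >= 7
```

In this toy circuit a wider range buys a smaller clock period, but every additional buffer step is also one more setting that may have to be tried on the tester, which is the trade-off this section addresses.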
The scenarios below illustrate the idea of assigning PST ranges for low test cost, for $T \ge 6$:

- If the adjustment direction is clockwise, 3 adjustments are needed.
- If the adjustment direction is anticlockwise, only 1 adjustment is needed.

To enable anticlockwise adjustment, the PST buffer must have an initial delay value set during the design phase. For example, a PST range of 0 to 6 with an initial value of 3 is equivalent to a PST range of -3 to 3 with an initial value of 0.

## 4- Total profit optimization

- Net\_profit = revenue - test\_cost - other\_cost
- Revenue = price\_per\_chip \* quantity\_of\_chips

A more accurate PST configuration means:

- Smaller clock period
- Higher price
- Higher revenue
- Higher test cost

So an optimal profit configuration exists.

## 5- Summary

## Discussion

During the first few weeks, I performed a detailed survey of recent publications. My second assignment was to research and compare alternative testing methodologies for efficient test and yield improvement. I came up with various proposals and ideas. After that, my supervisor and I were able to propose a systematic framework that is both effective and efficient compared with other state-of-the-art approaches.

## Conclusion

This internship was an excellent, unique, and rewarding learning opportunity. I had the opportunity to be surrounded by professionals in the field that I am interested in. Moreover, I had the chance to learn from them by asking questions and to show them my eagerness for knowledge. I had the chance to consolidate my technical and practical knowledge and to gain the skills necessary to engage in complex and demanding projects. Through the internship, I learned my strengths and weaknesses by receiving feedback from my supervisor. I have improved my management skills as well as my self-motivation. I have benefited from seeing the theory I have been learning in class put into action.
Finally, my commitment to my career will largely determine how much impact I can have in my community, and I am sure that commitment was, is, and will always be my companion.

## References

1. Bing Li, Ning Chen, Ulf Schlichtmann, Fast statistical timing analysis for circuits with post-silicon tunable clock buffers, Proceedings of the International Conference on Computer-Aided Design, November 07-10, 2011, San Jose, California
2. Feng Yuan, Yannan Liu, Wen-Ben Jone, Qiang Xu, On testing timing-speculative circuits, Proceedings of the 50th Annual Design Automation Conference, May 29-June 07, 2013, Austin, Texas
3. Jeng-Liang Tsai, DongHyun Baik, Charlie Chung-Ping Chen, Kewal K. Saluja, A yield improvement methodology using pre- and post-silicon statistical clock scheduling, Proceedings of the 2004 IEEE/ACM International Conference on Computer-Aided Design, p.611-618, November 07-11, 2004
4. Jeng-Liang Tsai, Lizheng Zhang, Statistical timing analysis driven post-silicon-tunable clock-tree synthesis, Proceedings of the 2005 IEEE/ACM International Conference on Computer-Aided Design, p.575-581, November 06-10, 2005, San Jose, CA
5. J. Zeng, M. S. Abadir, G. Vandling, L.-C. Wang, S. Karako, J. A. Abraham, On Correlating Structural Tests with Functional Tests for Speed Binning of High Performance Design, Proceedings of the Fifth International Workshop on Microprocessor Test and Verification, p.103-109, September 09-10, 2004
6. Rong Ye, Feng Yuan, Hai Zhou, Qiang Xu, Clock skew scheduling for timing speculation, Proceedings of the Conference on Design, Automation and Test in Europe, March 12-16, 2012, Dresden, Germany