# Linear Compositional Delay Model for the Timing Analysis of Sub-Powered Combinational Circuits

Jiaoyan Chen<sup>#1</sup>, Christian Spagnol<sup>#1</sup>, Satish Grandhi<sup>#1</sup>, Emanuel Popovici<sup>#1</sup>, Sorin Cotofana<sup>#2</sup>, Alexandru Amaricai<sup>#3</sup>

Department of Electrical and Electronic Engineering, University College Cork, Cork, Ireland<sup>#1</sup>

Department of Computer Engineering, TU Delft, Delft, the Netherlands<sup>#2</sup>

Department of Computer Science, University Politehnica Timisoara, Timisoara, Romania<sup>#3</sup>

Abstract—With the advent of deep submicron CMOS technology, statistical variations in process parameters are increasing, resulting in unpredictable device behaviour. The issue is further aggravated by low power requirements, which are pushing transistor operation into the near/sub-threshold regime. Consequently, traditional delay models fail to accurately capture the circuit behaviour. In view of this, we introduce an Inverse Gaussian Distribution (IGD) based delay model, which accurately captures the delay distribution under process variations at ultra-low (near or below threshold) power supply values. We demonstrate that the IGD model captures the transistor delay distribution with greater accuracy than the traditional Gaussian one. Moreover, it exhibits linear compositionality, such that the key model parameters can be straightforwardly propagated from device/gate level to circuit level. Our simulations indicate that, when compared with Monte Carlo SPICE simulation results, it provides high accuracy, e.g., an average error of less than 0.8%, 1.2%, and 1.7% for the Majority Voter, XOR gate, and 16-bit Ripple Carry Adder, respectively, while providing orders of magnitude simulation time reductions.

Keywords—Timing Analysis; Near/Sub Threshold Operation; CMOS Process Variations; Delay Model; Statistical Modelling

# I. INTRODUCTION

Accurate timing analysis is a crucial step in evaluating the behavior of digital Integrated Circuits (ICs) and in reducing post-fabrication functional errors. However, process variations and voltage scaling associated with deep submicron CMOS fabrication technology are increasing the complexity of such analysis. As transistor sizes reach tens of nanometers, local variations [1], i.e., intra-die variations, have a higher impact on their behavior, resulting in unpredictable delay properties [2]. Moreover, to reduce energy consumption, circuits are powered by a supply voltage near or below the MOSFET threshold voltage, which further intensifies the impact of process variation on the circuit delay [3]. Consequently, accurate circuit delay estimation is becoming more and more complex [4], as variations in parameters like threshold voltage V<sub>th</sub>, channel length L, and oxide thickness T<sub>ox</sub> make the timing analysis more difficult [5].

For synchronous designs, corner analysis is a popular approach to dealing with delay variation. In corner analysis, multiple Process, Voltage, and Temperature (PVT) corners are simulated through Static Timing Analysis; however, this approach is mostly overly pessimistic or optimistic [6]. Additionally, due to the high sensitivity of deep sub-micron devices to PVT variations, more accurate delay-calculation methods are needed. SPICE simulators can provide high accuracy but are exceedingly time consuming for large circuits. Therefore, a simple yet accurate mathematical model to evaluate the propagation delay through nanometer CMOS circuits is highly desirable.

Statistical Static Timing Analysis (SSTA) has been recently proposed to quickly compute propagation delays and signal timing violations on circuit critical paths [7]. However, SSTA lacks accuracy as it disregards the dependency of the circuit delay on input values, and considerable effort is required to automate the methodology. To address these obstacles, Monte Carlo Static Timing Analysis (MCSTA) [7] and Dynamic Timing Analysis (DTA) [8] have been proposed. MCSTA requires the one-off generation of a Variation Cell Library for standard cells; the library is utilized to perform static timing analysis to create thousands of randomized gate-level net-lists. MCSTA can be regarded as a trade-off between the time-consuming Monte Carlo SPICE simulation and the relatively inaccurate SSTA. A statistical DTA approach that employs the Gaussian (normal) approximation to model the propagation delay on the basis of distinguishable input patterns was presented in [9]. While accurate, the approach can be costly in terms of processing time, as its accuracy directly depends on the number of considered input vectors.

This paper proposes a novel delay approximation model, based on the Inverse Gaussian Distribution (IGD). Moreover, a method to obtain the key IGD parameters $(\mu, \lambda)$ for generic circuits is presented. Our approach is significantly faster and more accurate than the one proposed in [9] and its link with the underlying physical phenomena is better understood. The main idea behind our proposal is to first gather the basic gate key parameters by means of Monte Carlo simulations and then linearly extrapolate (propagate) them through the logic network at the circuit level. This approach is significantly faster than the state of the art since only the basic cells have to be fully simulated (with process and voltage supply variations) in order to obtain the key model parameters and the delay model for complex circuits. Unlike other techniques or tools, which demand large look-up tables or complicated calculations, the proposed approach is remarkably straightforward.

To verify the practicability of our statistical approach, delay estimations based on our model are compared with Monte Carlo simulations for several circuits. Our simulations confirm that the linear compositionality of the key parameters is sufficient to obtain output delay estimations for complex circuits. Moreover, our method is highly accurate, e.g., the average error is less than 0.8%, 1.2%, and 1.7% for the Majority Voter, XOR gate, and 16-bit Ripple Carry Adder, respectively, while providing orders of magnitude simulation time reductions.

This paper is organized as follows. In Section II, the proposed delay model is introduced and compared with related work. Next, the scalability of our model is discussed and demonstrated in Section III. In Section IV sample circuits, i.e., 3-input Majority voter, 3-input XOR gate, and a 16-bit Ripple Carry Adder (RCA), are analyzed. Finally, conclusions and future work are discussed in Section V.

## II. INVERSE GAUSSIAN DELAY MODEL

This section introduces the proposed delay model and explains why Inverse Gaussian Distribution (IGD) is better suited than Gaussian Distribution (GD) in capturing nanometer CMOS gate time behavior. We also illustrate how reduced power supply values along with other variations may re-shape the propagation delay distribution.

We note that GD was introduced for CMOS circuits delay estimation in [9], where a close match was found between the measured propagation delay profile and the Gaussian Probability Density Function (PDF). Based on their model, the authors also presented a propagation delay estimation algorithm. However, the choice of approximating the delay PDF with a normal distribution was based on the fitting of only two Monte Carlo simulations. Being just a fitting procedure, no theoretical explanation was provided to support the conclusions.

## A. Inverse Gaussian Approximation

Several GD characteristics hint at its inadequacy to capture the delay data distribution. First, by definition, GD is represented by a function with the field of real numbers as its support, which means that it assigns non-zero probability density to negative time values as well. This is a clear mismatch with the real situation since no signal propagation delay can be negative. Furthermore, the normal distribution is symmetric around its mean value. Simulations presented in this subsection demonstrate that this is not a correct assumption for the cases of interest.

A probability distribution that can overcome both shortfalls is the Inverse Gaussian Distribution, IGD($\mu$, $\lambda$). The PDF of an IGD is expressed in Eq. (1), where $\mu$ is the mean and $\lambda$ the shape parameter. The distribution support is $(0, \infty)$ and, depending on the parameters, its shape can be nearly symmetric or markedly asymmetric around $\mu$.

$$f(x,\mu,\lambda) = \left[\frac{\lambda}{2\pi x^3}\right]^{1/2} \exp\left(\frac{-\lambda(x-\mu)^2}{2\mu^2 x}\right); x > 0 \quad (1)$$
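As a minimal sketch, Eq. (1) can be evaluated numerically to check that it integrates to one over the positive axis and that its mean equals $\mu$. The parameter values, the function names, and the midpoint integrator below are illustrative assumptions and are not taken from the experiments in this paper.

```python
import math

def igd_pdf(x, mu, lam):
    """Inverse Gaussian PDF of Eq. (1); zero for x <= 0."""
    if x <= 0:
        return 0.0
    coeff = math.sqrt(lam / (2.0 * math.pi * x ** 3))
    return coeff * math.exp(-lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x))

def integrate(func, lo, hi, steps=100_000):
    """Midpoint-rule integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width

# Illustrative parameters only (not taken from the tables in this paper).
MU, LAM = 1.0, 3.0
total_mass = integrate(lambda x: igd_pdf(x, MU, LAM), 0.0, 40.0)
mean_value = integrate(lambda x: x * igd_pdf(x, MU, LAM), 0.0, 40.0)
print(f"integral of f = {total_mass:.4f}, mean = {mean_value:.4f}")
```

The upper integration limit of 40 is an arbitrary cut-off chosen so that the neglected tail is negligible for these parameters.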

Moreover, there is an intuitive reason why IGD fits with CMOS delay propagation data under various PVT variations.



Fig. 1. IGD vs GD approximation for 2-input AND gate @0.9V Vdd.



Fig. 2. IGD vs GD approximation for 2-input AND gate @0.3V Vdd.

The carrier particles in an electronic circuit in steady state can be assumed to perform random movements modeled by Brownian motion, also called the Wiener process [10]. For particles under Brownian motion, GD captures the motion distribution of all particles at a given moment in time, while IGD reflects the particle motion when a drift is applied.

In particular, IGD describes the distribution of the time that particles in random motion with a positive drift take to reach a fixed level, i.e., how many of them reach that level within a certain time. In electronic devices, the drift can be seen as a voltage difference between device terminals producing an electric field, thus inducing carrier movements. We note that the IGD shape can change significantly depending on its two parameters. It is also possible to obtain shapes close to the normal distribution.
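The first-passage interpretation can be illustrated with a small random-walk experiment. The sketch below is not a device simulation: the level, drift, noise, and time step are made-up values, and the discrete walk only approximates Brownian motion. For a drifted Brownian motion with drift $\nu$ and noise $\sigma$, the time to first reach a level $a$ is known to follow IGD($a/\nu$, $a^2/\sigma^2$), so the sample mean should land close to $a/\nu$ and the distribution should be right-skewed.

```python
import random

def first_passage_time(level, drift, sigma, dt, rng):
    """Time at which a drifted random walk first reaches `level`."""
    position, elapsed = 0.0, 0.0
    step_sd = sigma * dt ** 0.5
    while position < level:
        position += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
    return elapsed

# Hypothetical, dimensionless values chosen only for illustration.
LEVEL, DRIFT, SIGMA, DT = 1.0, 1.0, 0.5, 0.01
rng = random.Random(1)
times = [first_passage_time(LEVEL, DRIFT, SIGMA, DT, rng) for _ in range(2000)]

sample_mean = sum(times) / len(times)
sample_median = sorted(times)[len(times) // 2]
expected_mu = LEVEL / DRIFT            # IGD mean of the first-passage time
expected_lam = (LEVEL / SIGMA) ** 2    # IGD shape of the first-passage time
print(sample_mean, sample_median, expected_mu, expected_lam)
```

The finite time step slightly overestimates the crossing time, so only approximate agreement with `expected_mu` should be expected.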

In [9], a 2-input AND gate was evaluated by means of SPICE simulation, with the threshold voltage variation, the most dominant of all process variations, modeled by a GD. 32nm Predictive Technology Models (PTM) under the nominal supply voltage of 0.9V were considered for the Monte Carlo simulations. To validate the advantage of our IGD model, the experiment in [9], a 2-input AND gate with inputs switching from 00 to 11, has been reproduced. The threshold voltage (V<sub>th</sub>) variation is generated following a GD whose mean is the nominal $V_{th}$ ($V_{thn}$=0.322V for nFETs and $V_{thp}$=-0.302V for pFETs) and whose standard deviation is set to 50mV, which is sufficient to reflect the threshold voltage variation in real circuits. In this case, both GD and IGD are used to fit the propagation delay data profile. The results depicted in Fig. 1 indicate that they both have similar shape and fit the data well.

This similarity in fitting capability does not hold, however, for gates operated in the near-threshold regime. To demonstrate this, we repeated the same experiment for the same $V_{th}$ distribution and a $V_{dd}$ of 0.3V. Fig. 2 presents the delay histogram and the GD and IGD fittings. It can be observed that the IGD almost perfectly fits the delay histogram and that its shape is not symmetric, with a steep slope on the left side and a long tail towards infinity. On the other hand, the GD fitting is not acceptable, as the delay shape is not balanced, despite the fact that the $V_{th}$ variation follows the normal distribution. From the above one can infer that IGD PDF based fitting is more appropriate: owing to its asymmetric shape, it offers greater accuracy and greater flexibility in capturing different types of variation distributions.

Fig. 3. IGD vs GD approximation for AND gates with Gaussian distribution on $V_{dd}$ and $V_{th}$ @0.9V $V_{dd}$.

Fig. 4. IGD approximation for five cascaded inverters with uniform distribution on $V_{dd}$ and $V_{th}$ @0.9V $V_{dd}$.

To further demonstrate the IGD fitting accuracy, a chain of 5 AND gates (except for the first AND gate, each gate is fed by the output of the previous gate), with primary inputs switching from 11 to 00, has been simulated for the above mentioned $V_{th}$ variations and a $V_{dd}$ variation with a standard deviation of 50mV at 0.9V V<sub>dd</sub>, which reflects the power supply voltage fluctuations of real circuits. In Fig. 3, the PDFs and their corresponding fittings of the 3<sup>rd</sup> and 5<sup>th</sup> AND gates are depicted (the other stages are omitted for clarity). In these cases too, it is evident that IGD fits the experimental data better than GD. To further prove the capability of our approach to accommodate other distribution types, we simulate a 5-inverter chain operating at 0.3V $V_{dd}$ and assume that $V_{th}$ variations follow a uniform distribution, which has constant probability across a fixed range of 50mV around the nominal (center) V<sub>th</sub> value. Five delay sets are depicted in Fig. 4, capturing the switching occurrence at each inverter output.

Based on our simulations we can conclude that IGD accurately captures gate and circuit propagation delays under various $V_{th}$ and $V_{dd}$ values and distribution types. Given this, in the next section, we introduce a method to compute/propagate the key IGD parameters for generic circuits.
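The text does not state which fitting procedure was used for the figures above. One common choice, shown here purely as an assumption, is the closed-form maximum-likelihood estimate $\hat\mu = \bar{x}$ and $1/\hat\lambda = \frac{1}{n}\sum_i (1/x_i - 1/\hat\mu)$. The sketch below applies it to synthetic delays drawn from an IGD (no SPICE data is involved; the "true" parameters merely borrow the magnitudes of the AND1 row of Table II, in units of 1e-10), and also reports how much probability a Gaussian fit to the same data would place on negative delays.

```python
import math
import random

def sample_igd(mu, lam, rng):
    """Draw one IGD(mu, lam) variate (Michael-Schucany-Haas transformation)."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = mu + (mu * mu * y) / (2.0 * lam) - (mu / (2.0 * lam)) * math.sqrt(
        4.0 * mu * lam * y + (mu * y) ** 2)
    return x if rng.random() <= mu / (mu + x) else mu * mu / x

def fit_igd(samples):
    """Maximum-likelihood estimates (mu_hat, lam_hat) from positive samples."""
    n = len(samples)
    mu_hat = sum(samples) / n
    lam_hat = n / sum(1.0 / x - 1.0 / mu_hat for x in samples)
    return mu_hat, lam_hat

def fit_gd(samples):
    """Gaussian fit: sample mean and (population) standard deviation."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, math.sqrt(var)

def gd_negative_mass(mean, sd):
    """Probability that a Gaussian(mean, sd) assigns to negative delays."""
    return 0.5 * math.erfc(mean / (sd * math.sqrt(2.0)))

# Synthetic ground truth (assumption), in units of 1e-10.
TRUE_MU, TRUE_LAM = 2.3, 3.4
rng = random.Random(7)
delays = [sample_igd(TRUE_MU, TRUE_LAM, rng) for _ in range(20000)]

mu_hat, lam_hat = fit_igd(delays)
gd_mean, gd_sd = fit_gd(delays)
print(mu_hat, lam_hat, gd_negative_mass(gd_mean, gd_sd))
```

For data this skewed, the Gaussian fit places a noticeable share of its mass below zero, which is the mismatch discussed at the start of this subsection.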

## III. MODEL SCALABILITY

The proposed delay model is straightforward in terms of calculation and has the potential to be easily scaled. More specifically, the key parameters ($\mu$, $\lambda$) of an entire circuit have a linear relationship with the parameters of the basic cells involved. We first consider how the proposed model can represent a chain of identical components. Scalability is one of the IGD properties [11] and implies that for any t > 0, the following holds true:

$$X \sim IGD(\mu, \lambda) \Longrightarrow tX \sim IGD(t\mu, t\lambda) \quad (2)$$

Eq. (2) states that the IGD fitting the output of a chain of identical gates has parameters that are multiples of $\mu$ and $\lambda$ of the single-gate IGD. Similarly, the delay PDF at the output of a circuit composed of identical sub-circuits can be represented as a scaled version of the delay PDF of each component sub-circuit.
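The scaling property of Eq. (2) can be checked numerically: by a change of variables, the density of $tX$ is $f(y/t;\mu,\lambda)/t$, which should coincide with the IGD density with parameters $(t\mu, t\lambda)$. The parameter values and function names below are illustrative assumptions.

```python
import math

def igd_pdf(x, mu, lam):
    """Inverse Gaussian PDF of Eq. (1); zero for x <= 0."""
    if x <= 0:
        return 0.0
    coeff = math.sqrt(lam / (2.0 * math.pi * x ** 3))
    return coeff * math.exp(-lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x))

def scaled_pdf(y, mu, lam, t):
    """PDF of Y = t*X, obtained by change of variables from X ~ IGD(mu, lam)."""
    return igd_pdf(y / t, mu, lam) / t

# Illustrative values: a single-stage IGD(1, 3) scaled by t = 7.
MU, LAM, T = 1.0, 3.0, 7.0
grid = [0.5 * k for k in range(1, 60)]
max_gap = max(abs(scaled_pdf(y, MU, LAM, T) - igd_pdf(y, T * MU, T * LAM))
              for y in grid)
print(f"largest difference on the grid: {max_gap:.2e}")
```

The two expressions agree up to floating-point rounding, as Eq. (2) predicts.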

The aim of this section is to investigate how real measurements fit with PDFs whose parameters have been computed by scaling. To this end we investigate a number of example circuits composed of identical components. It is worth mentioning that CMOS gates may have different values of $\mu$ and $\lambda$ corresponding to different output switching cases (i.e., 1→0, 0→1).

# A. 7 Inverter Chain

A 7-inverter chain is simulated in HSPICE utilizing the Monte Carlo method, in the presence of both  $V_{dd}$  and process variations modeled by normal distributions as follows: (i)  $V_{dd}$  - mean value 0.3V and deviation 50mV; (ii)  $V_{th}$  - mean value 0.322V for nFETs and -0.302V for pFETs, and standard deviation 50mV; (iii) 10%  $T_{OX}$  deviation for both nMOS and pMOS transistors.

In a 7-inverter chain, 4 inverters switch 0→1 (charging) and 3 switch 1→0 (discharging). Given that the IGD key parameters of the charging and discharging events are slightly different, to avail of the scalability property, we compute the average over the various events.

Fig. 5 depicts the per-stage IGD fitted PDFs (the delay histograms are omitted to improve readability) while Table I summarizes the corresponding $\mu$ and $\lambda$. It should be noted that the IGD approximations match the Monte Carlo simulation results well. In Fig. 6, the $\mu$ and $\lambda$ growth trend over the 7 adjacent elements is presented. The data listed in Table I are based on the primary input changing from 1 to 0, which means the first inverter (INV1) output is charging, the second one (INV2) is discharging, and so on.

From Fig. 6, it is clear that the $\mu$ increment is linear (blue solid line), which clearly reflects the IGD delay model scalability, while $\lambda$ presents a stair-wise evolution (green dashed line) that can be related to the different nMOS and pMOS characteristics, i.e., discharging and charging, respectively.

Fig. 5. IGD approximation for seven cascaded inverters.

Fig. 6. 7-Inverter chain $\mu$ and $\lambda$ trend.

However, when charging and discharging events are considered separately, the model scalability is still preserved. In fact, $\Delta\mu$ of the charging event is only slightly higher than that of the discharging event, while the $\Delta\lambda$ difference is more significant ($\Delta\lambda$ is around 0.2E-9 for the discharging event and 0.1E-10, about one order of magnitude smaller, for the charging event). The dependence of the delay parameters on the event nature, i.e., charging or discharging, reflects the importance of data-dependent analysis.

# B. 5 AND Gate Chain

A Monte Carlo simulation with the parameter variation distributions in Section III-A has been performed on 5 cascaded AND gates connected as described in Section II. While the inverter can experience only two input switching events, i.e., 0→1 and 1→0, the AND gate has six possible input sequences that result in output changes, i.e., 00→11, 10→11, 01→11, 11→01, 11→10, 11→00. Through extensive simulation we found that 11→10 and 11→01 (leading to 1→0 output switching) and 01→11 and 10→11 (leading to 0→1 output switching) are equivalent. Moreover, the probability that both inputs change at exactly the same time (00→11, 11→00) is quite low, as it is statistically unlikely that unrelated path delays are sufficiently close to cause such events; in any case, the results are similar. Under these assumptions, only 10→11 and 11→01 are considered and utilized in our estimations.
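The count of six output-changing input sequences can be reproduced by brute-force enumeration over all pairs of 2-bit input vectors. This is only a logic-level check with hypothetical function names; it says nothing about the delays themselves.

```python
from itertools import product

def output_changing_transitions():
    """List (before, after) 2-bit input pairs for which the AND output toggles."""
    vectors = list(product((0, 1), repeat=2))
    changes = []
    for before, after in product(vectors, repeat=2):
        if (before[0] & before[1]) != (after[0] & after[1]):
            changes.append((before, after))
    return changes

def hamming_distance(before, after):
    """Number of input bits that differ between the two vectors."""
    return sum(b != a for b, a in zip(before, after))

transitions = output_changing_transitions()
labels = ["{}{}->{}{}".format(*b, *a) for b, a in transitions]
single_input = ["{}{}->{}{}".format(*b, *a) for b, a in transitions
                if hamming_distance(b, a) == 1]
print(labels)
print(single_input)
```

The second list keeps only the transitions in which a single input changes, i.e., the four cases that remain once simultaneous switching of both inputs is set aside.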

TABLE I. 7-INVERTER CHAIN $\mu$ AND $\lambda$

|                       | Input Switch      | $\mu$ ($\times 10^{-10}$) | $\Delta\mu$ ($\times 10^{-10}$) | $\lambda$ ($\times 10^{-10}$) | $\Delta\lambda$ ($\times 10^{-10}$) |
|-----------------------|-------------------|---------------------------|---------------------------------|-------------------------------|-------------------------------------|
| INV1                  | 1→0 (charging)    | 0.65                      | -                               | 0.37                          | -                                   |
| INV2                  | 0→1 (discharging) | 1.3                       | 0.65                            | 2.6                           | 2.23                                |
| INV3                  | 1→0 (charging)    | 2.3                       | 1                               | 2.4                           | -0.2                                |
| INV4                  | 0→1 (discharging) | 3.0                       | 0.7                             | 4.5                           | 2.1                                 |
| INV5                  | 1→0 (charging)    | 4.0                       | 1.0                             | 4.8                           | 0.3                                 |
| INV6                  | 0→1 (discharging) | 4.7                       | 0.7                             | 6.4                           | 1.6                                 |
| INV7                  | 1→0 (charging)    | 5.5                       | 0.8                             | 6.6                           | 0.2                                 |
| Average (charging)    |                   | -                         | 0.9                             | -                             | 0.1                                 |
| Average (discharging) |                   | -                         | 0.7                             | -                             | 2.0                                 |
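The per-event averages in Table I follow directly from the per-stage $\mu$ and $\lambda$ values. As a small worked check (the data structure and function name are illustrative), the increments between consecutive stages can be grouped by the event of the later stage and averaged:

```python
# (stage, output event, mu, lam) in units of 1e-10, copied from Table I.
TABLE_I = [
    ("INV1", "charging", 0.65, 0.37),
    ("INV2", "discharging", 1.3, 2.6),
    ("INV3", "charging", 2.3, 2.4),
    ("INV4", "discharging", 3.0, 4.5),
    ("INV5", "charging", 4.0, 4.8),
    ("INV6", "discharging", 4.7, 6.4),
    ("INV7", "charging", 5.5, 6.6),
]

def average_increments(rows):
    """Average per-stage (delta_mu, delta_lam), grouped by output event."""
    sums = {}
    for prev, cur in zip(rows, rows[1:]):
        total = sums.setdefault(cur[1], [0.0, 0.0, 0])
        total[0] += cur[2] - prev[2]   # delta_mu of this stage
        total[1] += cur[3] - prev[3]   # delta_lam of this stage
        total[2] += 1
    return {event: (s[0] / s[2], s[1] / s[2]) for event, s in sums.items()}

averages = average_increments(TABLE_I)
for event, (d_mu, d_lam) in averages.items():
    print(f"{event}: delta_mu = {d_mu:.1f}, delta_lam = {d_lam:.1f}")
```

Rounded to one decimal place, the results agree with the two "Average" rows of the table.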



Fig. 7. IGD approximation for five cascaded AND gates.



Fig. 8. 5-AND gate chain  $\mu$  and  $\lambda$  trend.

TABLE II. 5-AND GATE CHAIN $\mu$ AND $\lambda$ (ALL VALUES $\times 10^{-10}$)

|         | $\mu$ (10→11) | $\Delta\mu$ (10→11) | $\lambda$ (10→11) | $\Delta\lambda$ (10→11) | $\mu$ (11→01) | $\Delta\mu$ (11→01) | $\lambda$ (11→01) | $\Delta\lambda$ (11→01) |
|---------|---------------|---------------------|-------------------|-------------------------|---------------|---------------------|-------------------|-------------------------|
| AND1    | 2.3           | -                   | 3.4               | -                       | 1.9           | -                   | 3.4               | -                       |
| AND2    | 4.4           | 2.1                 | 5.9               | 2.5                     | 4.4           | 2.5                 | 5.3               | 1.9                     |
| AND3    | 6.5           | 2.1                 | 8.4               | 2.5                     | 6.3           | 1.9                 | 8.2               | 2.9                     |
| AND4    | 8.7           | 2.2                 | 10.8              | 2.4                     | 8.4           | 2.2                 | 10.6              | 2.4                     |
| AND5    | 10.6          | 1.9                 | 13.3              | 2.5                     | 10.6          | 2.2                 | 12.9              | 2.3                     |
| Average | -             | 2.1                 | -                 | 2.5                     | -             | 2.2                 | -                 | 2.4                     |

In Fig. 7, the AND gate delay IGD fitted PDFs (input 01→11) are presented (for readability the delay histograms are omitted). In Fig. 8, the corresponding $\mu$ and $\lambda$ after each adjacent element are presented. Again, the distinct linear increment echoes the IGD model scalability. Table II summarizes the corresponding $\mu$ and $\lambda$ for the 10→11 and 11→01 input transitions, and consistency in terms of $\Delta\mu$ and $\Delta\lambda$ can be observed. The biggest variation in terms of $\mu$ and $\lambda$ occurs between AND2 and AND3, even though it does not affect the average value of $\Delta\mu$ and $\Delta\lambda$.

# C. Ripple Carry Adder (RCA)

RCAs are widely used in low cost and narrow operand arithmetic units and can be considered as a good test case to validate the efficiency of the proposed IGD model.

The basic RCA building block is the Full-Adder (FA), which is more complex than basic Boolean gates. To obtain the FA $\Delta\mu$ and $\Delta\lambda$ values, a 5-bit RCA is simulated in HSPICE using the Monte Carlo method under the aforementioned V<sub>dd</sub> and process variations. We are interested in the longest propagation path (worst-case scenario) for STA purposes. As the FA delay from Cin to Sum is longer than that from Cin to Cout, in a 5-bit RCA the longest delay occurs when the inputs A, B, Cin switch from all 0s to A=01111, B=00000, and Cin=1, resulting in a 10000 output. The IGD fitted PDFs of the delays after propagating through each FA are presented in Fig. 9. In Fig. 10, the $\mu$ and $\lambda$ growth trend over 5 adjacent elements is presented. Once again, the scalability can be clearly observed in Fig. 10. The corresponding key parameters, $\mu$ and $\lambda$, along with their increments are summarized in Table III. The FA $\Delta\mu$ and $\Delta\lambda$ are 5.8E-10 and 5.7E-10, respectively.
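The worst-case input vector can be verified at the logic level with a bit-level ripple-carry model. The sketch below is only a functional check (function names are illustrative, no timing is modeled): it confirms that the carry ripples through the first four FAs and that only the last FA raises its Sum output, which matches the Carry Switch and Sum Switch columns of Table III.

```python
def full_adder(a, b, cin):
    """Return (sum, carry_out) of a one-bit full adder."""
    total = a + b + cin
    return total & 1, total >> 1

def ripple_carry(a_bits, b_bits, cin):
    """Add LSB-first bit lists; return the per-stage (sum, carry_out) pairs."""
    stages = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        stages.append((s, carry))
    return stages

# A=01111 and B=00000 are written MSB first in the text; the lists are LSB first.
A = [1, 1, 1, 1, 0]
B = [0, 0, 0, 0, 0]
stages = ripple_carry(A, B, 1)
sum_msb_first = "".join(str(s) for s, _ in reversed(stages))

for index, (s, cout) in enumerate(stages, start=1):
    print(f"FA{index}: sum={s}, carry_out={cout}")
print("sum word:", sum_msb_first)
```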



Fig. 9. 5-bit RCA IGD approximation.



Fig. 10. 5-bit RCA  $\mu$  and  $\lambda$  trend.

TABLE III. 5-BIT RCA $\mu$ AND $\lambda$ (ALL VALUES $\times 10^{-10}$)

|         | Carry Switch | Sum Switch | $\mu$ | $\Delta\mu$ | $\lambda$ | $\Delta\lambda$ |
|---------|--------------|------------|-------|-------------|-----------|-----------------|
| FA1     | 0→1          | 0→0        | 5.3   | -           | 5.1       | -               |
| FA2     | 0→1          | 0→0        | 11.1  | 5.8         | 10.6      | 5.5             |
| FA3     | 0→1          | 0→0        | 16.9  | 5.8         | 16.3      | 5.7             |
| FA4     | 0→1          | 0→0        | 22.8  | 5.9         | 21.9      | 5.6             |
| FA5     | 0→0          | 0→1        | 28.5  | 5.7         | 27.7      | 5.8             |
| Average | -            | -          | -     | 5.8         | -         | 5.7             |





Fig. 11. Majority voter and XOR gate AIGs.

# IV. MODEL VALIDATION FOR COMBINATIONAL CIRCUITS

In the previous two sections, a delay model was proposed and investigated. Based on the collected data, $\Delta \mu$ and $\Delta \lambda$ scalability for different events and circuits has been demonstrated. This property is of interest when the evaluation of complex circuits is targeted. To investigate the feasibility/correctness of the proposed method on complex circuits, circuits relevant to processor design and communications, i.e., 3-input Majority and XOR gates, and a 16-bit RCA, are considered. The former two circuits are described by means of And-Inverter Graphs (AIGs), which are often utilized for logic synthesis [12].

The propagation delay estimations obtained with our model are compared with Monte Carlo SPICE simulation results. The Cumulative Distribution Function (CDF), which is the PDF integral, is utilized to more clearly quantify the differences. For delay distributions, the CDF gives the probability that the switch has happened by a chosen time, while the PDF gives the probability density of the switch happening at that instant. Given that the probability that the switch has happened is the one determining the clock speed, the difference between the measured and the computed CDF is a better metric to evaluate the correctness of our model. Moreover, the CDF is less sensitive to errors due to the limited number of Monte Carlo runs, since the sum of the errors made for each bin should tend to zero.
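A CDF-based comparison of this kind can be sketched as follows. The IGD CDF has a known closed form, $F(x) = \Phi\big(\sqrt{\lambda/x}\,(x/\mu - 1)\big) + e^{2\lambda/\mu}\,\Phi\big(-\sqrt{\lambda/x}\,(x/\mu + 1)\big)$, where $\Phi$ is the standard normal CDF. The exact error metric behind Tables IV and V is not spelled out in the text, so the mean absolute deviation over a time grid used below is an assumption, and the sample values are made up.

```python
import bisect
import math

def std_normal_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def igd_cdf(x, mu, lam):
    """Closed-form IGD CDF; assumes lam/mu is moderate so exp() cannot overflow."""
    if x <= 0:
        return 0.0
    root = math.sqrt(lam / x)
    return (std_normal_cdf(root * (x / mu - 1.0))
            + math.exp(2.0 * lam / mu) * std_normal_cdf(-root * (x / mu + 1.0)))

def empirical_cdf(sorted_samples, t):
    """Fraction of samples that are <= t (samples must be sorted)."""
    return bisect.bisect_right(sorted_samples, t) / len(sorted_samples)

def mean_cdf_deviation(samples, mu, lam, grid):
    """Average |empirical CDF - IGD CDF| over the time points in `grid`."""
    ordered = sorted(samples)
    gaps = [abs(empirical_cdf(ordered, t) - igd_cdf(t, mu, lam)) for t in grid]
    return sum(gaps) / len(gaps)

# Made-up delays and parameters, for illustration only (not paper data).
toy_delays = [0.6, 0.8, 1.0, 1.3, 2.1]
grid = [0.1 * k for k in range(1, 71)]
deviation = mean_cdf_deviation(toy_delays, 1.0, 2.0, grid)
print(f"mean CDF deviation: {deviation:.3f}")
```

In practice `toy_delays` would be replaced by the Monte Carlo delay samples and the grid by the sampling instants of interest.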

# A. Majority Voter and XOR Gate Based on AIG

In Fig. 11, the 3-input Majority and XOR gate AIGs are presented, where a circle represents an AND gate, a dotted line an inverter, and a solid line a wire. The XOR gate longest path includes 4 AND gates and 4 inverters, while the Majority gate longest path spans 3 AND gates and 2 inverters. A 10,000-sample Monte Carlo SPICE simulation has been carried out assuming the parameter variation models and ranges in Section III. As already mentioned, only the worst-case scenario is of interest.
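One plausible reading of the linear propagation along such a path is to add up the per-gate increments of the gates it traverses. The sketch below does this with the average $\Delta\mu$ and $\Delta\lambda$ values from Tables I and II; the event ordering along the path is hypothetical (the text does not list it), the function and dictionary names are illustrative, and the resulting numbers are not the estimates reported in this paper.

```python
# Average per-gate increments (units of 1e-10) read from Tables I and II.
INCREMENTS = {
    ("INV", "charging"): (0.9, 0.1),
    ("INV", "discharging"): (0.7, 2.0),
    ("AND", "10->11"): (2.1, 2.5),
    ("AND", "11->01"): (2.2, 2.4),
}

def compose_path(path):
    """Sum per-gate (delta_mu, delta_lam) along a path of (gate, event) pairs."""
    mu = sum(INCREMENTS[step][0] for step in path)
    lam = sum(INCREMENTS[step][1] for step in path)
    return mu, lam

# Hypothetical event ordering for a path with 3 AND gates and 2 inverters,
# the gate counts given for the Majority longest path.
HYPOTHETICAL_MAJORITY_PATH = [
    ("AND", "10->11"), ("INV", "discharging"),
    ("AND", "11->01"), ("INV", "charging"),
    ("AND", "10->11"),
]

path_mu, path_lam = compose_path(HYPOTHETICAL_MAJORITY_PATH)
print(f"path mu = {path_mu:.1f}e-10, path lambda = {path_lam:.1f}e-10")
```

The composed pair would then parameterize the IGD used as the delay estimate for the whole path.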

The worst-case scenario happens when the inputs change from 001 to 101, and from 011 to 001, for the Majority and XOR gate, respectively. In Fig. 12, the CDFs of the two gates are plotted for a delay range between 0ns and 7ns in steps of 100ps. It is clear that the estimated CDFs match the results obtained by means of the highly time-consuming Monte Carlo simulations. The GD fitting is also reported; the inadequacy of such a choice is obvious. Table IV presents the error percentage corresponding to different sampling instants and indicates that, on average, our method provides estimates within 0.8% and 1.2% of the Monte Carlo results for the Majority and XOR gates, respectively.



(a) Majority



Fig. 12. Majority and XOR gate CDFs.

TABLE IV. MAJORITY AND XOR GATE CDF DEVIATIONS

| Deviation      | 1ns  | 3ns | 5ns  | 7ns  | Average (0-10ns) |
|----------------|------|-----|------|------|------------------|
| Majority Voter | 3%   | 1%  | 0.3% | 0.1% | 0.8%             |
| XOR            | 0.2% | 2%  | 1.4% | 0.7% | 1.2%             |



Fig. 13. 16-bit RCA CDFs.

TABLE V. 16-BIT RCA CDF DEVIATIONS

| Deviation  | 15ns | 30ns | 45ns | 60ns | Average (0-60ns) |
|------------|------|------|------|------|------------------|
| 16-bit RCA | 1.9% | 0.6% | 0.7% | 0.9% | 1.7%             |

# B. 16-bit Ripple Carry Adder

Given the FA key parameters derived in Section III-C, the corresponding $\mu$ and $\lambda$ for a 16-bit RCA can be easily calculated. The same type of input vector mentioned in Section III-C is applied to generate the longest propagation delay, and 10,000 Monte Carlo simulations are executed. In Fig. 13, the 16-bit RCA CDFs derived by Monte Carlo simulations and IGD based estimation, as well as the GD fitting, are presented for a delay range between 0ns and 60ns (600ps step size). The match between the two curves, Monte Carlo and IGD, is very close, with an average mismatch of 1.7%, while the mismatch of the GD fitting is visibly worse. CDF deviations for the 15ns to 60ns range with a 15ns step are summarized in Table V, in which case the highest deviation is 1.9% at 15ns while all the others are below 1%.
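The extrapolation from the FA parameters to a longer adder can be sketched as below. The rule used here (first-stage value plus one average increment per additional FA) is an assumption about how the linear scaling is applied, and the function name is illustrative. As a sanity check, it is first applied to the 5-bit case, where the result can be compared against the FA5 row of Table III; the 16-bit values it prints are not figures reported in this paper.

```python
def extrapolate_chain(first_stage, increment, stages):
    """Parameters after `stages` identical cells: first stage plus
    (stages - 1) average increments."""
    mu1, lam1 = first_stage
    d_mu, d_lam = increment
    return mu1 + (stages - 1) * d_mu, lam1 + (stages - 1) * d_lam

FA_FIRST = (5.3, 5.1)       # FA1 row of Table III, units of 1e-10
FA_INCREMENT = (5.8, 5.7)   # average increments from Table III

mu5, lam5 = extrapolate_chain(FA_FIRST, FA_INCREMENT, 5)
mu16, lam16 = extrapolate_chain(FA_FIRST, FA_INCREMENT, 16)
print(f"5-bit : mu = {mu5:.1f}e-10, lambda = {lam5:.1f}e-10")
print(f"16-bit: mu = {mu16:.1f}e-10, lambda = {lam16:.1f}e-10")
```

For the 5-bit case the extrapolated $\mu$ reproduces the FA5 entry of Table III, and the extrapolated $\lambda$ differs from it only through the rounding of the average increment.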

## V. CONCLUSIONS

In this paper, an accurate delay approximation model with negligible execution time was proposed and compared with the state of the art. Our model is not only highly accurate (it provides a less than 2% average error when compared with Monte Carlo simulations), but also exhibits significant flexibility across various V<sub>dd</sub> values and different types of PVT variations. We demonstrated that it exhibits linear compositionality, such that the key model parameters can be straightforwardly propagated from device/gate level to circuit level. Our simulations indicated that, when compared with Monte Carlo simulation results, it provides high accuracy, e.g., an average error of less than 0.8%, 1.2%, and 1.7% for the Majority Voter, XOR gate, and 16-bit Ripple Carry Adder, respectively, while providing orders of magnitude simulation time reductions. The proposed approximation and propagation procedure thus demonstrated remarkable simplicity and accuracy. Future work includes investigating the practicability of our model on sequential elements, fan-out effects, wire delay, and higher complexity circuits.

#### ACKNOWLEDGMENT

This work has been sponsored by the European Commission FP7 FET-Open iRISC (Innovative Reliable Chip Designs from Unreliable Components) project as well as the Science Foundation Ireland Project No 07/IN.1/I977.

#### References

- [1] Gill, B. S., Papachristou, C., & Wolff, F. G. (2006, March). Soft delay error analysis in logic circuits. In *Design, Automation and Test in Europe, 2006. DATE'06. Proceedings* (Vol. 1, pp. 1-6). IEEE.
- [2] Rithe, R., Gu, J., Wang, A., Datla, S., Gammie, G., Buss, D., & Chandrakasan, A. (2010, March). Non-linear operating point statistical analysis for local variations in logic timing at low voltage. In *Design, Automation & Test in Europe Conference & Exhibition (DATE)*, 2010 (pp. 965-968). IEEE.
- [3] Agarwal, A., Blaauw, D., Zolotov, V., Sundareswaran, S., Zhao, M., Gala, K., & Panda, R. (2002, June). Path-based statistical timing analysis considering inter-and intra-die correlations. In *Proc. TAU* (pp. 16-21).
- [4] Berkelaar, M. R. C. M. (1997, December). Statistical delay calculation, a linear time method. In *Proceedings of TAU* (Vol. 97, pp. 4-5).
- [5] Tang, X., De, V. K., & Meindl, J. D. (1997). Intrinsic MOSFET parameter fluctuations due to random dopant placement. *Very Large Scale Integration (VLSI) Systems, IEEE Transactions on*, 5(4), 369-376.
- [6] Hwang, M. E. (2011). Supply-voltage scaling close to the fundamental limit under process variations in nanometer technologies. *Electron Devices, IEEE Transactions on*, 58(8), 2808-2813.
- [7] Merrett, M., Asenov, P., Wang, Y., Zwolinski, M., Reid, D., Millar, C. & Asenov, A. (2011, March). Modelling circuit performance variations due to statistical variability: Monte Carlo static timing analysis. In *Design, Automation & Test in Europe Conference & Exhibition* (*DATE*), 2011 (pp. 1-4). IEEE.
- [8] Wan, L., & Chen, D. (2010, November). Analysis of circuit dynamic behavior with timed ternary decision diagram. In *Proceedings of the International Conference on Computer-Aided Design* (pp. 516-523). IEEE Press.
- [9] Zaynoun, S., Khairy, M. S., Eltawil, A. M., Kurdahi, F. J., & Khajeh, A. (2012, September). Fast error aware model for arithmetic and logic circuits. In *Computer Design (ICCD), 2012 IEEE 30th International Conference on* (pp. 322-328). IEEE.
- [10] Mörters, P., & Peres, Y. (2010). Brownian motion (Vol. 30). Cambridge University Press.
- [11] Chhikara, R. (1988). The Inverse Gaussian Distribution: Theory, Methodology, and Applications (Vol. 95). CRC Press.
- [12] Brayton, A. M. R. Scalable Logic Synthesis using a Simple Circuit Structure.