[Downloaded from www.aece.ro on Wednesday, July 02, 2025 at 02:35:27 (UTC) by 108.162.216.79. Redistribution subject to AECE license or copyright.]

# An Evolutionary Approach to the Soft Error Mitigation Technique for Cell-Based Design

Jong Kang PARK, Jong Tae KIM

School of Electronics and Electrical Eng., Sungkyunkwan Univ., 300 Cheoncheon-dong Jangan-gu, Suwon, Gyeonggi-do 440-746, South Korea jtkim@skku.edu

Abstract—In this paper, we present a soft error mitigation algorithm that searches for the proper gate sizes within constrained gate-level designs. The individual gate sizing has an impact on the former optimization results and degrades the quality of the solution. In order to address this inefficiency, we utilize a modified topological sort that preserves the preceding local optima. Using a new local searcher, a hybrid genetic optimization technique for soft error mitigation is proposed. This evolutionary search algorithm has general genetic operators: the initialization of the population, crossover, mutation and selection operators. The local searcher consists of two subsequent heuristics. These search algorithms make the individual chromosome move to better search regions in a short time and then, the population acquires various candidates for the global optimum with the help of other genetic operators. The experiments show that the proposed genetic algorithm achieves an approximately 90% reduction in the number of soft errors when compared to the conventional greedy approach with at most 30% overhead for the area and critical path delay.

# *Index Terms*—soft error mitigation, single event transient, cell-based design, cell sizing, hybrid genetic algorithm

#### I. INTRODUCTION

The soft errors, which commonly originate from neutrons or alpha particles, have a worse effect on the system reliability as the feature size decreases, even in a groundlevel application [1,2]. The single event upsets (SEUs) which appear in the memory elements, and represent one of the major phenomena leading to these soft errors, have already been taken into account. It was for this reason that the SRAM and DRAM technologies used in aerospace engineering, such as for aircraft, spacecraft and satellites, had to be improved by several radiation-hardening techniques [3]. However, single event transients(SETs), which are transient faults appearing at the outputs of the logic elements, have a greater possibility of impacting the device soft error rate (SER) than SEUs in the memory elements in recent CMOS technologies [4]. As the geometry shrinks, the node capacitances and power supply get smaller and the circuit speed becomes faster, with the result that the SETs generated by even a small charge are prone to be propagated to the flip-flops or latches, which are not masked by electrical and temporal properties among the intermediate circuit nodes. Eventually, these technical advances expose sea-level electronic systems to those kinds of radiation defects. It is expected that the current and near-future devices will have system SERs of more than  $10^4$  FITs (failures-in-time) [4]. 1 FIT equals a single soft error per  $10^9$  hours. This implies that for an electronic device whose SER is close to  $10^6$  FITs in the terrestrial region, the user will confront one run-time error per 41 days, which might not be recognized directly. This can be a serious problem in commercial products such as personal computers, mobile phone and PDAs. Therefore, radiation-hardening techniques will be more general in terrestrial device designs in the future.

The existing techniques [5-12,15] for evaluating the soft error rates of combinational circuits take into account the effects of electrical, logical and temporal masking for transient faults. Electrical masking for SETs means that the transient noise is suppressed by the input-to-output transfer characteristics of the CMOS gates. Because of the different propagation delays and load capacitances, the effect of electrical masking must be varied in the entire circuit. Logical masking is dependent on the logic gate's controllability and its logic values. For example, a SET is propagated through the logic gates if the other input values are non-control values. Temporal masking is related to the storage time of the F/Fs, which is determined by the clock period, setup/hold time and attenuated SET. Generally, the total SER is the summation of SERs at the F/Fs, whereas the SETs are assumed to be generated independently at all possible circuit nodes (reverse-biased PN junctions).

In order to mitigate soft errors, several radiationhardening techniques have been reported ranging from the process to the algorithm and system-levels. In spite of the architectural overhead, the concept of the triple modular redundancy (TMR), including the voting function, has been one of the most famous and traditional methods used in various levels of abstraction. In particular, gate sizing techniques are also suitable for the cell-based designs that are common in today's cell-based design flow. In this approach, without changing the circuit topology, we select the optimal driving capacity for each logic gate. Gate sizing changes the effect of electrical masking for the logic gates inherently, but its order of traversal is determined by all three masking effects of the SET. It has the merit of a faster execution time than low-level optimization, such as a SPICE or layout-level one.

In this paper, we present a soft error mitigation algorithm that searches for the proper gate sizes within constrained gate-level designs. The target designs have constraints for the marginal circuit area and path delay. Generally, the total block SER is the cumulated probability for the SETs generated from the internal logic gates. Logic gates, which

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2013R1A1A2060954).

Digital Object Identifier 10.4316/AECE.2015.01005

[Downloaded from www.aece.ro on Wednesday, July 02, 2025 at 02:35:27 (UTC) by 108.162.216.79. Redistribution subject to AECE license or copyright.]

give a higher SER to F/Fs (Flip-Flops) and POs (Primary Outputs), can be a preferred candidate for optimization. The individual gate sizing has an impact on the former optimization results and degrades the quality of the solution. To address this inefficiency, we use a modified topological sort that preserves the preceding local optima during gate sizing traversal. The experimental section of this paper shows that the proposed technique selects a better solution than the existing greedy approach under large margins of design constraints and maintains solid performance. The greedy approach, which preferentially visits the logic gates with high individual SERs, cannot preserve the former local search results. With this local searcher a hybrid genetic optimization technique for soft error mitigation is also proposed. This evolutionary search algorithm has general genetic operators: the initialization of the population, crossover, mutation and selection operators. The crossover operator selects the logic gate with the lower individual SER between two parent genes. A mutation is made by changing the driving capacity of the individual gates within a predefined probability. The roulette wheel selection with elitist approach helps the individual to come close to the global optimum in a later generation. The local searcher consists of two subsequent heuristics, which are reduce gate size and reduce\_ISER, respectively. These searchers make the individual chromosome move to better search regions in a short time and, then, the population acquires various candidates for the global optimum with the help of other genetic operators. The experiments show that the proposed genetic algorithm achieves an approximately 90% reduction in the number of soft errors when compared to the conventional greedy approach with at most 30% overhead for the area and critical path delay.

# II. RELATED WORKS

Gate sizing techniques [9-13] for the reduction of the SER are based on a heuristic search and are easily applicable to today's cell-based design flow. Because there are concurrent design metrics, such as the path delays, power consumption and circuit size (area), it is difficult to obtain a globally optimal solution which also includes the minimization in SERs, without degrading the other design metrics. In [8], the individual contribution to the total SER (soft error rate) for each logic gate has a limited value, gate sizing is performed as all logic gates satisfy this threshold. This method is suitable for minimizing the total SER of the design, but it is difficult to determine the proper threshold value when the other design metrics have to be optimized simultaneously. As a more algorithmic approach, within predefined delay, area and power overheads, the SERs can be minimized by priority-based traversing using a priority queue [13,14]. The visiting priority of the logic gate is defined as the contribution to the block SER; a gate with a larger individual SER is preferable to traverse during incremental optimization. Since soft error minimization with a constrained gate-level design is an NP-hard problem [14], such heuristics must be effective in terms of their execution time and quality of the solution. This greedy approach could suffer from the inter-dependency of the incremental optimization, especially in the case where large margins exist for the design metrics. Dhillon et al. introduced a circuit optimization technique [15] which optimizes the average path delay to the POs, energy and critical path delay using an MCS (multi-level coordinate search). Maximizing the average path delay to the POs makes the propagation probability of the SETs small, because of increased degree of electrical attenuation. The resultant SETs are prone to be masked electrically, due to the increased node-to-node propagation delay. However, the individual SER for each gate is dependent not only on its electrical properties, but also on its logical propagation characteristics. The gates have an equal opportunity to be re-sized, regardless of the distance to the POs (or degree of severity). The solution might be degraded, compared to the case where SER is directly used as the cost metric.

The method employed for the SER analysis in this paper is based on our previous two-pass evaluation framework [7]. Since the  $2^{nd}$  stage of the SER evaluation is only dependent on the effects of logical masking, this paper mostly uses the pass-*I* results for the given benchmark circuit optimization.

### III. MITIGATING SOFT ERRORS WITH HYBRID GA

The genetic algorithm (GA) is one of the most famous meta-heuristic tools employed to obtain near-optimum solutions to many search and optimization problems. Basically, the GA handles a *population* or group of solutions, not just a single solution. In spite of its large computing complexity, each solution, called a chromosome in the GA, can be refined iteratively from generation to generation by an evolutionary approach, which mainly consists of mutation, crossover and selection operators. The hybrid GA [18,19] improves the speed of convergence compared to the original GA by using a local search. This mixed algorithm finds the local optima for those offspring which come from crossover and mutation in the current generation and, as a result, the *population* has more candidates to approach the global optimum.

Because of the nonlinearity in the electrical transfer characteristics of each CMOS gate, it is hard to suppress the size of the SET effectively with a fixed directed search. Our technique of topological traversal introduced in previous work [16,17], is applied to the local searcher in the hybrid GA. This leads to better soft error hardness in constrained cell-based designs. This section explains the structure of the proposed evolutionary approach, including its parameters, genetic operators and local heuristics.

# A. Problem Formulation

The technology-mapped netlist of design *B* consists of logic gates,  $g_k \in G$ ,  $0 \le k \le |G|$ -1. We define the netlist of *B* with regard to the specific gate sizing as follows:

 $X = \{x_k \mid g_k \text{'s driving strength}, 0 \le k \le |G| - 1\}$ (1)

where  $x_k = \{1, 2, 4, ...\}$ , as defined by the target cell libraries and the types of logic cells. For example, if a NAND gate can be one of the three types of different driving strengths, then it has different transistor sizes and input-to-output characteristics. Fig. 1 shows the gate sizing result which replaces driving strength  $x_1$  with  $x_2$  at  $g_1$ , without changing its logical function. Here, the objective function for minimizing the number of soft errors, can be written as follows:

#### Advances in Electrical and Computer Engineering

# $x_{1}=2$ $g_{1}=\{1,2,3\}$ $g_{0}$ $x_{1}=1$ $g_{1}$ $g_{2}$ $g_{2}$ $x_{initial}=\{1,1,1\}$ $X_{new}=\{1,2,1\}$

Figure 1. Replacement for driving strength in gate-sizing

Minimize 
$$C(X) = \sum_{i}^{|V|} ISER_i$$
 (2)

s.t. 
$$d_{crit} \leq d_{orig} \cdot M_d$$
,  $A_X \leq A_{orig} \cdot M_A$ 

where *ISER*<sub>i</sub> denotes the individual SER from gate *i* to the POs and F/Fs. This is a similar metric to the MEI (mean error impact) [13] and EPP (error propagation probability) [15].  $d_{crit}$ ,  $A_X$ ,  $M_d$  and  $M_A$  are the critical path delay, circuit area (=gate size) and constraint margins for the delay and circuit area, respectively. These design parameters are marginally constrained to the original critical path delay,  $d_{orig}$  and circuit area,  $A_{orig}$ . C(X) in (2) is equivalent to the block SER, but the definition of *ISER* indicates that the optimization procedure is based on the gate-by-gate traversal. For a given total neutron flux  $F_n$  and technology-independent rate parameter  $\alpha$ , we define *ISER*<sub>i</sub> from the redefinition of the first state SER(*j*) in [7] as follows:

$$ISER_{i}(j) = F_{n} \cdot \alpha \cdot \sum_{SETi \ q} f_{Q}(q) \cdot A_{i}(SET) \cdot GP(SET) \cdot LP_{ij} \cdot EP_{ij} \cdot LW_{ij}$$
(3)  
$$ISER_{i} = \sum^{|PO|+|FF|} ISER_{i}(j)$$
(4)

where  $ISER_i(j)$  denotes the SER at port *j*, which is either a PO or an F/F and its SETs are confined to those generated at gate *i*.  $f_O(q)$ ,  $A_i(SETi)$  and GP(SETi) denote a probability density function for the collected charge q, a region of the faulty site of the logic gate g and the logical generation probability for a SETi from gate i, respectively. Also, LP<sub>ij</sub>,  $EP_{ii}$  and  $LW_{ii}$  are the probabilities of logical propagation, electrical attenuation and latching window from gate i to port *j*, respectively. GP(SETi), which is directly correlated with  $A_i(SETi)$ , is determined by the statistics obtained from the gate-level simulation results. For example, in a NAND gate, as shown in Fig. 2, SETs having different widths, peak voltages and transition time would be generated under different input bias conditions and faulty sites, even for the same collected charge, q [7,22]. Therefore, we can define a SET instance (SETi) as an object, which has a specific pulse width, fault type ('0' or '1'), fault area and generation probability for the input combination. SETi can be extracted using the pre-characterization results obtained from both SPICE and gate-level simulations for standard cell-based designs. For a given node capacitance and q, the width of SETi, which means the time to change 50% to 50% or FWHM (Full Width at Half Maximum) of the noise voltage, can be obtained from the linear interpolation between the discrete pulse widths and corresponding load capacitances. For each SETi, the collected charge q will be varied and then the different values of  $f_O(q)$  tested for (3).



Volume 15, Number 1, 2015

Figure 2. SET instances for 2-input NAND gate [17]



Figure 3. The hybrid GA for soft error mitigation

#### B. GA Representation

Fig. 3 shows the flow chart of our evolutionary algorithm. The population at the *t*-th generation,  $0 \le t \le t_max$ -1, of the GA is defined as follows:

 $POP_t = \{X_i \mid X_i \text{ is } i-th \text{ chromosome } 0 \le i \le pop\_size-1\}$  (5) where *pop\_size* denotes the total number of chromosomes in the current generation and *t\_max* the maximum number of generations. Each individual (chromosome),  $X_i$  consists of a gene  $x_k$ , according to the definition of (1). The total number of genes is determined by the total number of logic gates in design *G*. The whole process minimizes the total SER of *G* within the predefined gate size and critical path delay as shown in (2).

After the initial creation of  $POP_0$ , the *ISER* for each logic gate and the total SER of G will be evaluated for  $X_i$ . Then, performing genetic operators such as crossover and mutation with probabilities  $P_c$ ,  $P_m$  and  $P_{mc}$  creates the group of offspring,  $O_t$ . Each individual in  $O_t$  can be improved by two subsequent local heuristic algorithms, which originate from the ISER-based topological traversal. In the last stage of the generation, the roulette wheel selection with elitist approach determines the pop size chromosomes in  $POP_t \cup O_t$  to be used in the next generation. This selection operator protects the best fit solution in the current generation during the selection so that it survives in the next generation. This process is iterated until t=t max. The two optimization techniques shown in Figs. 4 and 5 using the topological traversal cause the new offspring to be locally optimized and the scale of the genetic operations can be decreased. Possibly, a large number of candidate chromosomes for the global optimum can be found in a short time. Thus, even if we define t max<30 and pop size<30, this hybrid algorithm provides a reasonable reduction in the SER for the most cases. The relevant numerical experimental results will be introduced in Section IV.

# C. Local Searchers for Hybrid GA

As shown in [16,17], the *ISER*-based topological sort enables incremental optimization, so that the results of the algorithm would be individual logic gate sizing with lower *ISER* than the original design.

| Procedure reduce_circuit_area                                  |
|----------------------------------------------------------------|
| <b>input</b> : netlist, $X_{init}$                             |
| <b>output</b> : updated netlist, $X'$                          |
| $ISERs = calculate_ISERs(X_{init})$                            |
| {X', ISERs'} = ISER_based_topological_sort(netlist with ISERs) |
| for $x_i \in X'$                                               |
| cur ISER = ISER,                                               |
| LOPT = cur ISER                                                |
| foreach available alternative cell, $x_k < x_i$                |
| $X' = \text{sizing\_alternative\_cell}(x_i, x_k)$              |
| if (prev_delay < current_delay and                             |
| delay_constraint is violated)                                  |
| sizing_alternative_cell( $x_k, x_j$ )                          |
| continue                                                       |
| $ISER_{k} = update_{ISER}(X)$                                  |
| $LOPT_{k} = ISER_{k}$                                          |
| if $(LOPT \ge LOPT_{\nu})$                                     |
| $LOPT = LOPT_{k}$                                              |
| else                                                           |
| sizing_alternative_cell( $x_k, x_i$ )                          |
|                                                                |

Figure 4. Pseudo-code for reduction in circuit area

# Procedure reduce\_ISER

| <b>input</b> : netlist, X <sub>init</sub>                                               |
|-----------------------------------------------------------------------------------------|
| output : updated netlist, X'                                                            |
| $ISERs = calculate_ISERs(X_{init})$                                                     |
| { <i>X'</i> , <i>ISERs'</i> } = ISER_based_topological_sort(netlist with <i>ISERs</i> ) |
| for $x_i \in X'$                                                                        |
| $cur_{ISER} = ISER_{i}$                                                                 |
| $LOPT = cur_{ISER}$                                                                     |
| foreach available alternative cell, $x_k$                                               |
| $X' = \text{sizing\_alternative\_cell}(x_i, x_k)$                                       |
| if (area_constraint or delay_constraint is violated)                                    |
| sizing_alternative_cell( $x_k, x_i$ )                                                   |
| break                                                                                   |
| $ISER_k = update_ISER(X)$                                                               |
| $LOPT_k = ISER_k$                                                                       |
| if $(LOPT > LOPT_k)$                                                                    |
| $LOPT = LOPT_k$                                                                         |
| else                                                                                    |
| sizing_alternative_cell( $x_k, x_i$ )                                                   |
|                                                                                         |

Figure 5. Pseudo-code for reduction in ISER

The digital design has many logic gates and the genetic operators change their driving strengths frequently when new offspring are created by mutation and crossover. This might be another starting point in the search for a local optimum without directly advancing to a higher generation. Eventually, the use of this optimization technique provides for the fast convergence of the genetic population toward the global optimum.

This technique consists of two subsequent topological traversals, *reduce\_circuit\_area* and *reduce\_ISER* based on our previous study [17]. In Figs. 4 and 5, we suggest pseudo-codes for these heuristics. Note that the sort used for logic gate ordering in [16] is based on the *ISER* per unit gate size, but the topological sorts in Figs. 4 and 5 only take the *ISER* as the key. The procedure, *reduce\_circuit\_area*, minimizes the circuit area, without worsening the *ISER*. On the other hand, the function, *reduce\_ISER*, reduces the *ISER* without violating the design constraints defined in (2). Two slightly different algorithms are applied to every offspring which might be over-sized, excessively-delayed or more reducible in terms of SER.

Because of the nonlinearity and complexity in the gatelevel design, the solutions generated by crossover and mutation are prone to violate the design constraints. In most cases, the resultant gate sizes become larger than their predefined maximum the circuit size,  $A_{orig}M_A$ . It is impossible to perform reduce ISER directly if the design has already been violated. Any violations would cause this procedure to terminate immediately. Therefore, reduce circ*uit area* has a major role as the typical "repair function" in the GA in order to move the individual to feasible regions (i.e., those which meet the design constraints). If we have more room for the marginal circuit size and delays in this step, the total SER can be reduced in the subsequent function, reduce ISER. The candidates to be replaced are those logic gates whose driving strengths are less than that of the input logic gate  $(\in X_{init})$ . In conjunction with the circuit size, the path delays between the sequential elements are also protected in the algorithm. If the delay after resizing is increased and does not satisfy  $d_{orig}M_d$ , the previous solution is maintained, as shown in Fig. 5. reduce ISER takes the local optimum with the lowest ISER at every gate traversal. Note that the critical path delay and circuit size must be kept within the design constraints. Solutions which have over-size and excessive delays cannot be accepted by each incremental step. The individuals in the GA have different margins for the delay and circuit area after performing reduce circuit area. Thus, the procedure, reduce ISER, causes each chromosome to become a different, good solution, which is not bounded to the specific local optima along the generations. These two searchers are the key functions in our hybrid GA.

# D. Genetic Operators

The details of the genetic operators are as follows:

• Initialization – The initial population, *POP*<sub>0</sub> is generated by modifying  $x_k$  randomly with the predefined probability  $iP_{mp}$ . In other words, when an initial gene  $x_k$  satisfies  $iP_{mp}$ , another  $x_k$ ' is selected within the supported driving strengths of the corresponding logic gate. If  $iP_{mp}$  is not satisfied,  $x_k$  will be unchanged. This helps the chromosomes of  $POP_0$  to be placed mostly in the feasible region of search space. When we do not change the limited parts of the design, it is hard to obtain reasonable solutions even in later generations (i.e., most of the solutions exceed  $A_{orig}M_A$  and/or  $d_{orig}M_d$ ).

- Crossover If X<sub>i</sub> and X<sub>j</sub> of POP<sub>t</sub> are selected with P<sub>c</sub>, then this operator determines x<sub>k</sub> having a lower *ISER* in X<sub>i</sub> and X<sub>j</sub>. This is illustrated in Fig. 6.
- Mutation If X<sub>i</sub> from POP<sub>t</sub> is selected with P<sub>m</sub>, then each x<sub>k</sub> will be changed to another driving strength with P<sub>mp</sub>. For each selected gene, the driving strengths are determined randomly.
- Evaluation function A chromosome is evaluated by summing the *ISERs* of all logic gates in a block, namely the SER. However, when a solution does not meet the design constraints, the original evaluation value is increased by a penalty. This enables that a solution with a higher SER can survive in the current generation, even if its design constraints are slightly exceeded against the initial design requirements. Our GA does not cut off such infeasible solutions at once. The following equation gives the penalty function of our hybrid GA for infeasible solutions :

$$penalty(SER) = SER \times \frac{d_{crit}}{M_d \ d_{orig}} \times \frac{A_X}{M_A \ A_{orig}}$$
(6)

• Selection – The inversion of the evaluation result is used in the selection probability. Using the roulette wheel selection, it determines  $pop\_size$  chromosomes of  $POP_{t+1}$ in  $POP_t \cup O_t$ . Note that the best solution which has the lowest evaluation value in the *t*-th generation will be preserved.

#### **IV. EXPERIMENTS**

In this section, we show the experimental results for the ISCAS-85/89 benchmark circuits. The target CMOS cell library is vsclib013 developed by G. Petely. For the SET pre-characterization process, SPICE simulations were performed iteratively and the SET widths of each logic gate were obtained according to the discrete load capacitances and collected charges q. The charge collection slope and time constant for SET used in the SPICE simulation were extracted from the previous research [21]. Table I summarizes the total of 38 characterized logic cells used in these experiments. The gate-level netlists for the benchmark circuits were logic-synthesized from the structural-level Verilog descriptions. The corresponding pin-to-pin logical value statistics for the gate-level designs were also generated by random test vectors and the relevant utilities supporting the gate-level simulation tool. Fig. 7 shows the estimated SET characteristics at the logic gate output for the various types of logic gates and driving strengths. Generally,



Figure 6. Crossover operation

increasing the number of transistors in a device leads to an increase in the number of reverse-biased PN junctions of the CMOS gates. Therefore, higher driving cells having more sensitive transistor sites are susceptible to generate SETs than the lower ones, because of the increased number of transistors integrated (Fig. 7a). Similarly, NAND or NOR gates have more SET sources than inverters due to the increasing number of transistors. However, in the most cases, the generated SET widths which directly affect the circuit SER, can be electrically suppressed by using higher signal strengths of their own logic cells, as shown in Fig. 7b.

Table II and III summarizes the optimization results of the SER reduction algorithms in the sea-level application where the neuron flux is defined as  $56.15n/m^2/s$  for 10-1000MeV

TABLE I. A SUMMARY FOR THE CHARACTERIZED LOGIC GATES (VSCLIB013)

| Logic cells       | Driving strengths                    | Functions            |  |
|-------------------|--------------------------------------|----------------------|--|
| an2v0x05-an2v0x8  | 0.5X, 1X, 2X, 3X, 4X,<br>6X, 8X      | Two-input AND        |  |
| bf1v0x05-bf1v0x12 | 0.5X, 1X, 2X, 3X, 4X,<br>6X, 8X, 12X | Non-inverting buffer |  |
| iv1v9x05-iv1v0x12 | 0.5X, 1X, 2X, 3X, 4X,<br>6X, 8X, 12X | Inverter             |  |
| nd2v0x05-nd2v0x8  | 0.5X, 1X, 2X, 3X,<br>4X, 6X, 8X      | Two-input NAND       |  |
| nr2v0x05-nr2v0x8  | 0.5X, 1X, 2X, 3X,<br>4X, 6X, 8X      | Two-input NOR        |  |
| dfnt1v0x2         | 2X                                   | D-type flip-flop     |  |



(a) The number of SET instance for driving strengths



(b) Total SET widths for driving strengths (load capacitance=30fF, q=50fC) Figure 7. Electrical characteristics of SET at the faulty site

#### Advances in Electrical and Computer Engineering

| TABLE II. EAI ERIMENTAE RESOLISTOR SER REDOCTION (ISEAS 05) |        |                        |              |           |                             |                             |                              |  |
|-------------------------------------------------------------|--------|------------------------|--------------|-----------|-----------------------------|-----------------------------|------------------------------|--|
| Circuit                                                     | A orig | d <sub>orig</sub> [ps] | SER [FIT]    | Algorithm | SER, $M_A, M_d = 1.1$ [FIT] | SER, $M_A, M_d = 1.2$ [FIT] | SER, $M_A$ , $M_d$ =1.3[FIT] |  |
| C17 27                                                      |        |                        |              | GY        | 1.09E-6                     | 2.38E-7                     | 2.66E-7                      |  |
|                                                             | 295    | 1.97E-6                | Тор          | 1.09E-6   | 8.90E-7                     | 7.48E-7                     |                              |  |
|                                                             |        |                        | GA           | 7.07E-7   | 1.68E-7                     | 7.57E-8                     |                              |  |
| C432 925                                                    |        | 1990                   | 8.97E-6      | GY        | 7.41E-6                     | 4.97E-6                     | 4.05E-6                      |  |
|                                                             | 925    |                        |              | Тор       | 7.55E-6                     | 4.53E-6                     | 4.01E-6                      |  |
|                                                             |        |                        |              | GA        | 7.12E-6                     | 3.07E-7                     | 6.11E-8                      |  |
| C499 2161                                                   |        | 2000                   | 5.19E-5      | GY        | 4.26E-5                     | 3.72E-5                     | 2.82E-5                      |  |
|                                                             | 2161   |                        |              | Тор       | 4.90E-5                     | 3.55E-5                     | 2.61E-5                      |  |
|                                                             |        |                        |              | GA        | 2.55E-5                     | 1.88E-5                     | 5.53E-6                      |  |
|                                                             |        | 2000                   | 4.99E-5      | GY        | 3.58E-5                     | 2.05E-5                     | 1.63E-5                      |  |
| C880 16                                                     | 1630   |                        |              | Тор       | 3.39E-5                     | 1.15E-5                     | 8.20E-6                      |  |
|                                                             |        |                        |              | GA        | 3.16E-5                     | 9.73E-6                     | 4.41E-6                      |  |
| C1355 2410                                                  |        |                        | GY           | 1.20E-5   | 1.07E-5                     | 9.87E-6                     |                              |  |
|                                                             | 2410   | 1998                   | 1998 6.66E-5 | Тор       | 2.28E-5                     | 1.31E-5                     | 9.90E-6                      |  |
|                                                             |        |                        |              | GA        | 8.33E-6                     | 3.05E-6                     | 9.76E-7                      |  |
|                                                             |        |                        | 5.75E-5      | GY        | 4.38E-5                     | 3.15E-5                     | 3.08E-5                      |  |
| C1908                                                       | 2630   | 1991                   |              | Тор       | 4.67E-5                     | 3.04E-5                     | 2.45E-5                      |  |
|                                                             |        |                        |              | GA        | 2.40E-5                     | 1.01E-5                     | 5.34E-6                      |  |
|                                                             | 2838   |                        | 8.78E-5      | GY        | 8.83E-5                     | 1.61E-5                     | 1.43E-5                      |  |
| C2670                                                       |        | 1989                   |              | Тор       | 2.43E-5                     | 1.70E-5                     | 1.09E-5                      |  |
|                                                             |        |                        |              | GA        | 2.25E-5                     | 7.52E-6                     | 2.05E-6                      |  |

TABLE II. EXPERIMENTAL RESULTS FOR SER REDUCTION (ISCAS-85)

 TABLE III. EXPERIMENTAL RESULTS FOR SER REDUCTION (ISCAS-89)

| Circuit           | $A_{orig}$ | d <sub>orig</sub> [ps] | SER [FIT] | Algorithm | SER, $M_A, M_d = 1.1$ [FIT] | SER, $M_A, M_d = 1.2$ [FIT] | SER, $M_A$ , $M_d$ =1.3[FIT] |
|-------------------|------------|------------------------|-----------|-----------|-----------------------------|-----------------------------|------------------------------|
| <i>S</i> 298 745  |            |                        | 9.90E-6   | GY        | 1.84E-6                     | 1.01E-5                     | 1.16E-05                     |
|                   | 745        | 1431                   |           | Тор       | 2.63E-6                     | 1.70E-6                     | 2.30E-7                      |
|                   |            |                        | GA        | 1.37E-6   | 1.30E-7                     | 1.02E-12                    |                              |
|                   |            | 1754                   | 8.35E-6   | GY        | 1.82E-6                     | 1.10E-6                     | 1.13E-6                      |
| S344 75           | 753        |                        |           | Тор       | 6.15E-6                     | 3.92E-6                     | 9.35E-7                      |
|                   |            |                        |           | GA        | 1.07E-6                     | 5.33E-7                     | 1.70E-9                      |
| <i>S</i> 386 599  |            |                        | 1.08E-5   | GY        | 1.12E-5                     | 9.70E-7                     | 6.00E-7                      |
|                   | 599        | 1190                   |           | Тор       | 6.26E-6                     | 2.17E-6                     | 4.80E-6                      |
|                   |            |                        |           | GA        | 2.92E-6                     | 7.70E-7                     | 0.0                          |
| <i>S</i> 526 1137 |            | 1372                   |           | GY        | 1.07E-5                     | 2.58E-7                     | 1.70E-7                      |
|                   | 1137       |                        | 1.06E-5   | Тор       | 4.77E-8                     | 0.0                         | 0.0                          |
|                   |            |                        |           | GA        | 2.05E-8                     | 0.0                         | 0.0                          |
|                   |            | 1835                   | 2.69E-5   | GY        | 2.99E-6                     | 7.55E-7                     | 4.30E-7                      |
| S641              | 1019       |                        |           | Тор       | 2.86E-6                     | 4.06E-7                     | 1.90E-7                      |
|                   |            |                        |           | GA        | 1.63E-6                     | 1.95E-7                     | 0.0                          |
|                   |            | 1821                   | 6.11E-5   | GY        | 2.96E-6                     | 3.06E-6                     | 2.56E-7                      |
| S838              | 1840       |                        |           | Тор       | 9.10E-6                     | 1.02E-6                     | 0.0                          |
|                   |            |                        |           | GA        | 1.46E-6                     | 2.20E-7                     | 0.0                          |
|                   |            | 1837                   | 2.19E-5   | GY        | 1.64E-6                     | 6.28E-7                     | 3.76E-7                      |
| <i>S</i> 953      | 2099       |                        |           | Тор       | 1.30E-5                     | 1.39E-6                     | 5.59E-7                      |
|                   |            |                        |           | GA        | 1.42E-6                     | 5.18E-7                     | 6.89E-8                      |
|                   |            |                        |           | GY        | 3.07E-6                     | 1.22E-6                     | 9.18E-7                      |
| S1196             | 2469       | 1609                   | 2.39E-5   | Тор       | 9.76E-6                     | 7.73E-6                     | 1.56E-6                      |
|                   |            |                        |           | GA        | 3.07E-6                     | 1.00E-6                     | 4.53E-7                      |
|                   |            | 1624                   | 2.06E-5   | GY        | 2.78E-6                     | 1.56E-6                     | 1.33E-6                      |
| S1238             | 2468       |                        |           | Тор       | 7.86E-6                     | 5.29E-6                     | 1.19E-6                      |
|                   |            |                        |           | GA        | 2.40E-6                     | 1.22E-6                     | 8.28E-8                      |

[20] and the effective injection rate of a neutron to the target device is  $2.2 \cdot 10^{-5}$  [21]. Assume that the POs not driven by any internal F/Fs are the inputs of outside F/Fs and its setup plus hold time of the clock transition is defined as 221.5ps. Thus, all types of soft errors to be evaluated are appeared at the inside and the outside F/Fs.

"GY" refers to the conventional greedy algorithm which preferably traverses the logic gates with a larger *ISER*. 'Top' denotes the *ISER*-based topological traversal introduced in [16]. GA denotes the hybrid GA introduced in Section III, with parameters,  $t_max=7$ ,  $pop_size=25$ ,  $P_c=0.2$ ,  $P_m=0.3$ ,  $P_{mp}=0.06$  and  $iP_{mp}=0.2$ . Each experiment for the benchmark circuit was further extended with margins,  $M_A$  and  $M_d$ . (i.e., 10%, 20% and 30% margins of the original delays and circuit area were allowed in the experiments) Thus, the results indicate how the amount of soft errors can be reduced by the given method where additional design cost increases. Again, Table IV shows the summary of SER mitigation levels between these three different algorithms, normalized to 'Greedy' and the initial design according to  $M_A$  and  $M_d$ . Since each benchmark circuit has a different logic gate size, we used different weights for the circuit size in the evaluation, accordingly.

As shown in the experiments, the conventional greedy algorithm provides the better solution in the lower  $M_A$  and  $M_d$  than the topological traversal. This algorithm, however cannot protect the local optimum in former searches and might degrade the SER reduction rate later on in the gate traversal process. Because gate sizing is performed more frequently in the case of high  $M_A$  and  $M_d$ , it is easy to

falsely size the logic gates and, consequently, the total SER might be increased or nearly unchanged. We can see such similar cases in C17, C1355, C2670, S298, S344 and S1238. On the other hand, the reverse topological traversal gains a greater reduction of the SER than the greedy approach, even if there is some degradation in the reduction of SER in the case of small design margins. It produces better solutions against the initial design in all of the experiment results listed in Table II and III. By requiring more computing power, the hybrid GA including local searchers mitigates 97.6% of the SER at 30% delay margins. There are the results with zero SERs in the cases of S386, S526, S641 and S838. These results mean very small numbers in SER due to the discrete steps of collected charge q in these experiments. Although the number of individuals and the number of the total generations is quite small, the local optimizers (reduce circuit area and reduce ISER) refine the newly created chromosomes successively in a generation and finally improve the soft error tolerance of the overall population.

Fig. 8 shows the efficiency of the given algorithms as the design constraints vary. The results show how amount of SER can be reduced by increasing the circuit size and delay budgets. SER<sup>-</sup> for a specific design constraint denotes SER for the lesser constraint. On the other hand, SER<sup>+</sup> is the improved SER by the use of the given reduction technique at the extended design constraint. A ratio of SER<sup>+</sup> to SER<sup>-</sup> for a given reduction technique means how amount of reduction in SER can be achieved by the additional costs for

TABLE IV. SUMMARIZED CIRCUIT OPTIMIZATION RESULTS

| Heuristics | Norma | alized to G | Normalized to Initial<br>SER [%] |      |      |      |
|------------|-------|-------------|----------------------------------|------|------|------|
| $M_A, M_d$ | 1.1   | 1.2         | 1.3                              | 1.1  | 1.2  | 1.3  |
| GY         | 100.0 | 100.0       | 100.0                            | 32.3 | 25.3 | 23.0 |
| Тор        | 624.7 | 170.0       | 85.5                             | 48.1 | 24.6 | 13.8 |
| GA         | 63.1  | 37.3        | 11.0                             | 18.8 | 7.6  | 2.4  |

"Normalized to GY" summarizes the relative SER [%] obtained by each algorithm compared to the greedy approach.

"Normalized to Initial SER" denotes the reduction of the SER that can be achieved by each algorithm compared to the un-optimized design. The initial design has 100% of its own SER in this case.



Design overheads

Figure 8. SER improvement according to design overheads

delay and the circuit size.

For example, a SER<sup>+</sup>/SER<sup>-</sup> ratio of 20% design overhead means the ratio of the SER at  $M_A, M_d=1.2$  to that at  $M_{4}, M_{d}=1.1$ . By comparing such numbers between the candidate algorithms, we can check the SER improvement in 20% design margin against in the case of 10% margins allowed. Note that the initial SER of the input target design was used as SER<sup>-</sup> at  $M_A, M_d = 1.1$ . The numerical values were also adjusted for each circuit size, as applied in Table IV. Consequently, the results show the benefit for extension of marginal constraint in GA and topological search algorithms. The topological traversal and proposed GA has robust optimization performance, independent of the degree of the design constraints. However, it is clear that the greedy approach suffers from increasing inefficiency as the design margins increase. The greedy algorithm has superior performance at the 10% margins of the initial design constraint, but is severely degraded when the margins become more than 20%. This means that we cannot expect a large reduction in SER with the greedy approaches, even if we allow more design budget for the circuit size and critical path delays.

The topological traversal takes a slightly faster execution time than the greedy algorithm, because the block SER must be re-evaluated at the end of the greedy algorithm. For the incremental optimization, the proposed local search does not need to re-evaluate the SER. The hybrid GA, which runs iteratively up to total 7 generations with 25-sized population, requires a computing time approximately 70-100 times greater than the single run time for the local heuristics (greedy and topological algorithms) in the experiments.

#### V. CONCLUSION

In a perspective of device and system reliability, SETs of complex logic integrated circuits still give a great concern for future technology nodes. High density of transistor integration confronts increasing the number of soft error occurrence even in terrestrial region. This paper describes a gate-sizing technique for the reduction of the SER using a hybrid genetic algorithm. The proposed algorithm can be easily applied to today's cell based design flow, which exploits table-based CMOS delay model such as CCS (Composite Current Source) and NLDM (Non-Linear Delay Model) in the gate-level analysis. Since gate-level optimization techniques are also generally used to reduce the delay, power and circuit area, the SER would be another design metric which can be refined at this level of abstraction. The conventional greedy algorithm does not reflect the variation of the propagation delay originating from the transition time for SETs and load capacitances and, therefore the individual SER originating from the logic gate cannot be preserved during the optimization. The topological gate traversal in the GA enables incremental optimization, which maintains the former search results while improving its efficiency at the large margins defined in the design specification. In conjunction with the crossover, mutation and selection operators, this hybrid approach gives more than 97% reduction in the SER on average whereas allowing for 30% design overhead in the experiments.

Based on the current cell-based analysis and optimization results, our future work will be further extended to study the register transfer level reduction techniques for transient faults and soft errors. This will be accompanied by developing the system-level metric to evaluate soft error susceptibility for digital systems.

#### REFERENCES

- [1] J. L. Leray, "Effects of atmospheric neutrons on devices, at sea level and in avionics embedded systems," Microelectronics Reliability, 2007. [Online]. vol.47, pp. 1827-1835, Available: doi: 10.1016/j.microrel.2007.07.101
- [2] R. C. Baumann, "Soft Errors in Commercial Integrated Circuits," Int. J. of High Speed Electronics and Systems. vol. 14, no.2, pp. 299-309, 2004. [Online]. Available: doi:10.1142/S0129156404002363 E. Normand, "Single Event Effects in Avionics and On The Ground,"
- [3] Int. J. of High Speed Electronics and Systems. vol. 14, no. 2, pp. 285-298, 2004. [Online]. Available: doi: 10.1142/S0129156404002351
- P. Shivakumar, M. Kistler, S. W. Keckler, D. Burger and L. Alvisi, [4] "Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic," Proc. of Int. Conf. on Dependable Systems and Networks, pp. 389-398, 2002. [Online]. Available: doi: 10.1109/DSN.2002.1028924
- M. Zhang and N. R. Shanbhag, "Soft-Error-Rate-Analysis (SERA) [5] Methodology," IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 25, no. 10, pp. 2140-2155, 2006. [Online]. Available: doi:10.1109/TCAD.2005.862738
- R. R. Rao, K. Chopra, and D. T. Blaauw, "Computing the Soft Error [6] Rate of a Combinational Logic Circuit Using Parameterized Descriptors," IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 3, pp. 468-479, 2007. [Online]. Available: doi: 10.1109/TCAD.2007.891036
- [7] J. K. Park, H. S. Choi and J. T. Kim, "A Soft Error Analysis Tool for High-Speed Digital Designs," Proc. of 2nd Int. conf. on Ubiquitous Information Management and Communication, pp. 280-282, 2008. [Online]. Available: doi:10.1145/1352793.1352849
- S. Kwon, J. K. Park and J. T. Kim, "An approximated soft error analysis technique for gate-level designs," IEICE Electronics Express, [8] vol. 11, no. 10, pp. 1-7, 2014. [Online]. Available: doi: 10.1587/elex.11.20140224
- H-K. Peng, C. H-P. Wen and B. Jayanta, "On soft error rate analysis [9] of scaled CMOS designs: a statistical perspective," Proc. of IEEE/ACM Int'l Conf. on Computer-Aided Design, pp. 157-163, 2009. [Online]. Available: doi:10.1145/1687399.1687428
- [10] Y-H. Kuo, H-K. Peng and C. H-P. Wen, "Accurate statistical soft error rate (SSER) analysis using a quasi-Monte Carlo framework with

quality cell models." Proc. of 11th Int'l symposium on Quality Electronic Design (ISQED), pp. 831-838, 2010. [Online]. Available: doi:10.1109/ISQED.2010.5450485

- [11] A. C-C. Chang, R. H-M. Huang and C. H-P. Wen, "CASSER: A Closed-Form Analysis Framework for Statistical Soft Error Rate," IEEE Trans. on Very Large Scale Integration, vol. 21, no. 10, pp. 1837-1848, 2013. [Online]. Available: doi:10.1109/TVLSI.2012.2220 386
- [12] N. M. Zivanov and D. Marculescu, "MARS-C: Modeling and Reduction of Soft Errors in Combinational Circuits," Proc. of Design Automation Conference, pp. 767-772, 2006. [Online]. Available: doi:10.1109/DAC.2006.229323
- Q. Zhou and K. Mohanram, "Cost-Effective Radiation Hardening [13] Technique for Combinational Logic," Proc. of Int. conf. on Computer Aided Design, pp. 100-106, 2004. [Online]. Available: doi: 10.1109/ICCAD.2004.1382551
- [14] H. Asadi and M. Tahoori, "Soft error hardening for logic-level designs," Proc. of IEEE Symp. on In Circuits and Systems, 2006. [Online]. Available: doi:10.1109/ISCAS.2006.1693540
- Y. S. Dhillon, A. U. Diril, A. Chatterjee, and A. D. Singh, "Analysis [15] and Optimization of Nanometer CMOS Circuits for Soft-Error Tolerance," IEEE Trans. on Very Large Scale Integration Systems, vol. 14, no. 5, pp. 514-524, 2006. [Online]. Available: doi: 10.1109/ TVLSI.2006.876104
- [16] J. K. Park and J. T. Kim, "A soft error mitigation technique for constrained gate-level designs," IEICE Electronics Express, vol. 5, 698-704, 2008. [Online]. no. 18, pp. Available: doi: 10.1587/elex.5.698
- [17] J. K. Park and J. T. Kim, "A Cell Sizing Technique for Mitigating Logic Soft Errors in Gate-level Designs," Advances In Electrical and Computer Engineering J., vol. 13, no. 4, pp. 13-18, 2013. [Online]. Available: doi: 10.4316/AECE.2013.04003
- [18] M. Gen and R. Cheng, "Genetic Algorithms and Engineering Optimization", pp.1-39, John Wiley & Sons, 2000. [19] M. Gen and R. Cheng, "Genetic Algorithms and Engineering
- Design", pp. 31-33, John Wiley & Sons, 1997. J. F. Ziegler, "Terrestrial cosmic rays," IBM J., vol. 40, no. 1, pp. 19-
- [20] 39, 1996. [Online]. Available: doi: 10.1147/rd.401.0019
- P. Hazucha, and C. Svensson, "Impact of CMOS technology scaling [21] on the atmospheric neutron soft error rate," IEEE Trans. on Nuclear Science, vol. 47, no. 6, pp. 2586-2594, 2000. [Online]. Available: doi:10.1109/23.903813
- [22] B. Zhang, W. Wang, and M. Orshansky, "FASER: Fast Analysis of Soft Error Susceptibility for Cell-Based Designs," Proc. of 7th Int. Symp. on Quality Electronic Design, pp. 755-760, 2006. [Online]. Available: doi:10.1109/ISQED.2006.64