# Circuit Layout and Yield

## **CHARLES KOOPERBERG**

Abstract: In this article the relation between circuit layout and yield is studied. A well-known yield formula is extended to a more general yield formula, in which not necessarily every defect is fatal. A defect sensitivity function is discussed. This function contains, in principle, all information about a chip layout necessary to calculate the yield. The relation between this sensitivity function and the yield formula is explained. If the defect size distribution is known, the defect sensitivity function can be used to compute an optimal shrinking factor. An example, in which several defect size distributions are used, shows that these computations are highly sensitive to the form of the defect size distribution.

## I. INTRODUCTION

Since the pioneering work of Murphy [1], several articles have been published on the modeling of the yield of integrated circuits [2]–[9]. Most articles deal mainly with the description of the yield as a function of the area of the circuit; little attention is paid to the relation between circuit layout and yield. However, it is clear that not all defects are fatal to a circuit. Whether a defect is fatal will, in general, depend both on the type of the defect and on the layout of the circuit in the area where the defect is located.

## II. YIELD FORMULAE

Several formulae for the yield, i.e., the probability that a circuit has no fatal defects, are known from the literature [1]–[9]. One of the most popular yield formulae is:

$$Y = \left(1 + AD/c\right)^{-c} \tag{1}$$

where A is the area of the circuit, D the defect density, and c a parameter. There are two justifications known for this formula:

- a) defects have a tendency to cluster; c is a parameter that indicates the degree of clustering [7], [10]; and
- b) the defect density differs from wafer to wafer and from lot to lot; its distribution is (approximately) a gamma distribution, whose shape is determined by c [11].

With c = 1, (1) becomes the well-known Seeds formula [2], and in the limit $c \to \infty$ it becomes the Poisson yield formula [1].
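The two limiting cases of (1) can be checked numerically. The following is a minimal sketch; the function name and the values chosen for A and D are illustrative assumptions, not data from this article.

```python
import math

def yield_clustered(area, defect_density, c):
    """Formula (1): Y = (1 + A*D/c)**(-c); c = math.inf gives the Poisson limit."""
    if math.isinf(c):
        return math.exp(-area * defect_density)
    return (1.0 + area * defect_density / c) ** (-c)

# Hypothetical numbers: A in cm^2, D in defects per cm^2, so that A*D = 1.
A, D = 0.5, 2.0
for c in (1.0, 2.0, 10.0, 1e6, math.inf):
    print(f"c = {c:>9}: Y = {yield_clustered(A, D, c):.4f}")
```

For fixed AD the yield decreases as c grows, so the Poisson formula is the most pessimistic member of this family.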

Manuscript received November 2, 1987; revised January 7, 1988. The author was with the Philips Research Laboratories, Eindhoven, The Netherlands. He is now with the Department of Statistics, University of California, Berkeley, CA 94720. IEEE Log Number 8821660.

Formula (1) can be extended easily to a more general yield formula, which takes into account the fact that not every defect is fatal [12]. Therefore, we shall distinguish the following two concepts:

- 1) defects are local disturbances on the surface of a wafer, with a supposed random character, for example, dust particles; and
- 2) faults are failures of the circuit, caused by defects. Not every defect causes a fault. The probability that two or more defects cause a fault together (but that none of them causes a fault individually) is neglected.

Typically the layout of a circuit is seen to consist of several parts, each with its own density of structure (e.g., in memories there are the peripheral structure and the matrix of cells). It is intuitively clear that the effect of a similar defect on the chip is strongly dependent on where the defect is located. A defect located in a region of the chip with much detailed structure has a larger probability of causing a fault than a defect which is located in a region with less structure. Therefore the chip will be divided into J chip regions $R_j$, with area $A_j$ $(j=1,\dots,J,\ \sum_j A_j=A)$, in such a way that within each chip region the density of structure is about constant. It is assumed that within each chip region $R_j$ every defect has a constant probability $P_j$ of causing a fault. Chip regions do not necessarily need to be connected. For example, in a memory chip one could define three chip regions: the memory cells, the peripheral structure, and the area in between the cells.

To incorporate this into (1), further assumptions about the distribution of the defects have to be made. In the light of a) and b) above, it does not seem natural to assume that the presence of defects in different chip regions is independent (although this would be computationally easy). It seems more natural to assume that the distribution of the defects over the whole chip, which underlies (1), remains valid in this case. Therefore, in (1) AD is replaced by $\sum_j A_j D P_j$: the sum reflects the differences between the chip regions, and replacing D by $DP_j$ reflects the fact that not every defect is fatal. Formula (1) now becomes

$$Y = \left(1 + \sum_{j} A_{j} D P_{j}/c\right)^{-c}. \tag{2}$$

This formula is only valid for one type of defect. When there are N types of defects, this formula becomes

$$Y = \prod_{i=1}^{N} \left( 1 + \sum_{j} A_{j} D_{i} P_{ij} / c_{i} \right)^{-c_{i}}. \tag{3}$$

We replace D and c by  $D_i$  and  $c_i$ , since each type of defect will have a different density, and we replace the  $P_j$  by  $P_{ij}$  since each type of defect might have a different chance of causing a fault. The product follows from the fact that a chip will only work if none of the defect types causes a fault, and the fact that different types of defects are assumed to be independent.

Note that, of the parameters of (3), only the $P_{ij}$ are dependent on the circuit layout. We will examine these $P_{ij}$ more closely in the following section.
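Formula (3) is straightforward to evaluate once the $A_j$, $D_i$, $P_{ij}$, and $c_i$ are available. The sketch below is one possible implementation; the function name and the three-region memory-chip numbers are hypothetical and serve only to show the bookkeeping over defect types i and chip regions j.

```python
import math

def yield_multi(areas, densities, fault_probs, cluster):
    """Formula (3).

    areas[j]          = A_j   (J chip regions)
    densities[i]      = D_i   (N defect types)
    fault_probs[i][j] = P_ij
    cluster[i]        = c_i   (math.inf selects the Poisson limit)
    """
    y = 1.0
    for d_i, p_i, c_i in zip(densities, fault_probs, cluster):
        # Expected number of faults of type i: sum_j A_j * D_i * P_ij
        lam = sum(a_j * d_i * p_ij for a_j, p_ij in zip(areas, p_i))
        if math.isinf(c_i):
            y *= math.exp(-lam)
        else:
            y *= (1.0 + lam / c_i) ** (-c_i)
    return y

# Hypothetical memory chip: cells, periphery, area in between (cm^2).
areas = [0.30, 0.15, 0.05]
densities = [1.0, 0.5]                 # two defect types, defects per cm^2
fault_probs = [[0.6, 0.3, 0.0],        # P_1j
               [0.2, 0.4, 0.1]]        # P_2j
cluster = [2.0, math.inf]
print(f"Y = {yield_multi(areas, densities, fault_probs, cluster):.4f}")
```

With one region, one defect type, and $P_{11} = 1$ the function reduces to formula (1).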

## III. THE DEFECT SENSITIVITY FUNCTION

In this section a defect sensitivity function is introduced. This function is a generalization of a concept introduced by Stapper [13] and Maly and Deszczka [14] (further extended by Maly [15]), and it may also be considered as an alternative to the fault probability kernel of Ferris-Prabhu [16]. The defect sensitivity incorporates the influence of a particular layout on the yield. In particular it may be used to calculate the sensitivity of a chip design to defects, and the effect of shrinking.

In the following we assume that all defects can be treated as circular defects. If we assume that defects of a certain type i in chip region  $R_j$  can be considered as the result of a homogeneous Poisson process on  $R_j$  we get

$$\begin{aligned}
P_{ij} &= P(\text{a defect of type } i \text{ in chip region } R_j \text{ causes a fault}) \\
&= \int_0^\infty p(\text{a defect of type } i \text{ has radius } r) \left[ \int_{R_j} \frac{1}{A_j} P(F_i | D_{ixr})\, dx \right] dr \\
&= \int_0^\infty p(\text{a defect of type } i \text{ has radius } r)\, g_{ij}(r)\, dr \\
&= \int_0^\infty f_i(r)\, g_{ij}(r)\, dr
\end{aligned} \tag{4}$$

where

 $F_i$  fault of type i;

 $D_{ixr}$  defect of type i with radius r centered at point x in $R_j$;

 $P(F_i | D_{ixr})$  conditional probability that a defect of type i with radius r centered at point x in $R_j$ will cause a fault of type i;

 $1/A_j$  uniform probability density on chip region $R_j$;

 $f_i(r)$  density of the distribution of the radius of defects of type i; and

 $g_{ij}(r)$  probability that a defect with radius r of type i in chip region $R_j$ will cause a fault.



Fig. 1.  $P(F_i|D_{ixr})$ : the fault type *i* considered is a break in the conductor. The chip region contains a single conductor.



Fig. 2. The defect sensitivity function  $g_{ij}(r)$  for a large number of parallel conductors. The type of fault considered is a complete break of a conductor (q=1).

The function  $g_{ij}(r)$  will be called the defect sensitivity function. It is the fraction of the area of the chip region  $R_j$  where a defect of type i with radius r causes a fault.

In some cases it is possible to derive relatively simple expressions for the defect sensitivity function  $g_{ij}(r)$  using the following relation:

$$\begin{aligned}
g_{ij}(r) &= P(\text{a defect of type } i \text{ and radius } r \text{ in } R_j \text{ causes a fault}) \\
&= \int_{R_j} \frac{1}{A_j} P(F_i | D_{ixr})\, dx.
\end{aligned} \tag{5}$$

It is reasonable to assume that if a defect with radius r located at x causes a fault, then a defect of the same type with radius r' > r at x causes a fault too. Similarly, if a defect with radius r located at x causes no fault, then a defect of the same type with radius r' < r at x causes no fault either. Thus typically there exists an r'' (depending on i and x) such that

$$P(F_i|D_{ixr}) = \begin{cases} 0, & r < r'' \\ 1, & r \geqslant r''. \end{cases} \tag{6}$$

See, for example, Fig. 1.

For simple structures and simple types of faults it is sometimes possible to calculate the defect sensitivity function directly.

The case in Fig. 2 is well known [13]–[16]. However, it is also possible to calculate $g_{ij}(r)$ for somewhat more complicated structures, e.g., Fig. 3. The computation proceeds as follows. Fix r. Determine the part of the area in which a defect of radius r, if centered there, would cause a fault of the circuit. In Fig. 3(a) and (b), for example, this means that a circular defect of radius r would cover the conductor to cause a break (the "dotted" area). Now compute what fraction of the total area $l^2$ is involved. Repeat this for all r and obtain $g_{ij}$ as a function of r. Clearly $g_{ij}(r)$ will be 0 for very small r, 1 for very large r, and nondecreasing in between.

Fig. 3. (a), (b) The defect sensitivity function $g_{ij}(r)$ for two simple patterns. The type of fault considered is a complete break of a conductor (q=1). The chip region considered is the square with side l.
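For the parallel-conductor case of Fig. 2 the procedure above leads to a simple closed form, sketched below. The sketch assumes the usual geometric argument, which is not spelled out in the text: a circular defect of radius r severs a conductor of width w only if it reaches both edges, i.e., if $r \geqslant |d| + w/2$, where d is the distance from the defect center to the center line of the conductor. The analogous rule with the spacing s in place of w is assumed for shorts. The function names are illustrative.

```python
def g_break_parallel(r, width, spacing):
    """Fraction of defect-center positions at which a circular defect of
    radius r completely severs one of many parallel conductors."""
    pitch = width + spacing
    return min(1.0, max(0.0, (2.0 * r - width) / pitch))

def g_short_parallel(r, width, spacing):
    """Same idea for a short: the defect must bridge the full spacing."""
    pitch = width + spacing
    return min(1.0, max(0.0, (2.0 * r - spacing) / pitch))

# Example dimensions in micrometers (line width 2, spacing 2.5).
for r in (0.5, 1.0, 2.0, 3.25, 5.0):
    print(r, g_break_parallel(r, 2.0, 2.5), g_short_parallel(r, 2.0, 2.5))
```

Both functions are 0 for small r, rise linearly, and saturate at 1 once the defect diameter exceeds the pitch plus the width (or spacing), in line with the general shape described above.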

In Section II we argued that $P_{ij}$ is the link between circuit design and yield. When $P_{ij}$ is smaller, region $R_j$ of the circuit is less sensitive to defects of type i. Therefore $P_{ij}$ can be seen as a measure of the sensitivity of the circuit to defects of type i in chip region $R_j$ of the design. $P_{ij}$ depends on the defect size distribution $f_i(r)$ and the defect sensitivity function $g_{ij}(r)$. Since the size distribution of the defects is independent of the circuit layout, the defect sensitivity function contains all the layout information needed for a measure of the defect sensitivity of the chip. This aspect of the defect sensitivity function deserves more attention. For example, it might be useful to study how the defect sensitivity function can be calculated by means of a simulation program.
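As a rough illustration of what such a simulation program might look like, the sketch below estimates $g_{ij}(r)$ by Monte Carlo sampling of (5): defect centers are drawn uniformly and the indicator (6) is averaged. It is not a program used in this article. It assumes many parallel conductors running along one axis, so only the transverse coordinate y of the defect center matters, and a break is counted when the defect covers the full width of a conductor. All names are hypothetical.

```python
import random

def severs_conductor(y, r, width, spacing):
    """Indicator P(F | D_xr): conductor k occupies [k*pitch, k*pitch + width]
    in the transverse direction; a defect centered at y with radius r
    breaks it if it covers the whole interval."""
    pitch = width + spacing
    k0 = int(y // pitch)
    # Checking the nearest conductors suffices: a defect that covers a more
    # distant conductor necessarily covers a nearer one as well.
    for k in (k0 - 1, k0, k0 + 1):
        lo = k * pitch
        hi = lo + width
        if y - r <= lo and y + r >= hi:
            return True
    return False

def g_monte_carlo(r, width, spacing, n=50_000, seed=0):
    """Estimate g(r) as the fraction of uniformly placed centers causing a
    break. One pitch is sampled because the pattern is periodic."""
    rng = random.Random(seed)
    pitch = width + spacing
    hits = sum(severs_conductor(rng.uniform(0.0, pitch), r, width, spacing)
               for _ in range(n))
    return hits / n

# Compare with the closed form (2r - w)/(w + s), clipped to [0, 1].
w, s = 2.0, 2.5
for r in (0.5, 1.5, 2.0, 3.0, 4.0):
    exact = min(1.0, max(0.0, (2.0 * r - w) / (w + s)))
    print(f"r = {r}: simulated {g_monte_carlo(r, w, s):.3f}, closed form {exact:.3f}")
```

For a real layout the indicator function would be replaced by a geometric test against the actual mask data; the averaging step stays the same.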

## IV. OPTIMIZATION OF THE CHIP SIZE (SHRINKING)

It is customary at a certain time in a circuit's life cycle to consider the possibility of dimensional shrinkage, i.e., reducing the size of the chip elements by the same factor.

From the previous section it follows that  $P_{ij}$  will increase due to the shrinking (i.e., the size distribution remains the same, but the defect sensitivity function will increase). On the other hand the area  $A (= \sum A_j)$  of a chip will decrease. The following questions arise:

- "how does shrinking affect the yield;" and
- "which shrinking factor optimizes the number of good dies per wafer (approximately Y/A)?"

These questions can be answered using the following procedure:

- 1) a design of a chip is given, and the features of the chip which have to be shrunk by a factor f are chosen;
- 2) a yield formula is given; among the parameters are the  $P_{ij}$   $(i=1,\cdots,N,\ j=1,\cdots,J)$  and  $A_j$   $(j=1,\cdots,J)$ ;
- 3) the  $g_{ij}(r)$  are calculated (or approximated) as a function of f;
- 4) using (4) the  $P_{ij}$  can be calculated as a function of f; and
- 5) this gives a yield formula as a function of f. Remembering that the total area of a chip depends on f too, it is now possible to calculate Y/A as a function of f.
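The procedure above can be sketched in a few lines of code. The sketch rests on several assumptions that go beyond this section: a single defect type causing breaks in many parallel conductors, the size density (9) of Section V, the Poisson limit of the yield formula, and a one-dimensional shrink so that the area is proportional to f. The parameter `area_density` (the product A·D at f = 1) and all function names are hypothetical.

```python
import math

def size_density(r, r0=0.5):
    """Defect radius density of the form (9); r0 in micrometers (assumed)."""
    return r / r0**2 if r < r0 else r0**2 / r**3

def g_break(r, width, spacing):
    """Assumed sensitivity function for breaks in parallel conductors."""
    return min(1.0, max(0.0, (2.0 * r - width) / (width + spacing)))

def fault_probability(density, g, r_min=1e-3, r_max=1e4, n=4000):
    """Formula (4) by the trapezoidal rule on a logarithmic grid
    (r = exp(u), dr = r du), which copes with the heavy tail."""
    u0, u1 = math.log(r_min), math.log(r_max)
    h = (u1 - u0) / n
    total = 0.0
    for k in range(n + 1):
        r = math.exp(u0 + k * h)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * density(r) * g(r) * r
    return total * h

def good_dice_curve(factors, width, spacing, area_density):
    """Steps 3)-5): for each shrinking factor f return (f, Y, Y/f)."""
    rows = []
    for f in factors:
        # Step 3: line width and spacing scale with f.
        p = fault_probability(size_density,
                              lambda r: g_break(r, f * width, f * spacing))
        # Steps 4 and 5: Poisson yield with area proportional to f.
        y = math.exp(-area_density * f * p)
        rows.append((f, y, y / f))
    return rows

for f, y, per_area in good_dice_curve([0.6, 0.8, 1.0, 1.2, 1.4], 2.0, 2.5, 10.0):
    print(f"f = {f:.1f}: Y = {y:.3f}, Y/f = {per_area:.3f}")
```

The factor that maximizes the last column is the optimal shrinking factor under these assumptions; a different size density or sensitivity function can be substituted without changing the structure of the computation.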

## V. EXAMPLE

We will now give an example to illustrate steps 1)–5) of Section IV.

- 1) A defect monitor consists of two layers [18] (Fig. 4). The line width in layer I is 2  $\mu$m, and the spacing 2.5  $\mu$m; in layer II the line width is 3  $\mu$m, and the spacing 2.75  $\mu$m. Four types of defects can be distinguished by electrical measurements on the defect monitor (N = 4):
  - 1) shorts in layer I;
  - 2) shorts in layer II;
  - 3) breaks in the string of layer II; and
  - 4) shorts between layers I and II.

Because of the length of the "fingers" of the defect monitor, the edge effects can be neglected (we are left with one chip region: J=1), and it is therefore assumed that both layers consist of a large number of parallel conductors.

Normally all sizes are shrunk by the same factor. However, since a defect monitor is mainly one-dimensional, the length of the fingers is kept constant. Only the line width and the spacing in both layers are made proportional to a factor f. The area of the defect monitor now becomes Af. The insulating area between layers I and II is thus also proportional to f.

Note that for the defect monitor shrinking in the second dimension would only result in shorter "fingers." The area would, therefore, become smaller. However, the defect sensitivity function would remain unchanged. In the scope of this paper this is not a very interesting situation.

2) We used yield formula (3):

$$Y = \prod_{i=1}^{4} (1 + AD_{i}P_{i}/c_{i})^{-c_{i}}.$$

Since the physical interpretation of  $c_i$  is unknown, and this constant plays no important role in the following,  $c_i = \infty$ , for all i, is chosen for computational simplicity. Formula (3) now becomes

$$Y = e^{-Af\sum_{i=1}^{4} D_i P_i} \tag{7}$$

and consequently, the number of good chips per wafer is proportional to

$$Y/f = e^{-Af\sum_{i=1}^{4} D_i P_i} / f. \tag{8}$$

Fig. 4. The defect monitor. w: line width; s: spacing; l: length.

From the yield data that were available from already produced monitors, it was possible to estimate  $D_i P_i$ , for i = 1, 2, 3, and 4.

- 3) Since it is assumed that the defect monitor consists of a large number of parallel conductors, it is simple to calculate the defect sensitivity function for defect types 1), 2), and 3). For defect type 4) it is assumed that each defect which is in the insulating area between both layers causes a fault. Therefore  $g_4(r, f) = (\text{area where there is a conductor in both layer I and layer II})/A$ , which is independent of r and f.
- 4) The calculations have been carried out for several different size distributions. One of these distributions is the size distribution that was used in [16] and [17]:

$$f_i(r) = \begin{cases} r/r_0^2, & 0 < r < r_0 \\ r_0^2/r^3, & r_0 \le r. \end{cases} \tag{9}$$

Note that the mean of distribution (9) is  $4r_0/3$  and that it has infinite variance. When  $r_0$  is assumed smaller than the smallest detail, the results of the calculations turn out to be independent of the actual value of  $r_0$ .
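The properties of (9) quoted above follow from elementary integrals; the short check below is added only as a worked verification. Its last line is a hedged reading of the remark about $r_0$: it assumes that $g_i(r, f)$ vanishes for all $r < r_0$, which is what "smaller than the smallest detail" is taken to mean here.

$$\int_0^\infty f_i(r)\,dr = \int_0^{r_0} \frac{r}{r_0^2}\,dr + \int_{r_0}^\infty \frac{r_0^2}{r^3}\,dr = \frac{1}{2} + \frac{1}{2} = 1,$$

$$E[r] = \int_0^{r_0} \frac{r^2}{r_0^2}\,dr + \int_{r_0}^\infty \frac{r_0^2}{r^2}\,dr = \frac{r_0}{3} + r_0 = \frac{4 r_0}{3},$$

$$E[r^2] \geqslant \int_{r_0}^\infty r^2\,\frac{r_0^2}{r^3}\,dr = r_0^2 \int_{r_0}^\infty \frac{dr}{r} = \infty,$$

$$P_i(f) = \int_{r_0}^\infty \frac{r_0^2}{r^3}\, g_i(r, f)\,dr = r_0^2 \int_0^\infty \frac{g_i(r, f)}{r^3}\,dr.$$

Under that assumption $r_0$ enters $P_i$ only through the common factor $r_0^2$, so ratios such as $P_i(f)/P_i(1)$ do not depend on it.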

Fig. 5. Two lognormal densities and the density from (9) compared.

Because literature from other disciplines [19], [20] suggests that the radius of dust or smoke particles is lognormally distributed, the calculations were also carried out with lognormal distributions. The density of the lognormal distribution is

$$f_i(r) = \frac{1}{\gamma r} \phi \left[ \log \left( r/\beta \right) / \gamma \right] \tag{10}$$

with $\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ the standard normal density.

The mean of this distribution is $\beta e^{\gamma^2/2}$, and the variance is $\beta^2 e^{\gamma^2} (e^{\gamma^2} - 1)$. We chose two lognormal distributions, one with mean 0.5 $\mu$m and variance 0.25 $\mu$m<sup>2</sup> and one with mean 0.5 $\mu$m and variance 1 $\mu$m<sup>2</sup> (compared with line widths of 2 and 3 $\mu$m and spacings of 2.5 and 2.75 $\mu$m). In Fig. 5 the densities of the three distributions that were chosen are compared. (Note that the vertical axis is logarithmic.) Clearly the important parts to look at are the tails of the distributions. (Defects with radius 0.1 $\mu$m will not do much harm.) The distribution from (9) has the heaviest tail, while the lognormal distributions have lighter tails. Heavier tails correspond to many large defects. It is not clear what the actual distribution is.
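The parameters $\beta$ and $\gamma$ that correspond to a chosen mean and variance follow by inverting the two expressions above: $\gamma^2 = \log(1 + \text{variance}/\text{mean}^2)$ and $\beta = \text{mean} \cdot e^{-\gamma^2/2}$. The sketch below carries this out for the two lognormal distributions of the example; the function names are illustrative.

```python
import math

def lognormal_params(mean, variance):
    """Invert mean = beta*exp(gamma^2/2), variance = mean^2*(exp(gamma^2) - 1)."""
    gamma_sq = math.log(1.0 + variance / mean**2)
    beta = mean * math.exp(-gamma_sq / 2.0)
    return beta, math.sqrt(gamma_sq)

def lognormal_density(r, beta, gamma):
    """Density (10) with phi the standard normal density."""
    x = math.log(r / beta) / gamma
    return math.exp(-0.5 * x * x) / (math.sqrt(2.0 * math.pi) * gamma * r)

for variance in (0.25, 1.0):          # both with mean 0.5 micrometers
    beta, gamma = lognormal_params(0.5, variance)
    print(f"variance {variance}: beta = {beta:.4f}, gamma = {gamma:.4f}, "
          f"density at r = 2: {lognormal_density(2.0, beta, gamma):.3e}")
```

For equal means, the larger variance gives a smaller $\beta$ and a larger $\gamma$, which is what moves probability mass into the tail.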

The same size distribution was taken for all types of defects. In fact not much is known about size distributions. However, it can be seen from the following results that the defect size distribution is a very important factor in yield calculations. Therefore we strongly agree with [18] that the size distribution of defects is a factor that deserves more attention.

Using the results of Section V-3 it is now possible to calculate  $P_i$  for i = 1, 2, 3, and 4.

5) Combining Sections V-2 and V-4, expressions for the yield and Y/f as a function of f are obtained. The results can be found in Figs. 6 and 7. From Fig. 7 it is easy to conclude what the optimal shrinking factor (optimal in the sense that the number of good dice per wafer is maximized) would be, if the defect size distribution is, approximately, known. However, from this example it is clear that this optimal shrinking factor strongly depends on the defect size distribution. When it is assumed that the defect size distribution is the one from (9), it would be optimal to shrink the chip further. When the defect size distribution is a lognormal with mean 0.5 $\mu$m and variance 0.25 $\mu$m<sup>2</sup>, the present dimensions of the chip would be nearly optimal. However, when the size distribution is lognormal with mean 0.5 $\mu$m and variance 1.00 $\mu$m<sup>2</sup>, the conclusion would be that the chip has already been shrunk too much; the number of good dice per wafer would increase if the dimensions of the chip were increased.

Fig. 6. The yield as a function of shrinking factor f. The factor f is linear with line width and spacing.

Fig. 7. The number of good dice per wafer as a function of shrinking factor f. The factor f is linear with line width and spacing.

It is not our purpose to draw any conclusions from Figs. 6 and 7 about the defect monitor itself. The defect monitor is a research chip, and the production circumstances were different each time (this is probably why it seems that it is optimal to enlarge the dimensions of some details). The example is only meant to illustrate how the defect sensitivity function can be used to calculate an optimal shrinking factor. However, one thing is clear from Figs. 6 and 7: depending on the size distribution of the defects we would draw completely different conclusions.

In this respect it is good to realize that we could have "inverted" the computations. By producing defect monitors with different widths and different spacings, under otherwise identical conditions, one can make inferences about size distributions.

## VI. SUMMARY AND CONCLUSIONS

We have extended a generally known yield formula to a more comprehensive yield formula which includes the fact that not every defect is fatal to the circuit.

We discussed a defect sensitivity function, which depends only on the design of the chip. It was explained how this function is a tool to obtain a measure of the sensitivity to defects of a chip design. An example was given to illustrate how the defect sensitivity function can be used to optimize the sizes of the components of a chip.

It appears that the distribution of the sizes of the defects is a very important factor in yield calculations. (Very) little is known about these size distributions. It might be possible to extract more information about size distributions, using, for example, defect monitors with different line widths and spacings.

## ACKNOWLEDGMENT

The author wishes to thank Dr. F. M. Dekking, Dr. W. Th. F. den Hollander, Prof. M. S. Keane (Delft University of Technology), Dr. W. J. J. Rey, F. Camerick, Dr. R. G. M. Penning de Vries, and especially H. J. Prins (Philips Research Laboratories), for their valuable contributions and comments.

## REFERENCES

- [1] B. T. Murphy, "Cost-size optima of monolithic integrated circuits," *Proc. IEEE*, vol. 52, pp. 1537–1545, 1964.
- [2] R. B. Seeds, "Yield, economic and logistical models for complex digital arrays," in *IEEE Int. Conv. Rec.*, part 6, 1967, pp. 60–61.
- [3] A. G. F. Dingwall, "High-yield-processed bipolar LSI arrays," in *IEDM Tech. Dig.*, 1968.
- [4] T. Okabe, M. Nagate, and S. Shimada, "Analysis of integrated circuits and a new expression for the yield," *Elec. Eng. Japan*, vol. 92, pp. 135–141, 1972.
- [5] R. M. Warner, "Applying a composite model to the yield problem," *IEEE J. Solid-State Circuits*, vol. SC-9, pp. 96–103, 1974.
- [6] C. H. Stapper, "On a composite model to the IC yield problem," *IEEE J. Solid-State Circuits*, vol. SC-10, pp. 537–539, 1975.
- [7] C. H. Stapper, "LSI yield and process monitoring," *IBM J. Res. Dev.*, vol. 20, pp. 228–234, 1976.
- [8] O. Paz and T. R. Lawson, "Modifications of Poisson statistics: Modeling defects induced by diffusion," *IEEE J. Solid-State Circuits*, vol. SC-12, pp. 540–546, 1977.
- [9] S. M. Hu, "Some considerations in the formulation of IC yield statistics," *Solid-State Electron.*, vol. 22, pp. 205–211, 1979.
- [10] C. H. Stapper, F. M. Armstrong, and K. Saji, "Integrated circuit yield statistics," *Proc. IEEE*, vol. 71, pp. 453–470, 1983.
- [11] S. C. Seth and V. D. Agrawal, "Characterizing the LSI yield equation from wafer test data," *IEEE Trans. Computer-Aided Des.*, vol. CAD-3, pp. 123–125, 1984.
- [12] C. H. Stapper, "The effect of wafer to wafer and defect density variations on integrated circuit defect and fault distributions," *IBM J. Res. Dev.*, vol. 29, pp. 87–97, 1985.
- [13] C. H. Stapper, "Modeling of integrated circuit defect sensitivities," *IBM J. Res. Dev.*, vol. 27, pp. 549–557, 1983.
- [14] W. Maly and J. Deszczka, "A yield estimation model for VLSI artwork evaluation," *Electron. Lett.*, vol. 19, no. 6, pp. 226–227, 1983.
- [15] W. Maly, "Modeling of lithography related yield losses for CAD of VLSI circuits," *IEEE Trans. Computer-Aided Des.*, vol. CAD-4, pp. 166–177, 1985.
- [16] A. V. Ferris-Prabhu, "Modeling the critical area in yield forecasts," *IEEE J. Solid-State Circuits*, vol. SC-20, pp. 874–880, 1985.
- [17] A. V. Ferris-Prabhu, "Role of defect size distribution in yield modeling," *IEEE Trans. Electron Devices*, vol. ED-32, pp. 1732–1736, 1985.
- [18] A. C. Ipri and J. C. Sarace, "Integrated circuit process and design rule evaluation techniques," *RCA Rev.*, vol. 38, pp. 323–350, 1977.
- [19] S. A. Roach, *The Theory of Random Clumping*. London: Methuen, 1968.
- [20] E. E. Underwood, *Quantitative Stereology*. Reading, MA: Addison Wesley, 1970.



Charles Kooperberg received the masters degree in mathematical engineering in 1985 from the Delft University of Technology, Delft, The Netherlands, while working with the Philips Research Laboratories, Eindhoven, The Netherlands, on yield models. He is currently working toward the Ph.D. degree in statistics at the University of California at Berkeley.