Two-level experiments for binary response data Article in Computational Statistics & Data Analysis · September 2008 DOI: 10.1016/j.csda.2008.07.013 · Source: RePEc
Computational Statistics and Data Analysis 53 (2008) 196–208
Two-level experiments for binary response data

Roberto Dorta-Guerra a, Enrique González-Dávila a, Josep Ginebra b,∗

a Department of Statistics, University of La Laguna, 38271 La Laguna, Spain
b Department of Statistics, Technical University of Catalonia, 08028 Barcelona, Spain

Article info

Article history: Received 3 August 2007. Received in revised form 8 July 2008. Accepted 8 July 2008. Available online 15 July 2008.
Abstract

Information in a statistical experiment is often measured through the determinant of its information matrix. Under first order normal linear models, the determinant of the information matrix of a two-level factorial experiment depends neither on where the experiment is centered nor on how it is oriented, and balanced allocations are more informative than unbalanced ones with the same number of runs. In contrast, under binary response models, none of these properties holds. The performance of two-level experiments for binomial responses is explored by investigating the dependence of the determinant of their information matrix on their location, orientation, range, presence or absence of interactions and on the relative allocation of runs to support points, and in particular, on the type of fractionating involved. Conventional wisdom about two-level factorial experiments, which is deeply rooted in normal response models, does not apply to binomial models. In binary response settings, factorial experiments should not be used for screening or as building blocks for binary response surface exploration, and there is no alternative to the optimal design theory approach to planning experiments. © 2008 Elsevier B.V. All rights reserved.
∗ Tel: +34 934011728; fax: +34 934016575. E-mail address: [email protected] (J. Ginebra).

1. Introduction

There are two basic approaches when choosing an experimental design in practice. One possibility is to plan for simple standard designs, like two- or three-level factorial experiments in the factors under study, while a second possibility is to choose an experiment that maximizes a given measure of the information, like the determinant of its information matrix. Conventional wisdom states that two-level factorial experiments and their variants are to be favored in the early screening stages of the investigation, and as building blocks for response surface exploration, the most frequent scenario in industrial practice, while the second approach is better suited to statistically more structured settings, like when one plans an experiment to learn about the parameters of a given nonlinear model.

When the response of interest can be conveniently modeled through first order normal linear regression models, these two approaches tend to agree, because two-level factorial experiments are either optimal, or close to optimal, among all experiments with the same sample size, for a broad class of experimental regions and for most sensible design optimality criteria.

Two-level factorial experiments are also popular for experiments with binary responses. For example, Hamada and Nelder (1997) analyze a $2^{4-1}$ fractional factorial experiment to learn about the effect of film thickness, oil mixture, type of gloves and type of metal blank on the number of parts that are classified as good. Bisgaard and Fuller (1995a) consider the case of a grinding process where one is interested in learning about the effect of blade size, centering, leveling and speed on the presence or absence of undesirable marks on steel samples, and where the experiment used is a $2^4$ factorial experiment. Many other examples of the use of two-level factorial designs for screening and as building blocks for binary response surface exploration can be found in Bisgaard and Fuller (1995b), Myers and Montgomery (1997, 2002), Bisgaard and Gertsbakh (2000), Wu and Hamada (2000), Myers and Montgomery (2002), Woods et al. (2006), González-Dávila et al. (2007a) and in references therein.

It turns out, though, that the design issues that arise when planning two-level factorial experiments for non-normal response data are much more complicated. When determining the information carried by these experiments, the role played by the location of their center point, by their range, by their orientation relative to the contour lines of the response surface, by the relative allocation of runs to support points, and by the presence or absence of interactions is much more complicated than for normal responses, and the orthogonality of the corresponding experimental design matrix is no longer necessarily desirable.

In this manuscript, we search for experiments maximizing the determinant of the information matrix for binary response models among two-level experiments centered at a given point, assuming no restriction on the experimental region. By searching across center points, we then find the overall D-optimal two-level experiment for the binomial model under consideration. It is a remarkable fact that, even though for more than one design variable the determinant of the information matrix can be made arbitrarily large by choice of an experiment, when one restricts consideration to two-level experiments that is no longer necessarily the case.

Section 2 presents the models for binomial response experiments, and Section 3 explains how one can evaluate the D-optimal design criterion for them.
Sections 4–6 deal with the performance of one-, two- and three-factor two-level experiments, all under binomial main effects models, and illustrate how conventional wisdom about two-level factorials, which is deeply rooted in normal response models, does not apply to binomial response models. It is also found that the amount of information attainable through two-level experiments depends strongly on the binomial model under consideration. In Section 7 it is found that with interactions in the model, the best two-level experiments are balanced complete factorials centered on the intersection of the asymptotes of the contour levels of the response surface. Note that by $k$-factor two-level experiments the paper refers to any experiment supported on the vertices of a hyper-rectangle, oriented so that its sides are parallel to the Cartesian axes, and that this covers both equally weighted and unequally weighted allocations.

When reading this manuscript keep in mind that it is intended much more as a warning against the use of two-level experiments for binary response data than as a guide on how to choose one of these experiments. Except under normal response models, equally weighted two-level factorial and the usual fractional factorial experiments do not have any of the model-based optimality properties one has come to expect from them, and they should not be used by default for screening or as building blocks for response surface exploration. When planning experiments for binary response data, there is no alternative to the optimal design theory approach, searching for experiments that maximize a given measure of the information, often with the restriction that the support points have to be in a given experimental region.

2. Generalized linear models for binary response

Often, quality improvement experiments on manufacturing processes assess quality through pass or fail inspection of the articles produced, and bioassay studies estimate dose response curves of drugs through binary response experiments. In such experiments $n_i$ articles or subjects are tested at levels $x_i = (x_{1i}, \ldots, x_{ki})$ of $k$ design variables or drugs for $i = 1, \ldots, q$, and the outcome is binary. Usually the total number of articles or subjects, $n = \sum_{i=1}^{q} n_i$, is specified from the start and one assumes that the numbers of successes $y_i$ among the $n_i$ subjects tested under $x_i$ are conditionally independent binomial random variables with probability of success $p(x_i; \beta)$. Other than in Section 7, the relationship between $y_i$ and $x_i$ is modeled through a main effects model, with

$$p(x_i; \beta) = F(\beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki}) = F(z_i), \tag{1}$$
where $F(\cdot)$ is a known function satisfying the properties of a strictly increasing cumulative distribution function. That is, it is assumed that there exists a known link function, $F^{-1}(\cdot)$, such that $z_i = F^{-1}(p(x_i; \beta))$ is an unknown linear combination of $(x_{1i}, \ldots, x_{ki})$. The set of design points $x_i$ that take the same expected response surface value $p(x_i; \beta) = F(z)$ for a fixed and given $z$ is called the equal $z$-dose set, abbreviated as $ED_{F(z)}$. For example, the $ED_{.5}$ set is the one that contains all design points $x_i$ such that the probability of success $p(x_i; \beta)$ is equal to .5. Under the model in (1) all equal dose sets are hyperplanes.

Under this model the Fisher information matrix of the experiment can be written as

$$I(\beta) = n \sum_{i=1}^{q} \lambda_i h(z_i) \begin{pmatrix} 1 & x_{1i} & \cdots & x_{ki} \\ x_{1i} & x_{1i}^2 & \cdots & x_{1i} x_{ki} \\ \vdots & \vdots & \ddots & \vdots \\ x_{ki} & x_{1i} x_{ki} & \cdots & x_{ki}^2 \end{pmatrix}, \tag{2}$$

where $z_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki}$, where $\lambda_i = n_i/n$ is the relative allocation of runs to support points, and where $h(z_i) = F'(z_i)^2 / (F(z_i)(1 - F(z_i)))$. Table 1 lists the $h(z_i)$ that correspond to the link functions most frequently used in practice.
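As a rough illustration of (2), the following Python sketch assembles the Fisher information matrix of a small design under the logit link. The function names `h_logit` and `information_matrix`, as well as the example design, parameter values and run size, are hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def h_logit(z):
    """h(z) = F'(z)^2 / (F(z)(1 - F(z))); for the logit link this equals F(z)(1 - F(z))."""
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)


def information_matrix(points, weights, beta, n, h=h_logit):
    """Fisher information as in (2): n * sum_i lambda_i h(z_i) (1, x_i)^T (1, x_i)."""
    k = len(beta) - 1
    info = [[0.0] * (k + 1) for _ in range(k + 1)]
    for x, lam in zip(points, weights):
        row = [1.0] + list(x)                        # (1, x_1i, ..., x_ki)
        z = sum(b * v for b, v in zip(beta, row))    # linear predictor z_i
        w = n * lam * h(z)
        for r in range(k + 1):
            for c in range(k + 1):
                info[r][c] += w * row[r] * row[c]
    return info


# Hypothetical example: balanced 2^2 factorial at coded levels with n = 40 runs.
points = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
weights = [0.25] * 4
beta = [0.0, 1.0, 0.5]   # assumed (beta_0, beta_1, beta_2)
info = information_matrix(points, weights, beta, n=40)
```

For this balanced design the entry `info[1][2]` is nonzero because the weights $h(z_i)$ differ across vertices, so the information matrix is not diagonal even though the coded design is orthogonal.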
Table 1. Inverse of the link function and $h(z_i)$ for frequently used links; $\Phi(\cdot)$ is the cumulative distribution function of the standard normal, $s_i$ is $\operatorname{sign}(z_i)$ and $z_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki}$.

| Link | $F(z_i) = p(x_i; \beta)$ | $h(z_i)$ |
|---|---|---|
| Logit | $e^{z_i}/(1 + e^{z_i})$ | $e^{z_i}/(1 + e^{z_i})^2$ |
| Probit | $\Phi(z_i)$ | $e^{-z_i^2}/(2\pi\,\Phi(z_i)\Phi(-z_i))$ |
| Complementary log–log | $1 - e^{-e^{z_i}}$ | $e^{2 z_i}/(e^{e^{z_i}} - 1)$ |
| Double exponential | $(1 + s_i - s_i e^{-\lvert z_i\rvert})/2$ | $1/(2 e^{\lvert z_i\rvert} - 1)$ |
| Double reciprocal | $(1 + s_i - s_i (1 + \lvert z_i\rvert)^{-1})/2$ | $((1 + \lvert z_i\rvert)^2 (2(1 + \lvert z_i\rvert) - 1))^{-1}$ |
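The entries of Table 1 can be checked numerically against the definition $h(z) = F'(z)^2/(F(z)(1 - F(z)))$. The sketch below transcribes the five $(F, h)$ pairs into Python and compares each closed-form $h$ with a central-difference approximation; the dictionary `LINKS`, the helper names and the evaluation grid are assumptions made for this illustration only.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def sign(z):
    return 1.0 if z >= 0 else -1.0


# (F, h) pairs transcribed from Table 1.
LINKS = {
    "logit": (
        lambda z: math.exp(z) / (1.0 + math.exp(z)),
        lambda z: math.exp(z) / (1.0 + math.exp(z)) ** 2,
    ),
    "probit": (
        std_normal_cdf,
        lambda z: math.exp(-z * z)
        / (2.0 * math.pi * std_normal_cdf(z) * std_normal_cdf(-z)),
    ),
    "cloglog": (
        lambda z: 1.0 - math.exp(-math.exp(z)),
        lambda z: math.exp(2.0 * z) / math.expm1(math.exp(z)),
    ),
    "double_exponential": (
        lambda z: (1.0 + sign(z) - sign(z) * math.exp(-abs(z))) / 2.0,
        lambda z: 1.0 / (2.0 * math.exp(abs(z)) - 1.0),
    ),
    "double_reciprocal": (
        lambda z: (1.0 + sign(z) - sign(z) / (1.0 + abs(z))) / 2.0,
        lambda z: 1.0 / ((1.0 + abs(z)) ** 2 * (2.0 * (1.0 + abs(z)) - 1.0)),
    ),
}


def h_numeric(F, z, eps=1e-6):
    """h(z) from its definition, with F'(z) approximated by a central difference."""
    slope = (F(z + eps) - F(z - eps)) / (2.0 * eps)
    return slope ** 2 / (F(z) * (1.0 - F(z)))


def max_abs_error(name, grid=(-1.3, -0.4, 0.7, 1.9)):
    """Largest gap between the tabulated h and the numerical h on a small grid."""
    F, h = LINKS[name]
    return max(abs(h(z) - h_numeric(F, z)) for z in grid)


errors = {name: max_abs_error(name) for name in LINKS}
```

The grid avoids $z = 0$, where the double exponential and double reciprocal links are defined piecewise.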
The inverse of $I(\beta)$ is the asymptotic variance–covariance matrix of the maximum likelihood estimate of $\beta$, and it determines the size and shape of the approximate confidence regions for $\beta$ based on these estimates. Nevertheless, one cannot compare experiments through $I(\beta)$ itself because it does not induce a total ordering on the space of experiments, and thus one needs to resort to real-valued criteria based on $I(\beta)$ that relate to the size and shape of those regions.

3. Design optimality criteria

3.1. The D-optimal design criterion

When the goal is to estimate an individual component of $\beta$ or any real-valued function of its components, one typically chooses the value of $q$ and the values of $(x_i, \lambda_i)$ that minimize the asymptotic variance of the maximum likelihood estimate of that quantity. Instead, when the goal is to jointly estimate all the components of $\beta$, one typically chooses the experiment that maximizes the determinant of $I(\beta)$ and thus minimizes the volume of the approximate confidence ellipsoid for $\beta$, obtaining what are called D-optimal experiments. Alternative criteria include the minimization of the trace of $I(\beta)^{-1}$, but the determinant of $I(\beta)$ is often considered the default criterion when one lacks detailed specifications about the goal of the experimenter.

One appealing feature of the D-optimality criterion is that it induces an ordering of experiments that is invariant under re-parametrizations, which is especially desirable when the parameters have no definite physical meaning. Therefore, D-optimal experiments maximizing the determinant of (2) are also the ones maximizing the determinant of the information matrix under the re-parametrization of (1) through $z_i = \beta_1 (x_{1i} - \mu) + \cdots + \beta_k (x_{ki} - \mu)$.

The determinant of $I(\beta)$ depends on the unknown parameters, $\beta$, through $z_i$ (and only through $z_i$).
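The invariance of the D-optimality ordering under re-parametrization can be made explicit with a short standard argument. It is sketched here under the assumption that the re-parametrization $\theta = g(\beta)$ is smooth and one-to-one with nonsingular Jacobian $J = \partial\beta/\partial\theta$; the symbols $\theta$, $g$ and $J$ are introduced only for this illustration.

```latex
I_\theta(\theta) = J^{\top} I(\beta)\, J
\quad\Longrightarrow\quad
\det I_\theta(\theta) = (\det J)^2 \det I(\beta).
```

Since $\det J$ does not depend on the design, any two experiments are ranked in the same way by $\det I_\theta(\theta)$ and by $\det I(\beta)$.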
In the local D-optimal approach one guesses $\beta$ to be equal to a given $\beta^0$, and finds the $q$ and the $(x_i, \lambda_i)$'s that maximize the determinant of $I(\beta^0)$, often with the restriction that the $x_i$'s belong to a pre-specified experimental region of interest. For a single factor the solution to the D-optimal design problem for an unbounded experimental region is finite and it is supported on two or three points depending on the link (see, e.g., Abdelbasit and Plackett (1983), Minkin (1987), Khan and Yazdi (1988), Ford et al. (1992), Sitter and Wu (1993), Myers et al. (2002), Sitter and Fainaru (1997) and Mathew and Sinha (2001)). In the case of two factors and all the links that we consider, Sitter and Torsney (1995) find that if the experimental region is unbounded, the determinant of $I(\beta)$ can be made arbitrarily large by choice of experiment, and they characterize the D-optimal solution for a carefully chosen finite region. Nevertheless, in our manuscript it is illustrated how, unless $\beta_1 = 0$ or $\beta_2 = 0$, D-optimal two-level experiments can be finite even for unbounded regions.

3.2. Determinant of $I(\beta)$ for two-level experiments

In this paper, we restrict our attention to two-level experiments, which by definition are the ones supported on the $2^k$ vertices of a hyper-rectangle of the form $x_i = (x_{10} + a_{i1} R_1, \ldots, x_{k0} + a_{ik} R_k)$, where the $a_{ij}$'s are either $-1$ or $1$, where the $R_j$'s are the one-half ranges of the factor levels, and where $x_0 = (x_{10}, \ldots, x_{k0})$ is the center point of the experiment.

Under first order normal linear models, possibly with interactions, the determinant of the information matrix $I(\beta)$ for two-level factorial experiments has the following properties:

1. the larger the one-half ranges $R_j$, the larger the determinant of the information matrix, and the more informative the experiment,
2. that determinant depends neither on $x_0$ nor on how the experiment is oriented relative to the contour lines of the response surface (i.e., it does not depend on $\beta$),
3. including interaction terms in the model does not lead to an alternative choice of a full factorial experiment,
4. equally weighted allocations that assign the same number of replicates to all factor combinations, and lead to an orthogonal standardized design matrix, always have a larger determinant than unequally weighted allocations with the same total number of runs,
5. under equally weighted allocations, the determinant of any fraction of a two-level factorial is never larger than the determinant of the full factorial experiment with the same total number of runs, and in particular the determinant of an experiment varying one factor at a time is never larger than the determinant of a full factorial experiment,
6. when fractionating two-level factorial experiments, "spatially" balanced allocations treating all factors symmetrically by assigning the same number of experimental combinations to each of their levels, and allocating the same number of runs to each combination, thus leading to an orthogonal standardized design matrix, are always at least as good as allocations leading to a non-orthogonal matrix with the same number of runs, and
7. any two complementary equally sized regular fractions of a full two-level factorial, like the ones considered for example in chapters 6 and 7 of Box et al. (2005), are statistically equivalent.

As a consequence, when planning for two-level factorial experiments in the first order linear normal setting, one can always restrict consideration to the usual coded factor levels $-1$ and $+1$, and only the orthogonality of the standardized design matrix and the range of variation of the factors involved matter. Furthermore, in the presence of interactions in the model, one only needs to address the relative merit of various fractional factorial experiments evaluated in terms of which interactions are estimable, and which ones are aliased.

Under binary response models, though, none of the properties listed above holds any longer, and as a consequence the design issues involved in choosing the best two-level experiment are much more complicated than for normal response models. In particular, concepts like the resolution or the aberration of a fractional factorial no longer make sense.

Let $A$ be the corresponding standardized design matrix whose $i$-th row is $(1, a_{i1}, \ldots, a_{ik})$,

$$A = \begin{pmatrix} 1 & a_{11} & \cdots & a_{1k} \\ 1 & a_{21} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & a_{q1} & \cdots & a_{qk} \end{pmatrix}, \tag{3}$$

and let $A_{i_1, \ldots, i_{k+1}}$, where $\{i_1, \ldots, i_{k+1}\}$ is a subset of size $k + 1$ of $\{1, 2, \ldots, q\}$, denote the $(k + 1) \times (k + 1)$ sub-matrix of $A$ formed by its rows $i_1, \ldots, i_{k+1}$. It is assumed that the experiments are always such that the rank of $A$ is not smaller than $k + 1$. Under the main effects binomial model in (1), the determinant of $I(\beta)$ for any two-level experiment is such that
$$(\beta_1 \cdots \beta_k)^2 \det(I(\beta)) = n^{k+1} \prod_{j=1}^{k} (\beta_j R_j)^2 \sum_{\{i_1, \ldots, i_{k+1}\}} \left(\det A_{i_1, \ldots, i_{k+1}}\right)^2 \lambda_{i_1} h(z_{i_1}) \cdots \lambda_{i_{k+1}} h(z_{i_{k+1}}), \tag{4}$$

where $z_i = z_0 + a_{i1} |\beta_1| R_1 + \cdots + a_{ik} |\beta_k| R_k$, where $z_0 = \beta_0 + \beta_1 x_{10} + \cdots + \beta_k x_{k0}$, and where the summation is over all $\binom{2^k}{k+1}$ subsets of $\{1, 2, 3, \ldots, 2^k\}$ of size $k + 1$. This covers as special cases complete two-level factorial experiments, in which case $q = 2^k$, as well as any fraction of them, in which case $q < 2^k$. Posing everything in terms of the absolute values of the $\beta_j$ will allow us to tabulate the best designs for any $\beta$ by redefining the $a_{ij}$'s appropriately.

Given that (4) only depends on $z_0$, $n$, $(|\beta_1| R_1, \ldots, |\beta_k| R_k)$ and on $(\lambda_1, \ldots, \lambda_{2^k})$, all two-level experiments with the same $n$, $|\beta_j| R_j$'s and $\lambda_i$'s that are centered on any point along the $ED_{F(z_0)}$ have the same determinant. By computing the $|\beta_j| R_j$ and $\lambda_i$ that maximize (4) for a given $z_0$, one obtains the $R_j$ and $\lambda_i$ for the local D-optimal two-level experiment centered on $z_0$, for any $\beta$. For a proof of (4) in a more general form see González-Dávila et al. (2007b).

In the normal response two-level factorial settings, one can restrict attention to equally weighted experiments with orthogonal standardized design matrices $A$ (see, e.g., Box et al. (2005); Goel and Ginebra (2003), p. 522). However, under binomial models, the orthogonality of the columns of the standardized design matrix does not lead to a diagonal $I(\beta)$. In fact, under binomial models, the determinant of $I(\beta)$ for a two-level experiment depends on where it is centered, and therefore the information in it is different from the information in the two-level experiment that results from translating it to the origin. The non-standardized design matrix $X$ of an equally weighted two-level factorial experiment centered away from the origin is not orthogonal, and it is not statistically equivalent to the corresponding two-level experiment centered around the origin with an orthogonal design matrix. Hence for binary response models, design orthogonality is a feature that not even equally weighted two-level factorial experiments can be considered to have.
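Expression (4) can be checked numerically on a small case. The sketch below, written under the logit link, computes $\det(I(\beta))$ both directly from (2) and through the subset sum in (4) divided by $(\beta_1 \cdots \beta_k)^2$. All function names and the example values of $n$, $\beta$, $x_0$, $R_j$ and $\lambda_i$ are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations, product


def h_logit(z):
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in row] for row in matrix]
    size, result = len(m), 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return result


def design(beta, x0, R):
    """Coded rows (1, a_i1, ..., a_ik) and linear predictors z_i of the 2^k vertices."""
    k = len(R)
    rows, zs = [], []
    for a in product((-1, 1), repeat=k):
        x = [x0[j] + a[j] * R[j] for j in range(k)]
        rows.append((1,) + a)
        # z_i is computed from the actual vertex; this agrees with (4) up to the
        # relabeling of the a_ij's that (4) uses when some beta_j is negative.
        zs.append(beta[0] + sum(beta[j + 1] * x[j] for j in range(k)))
    return rows, zs


def det_direct(n, beta, x0, R, lam, h=h_logit):
    """det(I(beta)) computed from the information matrix in (2)."""
    k = len(R)
    rows, zs = design(beta, x0, R)
    info = [[0.0] * (k + 1) for _ in range(k + 1)]
    for a, z, l in zip(rows, zs, lam):
        x = [1.0] + [x0[j] + a[j + 1] * R[j] for j in range(k)]
        w = n * l * h(z)
        for r in range(k + 1):
            for c in range(k + 1):
                info[r][c] += w * x[r] * x[c]
    return det(info)


def det_by_formula(n, beta, x0, R, lam, h=h_logit):
    """det(I(beta)) from (4), after dividing both sides by (beta_1 ... beta_k)^2."""
    k = len(R)
    rows, zs = design(beta, x0, R)
    total = 0.0
    for subset in combinations(range(len(rows)), k + 1):
        term = det([rows[i] for i in subset]) ** 2
        for i in subset:
            term *= lam[i] * h(zs[i])
        total += term
    return n ** (k + 1) * math.prod(r * r for r in R) * total


# Hypothetical unequally weighted two-factor example.
args = (100, (0.3, 1.2, 0.8), (0.5, -0.25), (1.0, 1.5), (0.1, 0.2, 0.3, 0.4))
direct, formula = det_direct(*args), det_by_formula(*args)
```

In this sketch `direct` and `formula` should agree up to floating-point rounding, for unequal weights and for centers away from the origin alike.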
4. One-factor two-level experiments

Here we compute the D-optimal experiment for the one-factor binomial model within the class of two-level experiments supported on $x_0 - R$ and $x_0 + R$ for a given $x_0$. As a special case of (4), it follows that under the binomial model with $p(x_i; \beta) = F(\beta_0 + \beta_1 x_i)$, the determinant of $I(\beta)$ for such a two-level experiment is such that

$$\beta_1^2 \det(I(\beta)) = 4 n^2 (\beta_1 R)^2 \lambda_1 \lambda_2 h(z_1) h(z_2), \tag{5}$$
where $z_1 = z_0 - |\beta_1| R$, $z_2 = z_0 + |\beta_1| R$, and where $\lambda_i = n_i/n$. Given that the number of support points is equal to the number of parameters, D-optimal allocations are equally weighted. Table 2 lists the values of $|\beta_1| R$, $F(z_1)$, $F(z_2)$ and $\beta_1^2 \det(I(\beta))$ for the D-optimal two-point experiment centered on various $ED_{F(z_0)}$'s, which allows one to obtain the local D-optimal $R$, $z_1$ and $z_2$ for any value of $(\beta_0, \beta_1)$. The D-optimal experiments for the negative of the $z_0$'s in that table can be obtained by symmetry. For example, under the probit link the D-optimal two-point experiment centered on the $ED_{.3}$ is such that $|\beta_1| R = 1.1230$, $F(z_1) = 1 - .9503$, $F(z_2) = 1 - .2747$ and the determinant of its $I(\beta)$ is $.1571 n^2/\beta_1^2$. For the logistic model, the further $F(z_0)$ is from .5, the larger the optimal $|\beta_1| R$, but the opposite happens for the probit model. Note also that under the probit link D-optimal designs have larger determinants and smaller values of $R$ than under the logit link, which are two features shared by all D-optimal designs covered in this manuscript.

Table 2. D-optimal one-factor two-point experiments centered on the $ED_{F(z_0)}$ point, with $p(x_0; \beta) = F(z_0)$, and overall D-optimal two-point experiment.

| Link | $F(z_0)$ | $z_0$ | $\lvert\beta_1\rvert R$ | $F(z_1)$ | $F(z_2)$ | $\beta_1^2 \det(I(\beta))$ |
|---|---|---|---|---|---|---|
| Logit | .50 | .000 | 1.5434 | .1760 | .8240 | $.0501 n^2$ |
| Logit | .60 | .405 | 1.5652 | .2387 | .8777 | $.0478 n^2$ |
| Logit | .70 | .847 | 1.6371 | .3122 | .9230 | $.0409 n^2$ |
| Logit | .80 | 1.386 | 1.7885 | .4008 | .9599 | $.0296 n^2$ |
| Logit | .90 | 2.197 | 2.1286 | .5171 | .9870 | $.0146 n^2$ |
| Probit | .50 | .000 | 1.1382 | .1275 | .8725 | $.1987 n^2$ |
| Probit | .60 | .253 | 1.1345 | .1891 | .9174 | $.1881 n^2$ |
| Probit | .70 | .524 | 1.1230 | .2747 | .9503 | $.1571 n^2$ |
| Probit | .80 | .842 | 1.1018 | .3974 | .9740 | $.1086 n^2$ |
| Probit | .90 | 1.282 | 1.0650 | .5857 | .9905 | $.0489 n^2$ |

Overall D-optimal two-point experiment

| Link | $F(z_0)$ | $z_0$ | $\lvert\beta_1\rvert R$ | $F(z_1)$ | $F(z_2)$ | $\beta_1^2 \det(I(\beta))$ |
|---|---|---|---|---|---|---|
| Logit | .5000 | .000 | 1.5434 | .1760 | .8240 | $.0501 n^2$ |
| Probit | .5000 | .000 | 1.1382 | .1275 | .8725 | $.1987 n^2$ |
| Compl. log–log | .5666 | −.179 | 1.1587 | .2308 | .9303 | $.1638 n^2$ |
| D. Exponential | .8009 | .921 | .92080 | .5000 | .9272 | $.0730 n^2$ |
| D. Reciprocal | .7235 | .808 | .80832 | .5000 | .8089 | $.0225 n^2$ |

In all these instances, $\lambda_1 = \lambda_2 = .5$. For the logit, probit and complementary log–log links the overall D-optimal two-point experiments are the global D-optimal ones, but for the double exponential and double reciprocal links the global D-optimal experiments are supported on three points.

By searching over all center points, one finds the overall D-optimal two-point experiments presented in Table 2. Note that the largest determinants attainable depend strongly on which of the five link functions is used. For the logit and probit links the D-optimal two-point experiment is centered on the $ED_{.5}$ and for the complementary log–log link it is centered on the $ED_{.566}$, and they coincide with the global D-optimal experiment maximizing the determinant of $I(\beta)$ among all possible experiments.

For the double exponential link, the D-optimal two-point experiment is supported on the $ED_{.5}$ and the $ED_{.9272}$ (or on the $ED_{.0728}$ and the $ED_{.5}$), and the determinant of its information matrix is 35.7% larger than the determinant for the two-point design symmetric about the $ED_{.5}$ that is wrongly presented in Table 4 of Ford et al. (1992) as the best two-point design for that link. The global D-optimal experiment (e.g., see Sitter and Wu (1993)) is a three-point experiment supported on the $ED_{.10}$, $ED_{.50}$ and $ED_{.90}$ with $\beta_1^2 \det(I(\beta)) = .081 n^2$, which is 10% larger than for the D-optimal two-point design.

For the double reciprocal link the D-optimal two-point experiment is supported on the $ED_{.5}$ and the $ED_{.8089}$ (or on the $ED_{.1911}$ and the $ED_{.5}$), and the determinant of its information matrix is 75.3% larger than the one for the two-point design wrongly given in Table 4 of Ford et al. (1992) as the best two-point design.
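The one-factor entries of Table 2 can be approximated by maximizing the right-hand side of (5) over $c = |\beta_1| R$ for a fixed $z_0$ and $\lambda_1 = \lambda_2 = .5$. The following sketch does this with a hand-written golden-section search for the logit and probit links; the function names, the search interval and the tolerance are assumptions made for this illustration, and the search presumes a single interior maximum on that interval.

```python
import math


def h_logit(z):
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)


def h_probit(z):
    cdf = lambda t: 0.5 * math.erfc(-t / math.sqrt(2.0))
    return math.exp(-z * z) / (2.0 * math.pi * cdf(z) * cdf(-z))


def scaled_det(c, z0, h):
    """beta_1^2 det(I(beta)) / n^2 from (5) with lambda_1 = lambda_2 = 1/2, c = |beta_1| R."""
    return c * c * h(z0 - c) * h(z0 + c)


def best_half_range(z0, h, lo=1e-3, hi=6.0, tol=1e-10):
    """Golden-section search for the c that maximizes scaled_det on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c1, c2 = b - ratio * (b - a), a + ratio * (b - a)
    f1, f2 = scaled_det(c1, z0, h), scaled_det(c2, z0, h)
    while b - a > tol:
        if f1 < f2:
            # The maximum lies in [c1, b]; reuse c2 as the new left probe.
            a, c1, f1 = c1, c2, f2
            c2 = a + ratio * (b - a)
            f2 = scaled_det(c2, z0, h)
        else:
            # The maximum lies in [a, c2]; reuse c1 as the new right probe.
            b, c2, f2 = c2, c1, f1
            c1 = b - ratio * (b - a)
            f1 = scaled_det(c1, z0, h)
    c = 0.5 * (a + b)
    return c, scaled_det(c, z0, h)


c_logit, d_logit = best_half_range(0.0, h_logit)
c_probit, d_probit = best_half_range(0.0, h_probit)
```

With $z_0 = 0$ this should approximately reproduce the $F(z_0) = .50$ rows of Table 2, that is $|\beta_1| R \approx 1.5434$ for the logit link and $|\beta_1| R \approx 1.1382$ for the probit link. Other rows can be explored by passing the corresponding $z_0$.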
The global D-optimal experiment for the double reciprocal link is a three-point experiment supported on the $ED_{.21}$, $ED_{.5}$ and the $ED_{.79}$ with $\beta_1^2 \det(I(\beta)) = .023 n^2$, which is very close to the one for the best two-point design in Table 2.

5. Two-factor two-level experiments

Here we compute the D-optimal design within the class of designs supported on four vertices of a rectangle, of the form $x_i = (x_{10} + a_{i1} R_1, x_{20} + a_{i2} R_2)$, where $a_{i1}$ and $a_{i2}$ are equal to either $-1$ or $1$ and where $x_0$ is on the previously specified equal dose line, with $z_0 = \beta_0 + \beta_1 x_{10} + \beta_2 x_{20}$. That allows one to explore how the performance of two-factor two-level experiments depends on their location. By searching over all $z_0$ we then find the D-optimal two-factor two-level experiment. As a consequence of (4), the determinant of $I(\beta)$ for any two-factor two-level experiment under the main effects binomial model is such that
$$(\beta_1 \beta_2)^2 \det(I(\beta)) = 4^2 n^3 (\beta_1 R_1)^2 (\beta_2 R_2)^2 \sum \lambda_i \lambda_j \lambda_r h(z_i) h(z_j) h(z_r), \tag{6}$$
i