Sensitivity analysis for incomplete continuous data

July 7, 2017 | Author: C. Paulino | Category: Statistics, Sensitivity Analysis, Test, Standard Deviation, Missing Data, Mixture Model

Test (2011) 20:589–606 DOI 10.1007/s11749-010-0219-x ORIGINAL PAPER

Sensitivity analysis for incomplete continuous data Frederico Z. Poleto · Geert Molenberghs · Carlos Daniel Paulino · Julio M. Singer

Received: 23 July 2010 / Accepted: 11 October 2010 / Published online: 6 November 2010 © Sociedad de Estadística e Investigación Operativa 2010

Abstract Models for missing data are necessarily based on untestable assumptions whose effects on the conclusions are usually assessed via sensitivity analysis. To avoid the usual normality assumption and/or the hard-to-interpret sensitivity parameters proposed by many authors for such purposes, we consider a simple approach for estimating means, standard deviations and correlations. We do not make distributional assumptions and adopt a pattern-mixture model parameterization which has easily interpreted sensitivity parameters. We use the so-called estimated ignorance and uncertainty intervals to summarize the results and illustrate the proposal with a practical example. We present results for both the univariate and the multivariate cases.

Keywords Identifiability · Ignorance interval · Missing data · Pattern-mixture model · Uncertainty interval

Mathematics Subject Classification (2000) 62F10 · 62F03

Communicated by Domingo Morales.

The authors would like to thank the following institutions for financial support: Frederico Z. Poleto and Julio M. Singer, from Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Brazil, Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), Brazil, and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil; Geert Molenberghs, from the IAP research Network P6/03 of the Belgian Government (Belgian Science Policy); Carlos Daniel Paulino, from Fundação para a Ciência e Tecnologia (FCT) through the research centre CEAUL-FCUL, Portugal.

F.Z. Poleto · J.M. Singer: Instituto de Matemática e Estatística, Universidade de São Paulo, Caixa Postal 66281, São Paulo, SP, 05314-970, Brazil
G. Molenberghs: I-BioStat, Universiteit Hasselt, 3590 Diepenbeek, Belgium
G. Molenberghs: Katholieke Universiteit Leuven, 3000 Leuven, Belgium
C.D. Paulino: Instituto Superior Técnico, Universidade Técnica de Lisboa (and CEAUL-FCUL), Av. Rovisco Pais, Lisboa, 1049-001, Portugal

1 Introduction

In nearly all problems with missing data, untestable assumptions, such as missing at random (MAR), following the terminology of Rubin (1976), are required to identify appropriate statistical models. Such assumptions are usually questionable and statisticians commonly address the problem via sensitivity analyses. Specifically for continuous data, Rubin (1977), Little (1994), Little and Wang (1996) and Daniels and Hogan (2000) propose sensitivity analyses under assumptions of normality, while Rotnitzky et al. (1998), Scharfstein et al. (1999) and Rotnitzky et al. (2001) use inverse probability weighted (IPW) methods in the context of semi-parametric models for similar purposes. Reviews of some of these and other approaches are presented in Daniels and Hogan (2007, Chaps. 9 and 10), Molenberghs and Kenward (2007, Chaps. 19–25) and Fitzmaurice et al. (2008, Chaps. 18, 20 and 22).

Although such methodological developments are useful in many situations, there are cases where they may be difficult to apply. To avoid such difficulties, we derive a simple approach useful for estimating means, standard deviations and correlations. We adopt a pattern-mixture model parameterization (Glynn et al. 1986; Little and Rubin 2002) and employ non-identifiable means, standard deviations, and correlations, or functions thereof, as sensitivity parameters. This strategy is similar to the one adopted by Daniels and Hogan (2000), although we do not assume any parametric distribution for the outcomes. We believe that, in many applications, it may be easier to elicit information on these sensitivity parameters than on the selection bias functions used by Rotnitzky and colleagues. Instead of IPW methods, we simply estimate the identifiable parameters by their sample analogues.

In Sect. 2, we present the data on American colleges that will be used to illustrate the methods described in the remainder of the paper. We introduce the ideas in a univariate setup in Sect. 3 and consider a multivariate extension in Sect. 4. We evaluate the uncertainty intervals employed in our inferences in Sect. 5.

2 The American colleges data

The US News & World Report's Guide to America's Best Colleges 1995 collected data on more than 30 variables for 1,302 American colleges and universities, encompassing characteristics such as admission, costs, infrastructure, and student performance. Allison (2001) considered the estimation of means, standard deviations and correlations for seven such variables under MAR and normality assumptions using the EM algorithm. For the sake of our exposition, it suffices to focus only on five of them, namely, GRADRAT (ratio between the number of graduating seniors and the number of enrolled students four years earlier ×100), CSAT (combined average scores on verbal and math sections of the Scholastic Assessment Test), ACT (mean American College Testing scores), RMBRD (total annual costs for room and board, in thousands of dollars) and an indicator of public versus private colleges. One college had a GRADRAT equal to 118 and, therefore, the corresponding value was deleted and considered missing. The public-private college administration indicator was the only variable without missing values; the other four variables were observed simultaneously for only 23% of the colleges and each variable had from 8% to 45% missing values.

We are interested in two questions: (i) do the public and private colleges have different means and standard deviations of CSAT? and (ii) are GRADRAT, CSAT, ACT and RMBRD mutually linearly correlated? Descriptive statistics are displayed in Tables 1 and 2. Because all American colleges matching the criteria adopted in the study were surveyed, the data make up the entire study population; therefore, standard errors and confidence and uncertainty intervals will be computed and discussed merely for illustrative purposes.

Table 1 Descriptive statistics for CSAT

College            CSAT observed                    CSAT missing
administration     Count (%)     Mean     SD        Count (%)     Mean   SD
public             251 (0.53)    945.3    107.5     219 (0.47)    ?      ?
private            528 (0.63)    978.8    129.2     304 (0.37)    ?      ?

Note: ? denotes non-observed values, and SD, standard deviation

Table 2 Descriptive statistics for GRADRAT (G), CSAT (C), ACT (A) and RMBRD (R)

(a) Counts, means and standard deviations by response pattern

Resp. pattern                 GRADRAT         CSAT             ACT             RMBRD
G C A R     Count (%)         Mean   SD       Mean    SD       Mean    SD      Mean   SD
o o o o     296 (0.23)        62.6   19.0     988.9   120.4    22.90   2.74    4.25   0.99
o o o m     158 (0.12)        58.0   17.0     961.5   110.6    22.30   2.53    ?      ?
o o m o     158 (0.12)        64.6   18.2     976.7   134.7    ?       ?       4.73   1.19
o m o o     123 (0.09)        53.0   14.5     ?       ?        21.39   1.91    3.25   0.78
m o o o      17 (0.01)        ?      ?        879.6    98.1    20.76   2.41    3.96   1.14
o o m m     119 (0.09)        62.7   19.1     949.9   124.9    ?       ?       ?      ?
o m m o     157 (0.12)        64.8   20.8     ?       ?        ?       ?       4.27   1.26
o m o m      82 (0.06)        51.9   16.7     ?       ?        21.12   1.98    ?      ?
m m o o      11 (0.01)        ?      ?        ?       ?        20.82   2.27    3.22   0.85
m o m o       5 (0.00)        ?      ?        871.4    96.1    ?       ?       3.50   1.49
m o o m      17 (0.01)        ?      ?        880.9   109.6    20.47   2.83    ?      ?
o m m m     110 (0.08)        57.4   19.5     ?       ?        ?       ?       ?      ?
m o m m       9 (0.01)        ?      ?        865.9    56.8    ?       ?       ?      ?
m m o m      10 (0.01)        ?      ?        ?       ?        19.90   1.37    ?      ?
m m m o      16 (0.01)        ?      ?        ?       ?        ?       ?       3.20   1.14
m m m m      14 (0.01)        ?      ?        ?       ?        ?       ?       ?      ?
Available case analysis
  statistics                  60.4   18.8     968.0   123.6    22.12   2.58    4.15   1.17
  counts (%)                  1203 (0.92)     779 (0.60)       714 (0.55)      783 (0.60)

(b) Correlations by response pattern

Resp. pattern                 Correlations
G C A R     Count (%)         G×C          G×A          G×R          C×A          C×R          A×R
o o o o     296 (0.23)        0.56         0.58         0.43         0.92         0.45         0.47
o o o m     158 (0.12)        0.56         0.54         ?            0.87         ?            ?
o o m o     158 (0.12)        0.72         ?            0.50         ?            0.53         ?
o m o o     123 (0.09)        ?            0.47         0.26         ?            ?            0.39
m o o o      17 (0.01)        ?            ?            ?            0.83         −0.29        −0.27
o o m m     119 (0.09)        0.55         ?            ?            ?            ?            ?
o m m o     157 (0.12)        ?            ?            0.58         ?            ?            ?
o m o m      82 (0.06)        ?            0.36         ?            ?            ?            ?
m m o o      11 (0.01)        ?            ?            ?            ?            ?            −0.15
m o m o       5 (0.00)        ?            ?            ?            ?            0.02         ?
m o o m      17 (0.01)        ?            ?            ?            0.93         ?            ?
o m m m     110 (0.08)        ?            ?            ?            ?            ?            ?
m o m m       9 (0.01)        ?            ?            ?            ?            ?            ?
m m o m      10 (0.01)        ?            ?            ?            ?            ?            ?
m m m o      16 (0.01)        ?            ?            ?            ?            ?            ?
m m m m      14 (0.01)        ?            ?            ?            ?            ?            ?
Available case analysis
  statistics                  0.59         0.56         0.50         0.91         0.44         0.47
  counts (%)                  731 (0.56)   659 (0.51)   734 (0.56)   488 (0.37)   476 (0.37)   447 (0.34)

Note: ? denotes non-observed values, SD, standard deviation, o, observed, and m, missing

3 Univariate case

Let $Y_i$ denote the measurement on the $i$th study unit and $R_i$ be an indicator variable taking on the value 1 if $Y_i$ is observed and 0 otherwise, $i = 1, \ldots, n$. In the pattern-mixture model framework, the joint distribution of $(Y_i, R_i)$ is factored as the product of the marginal distribution of $R_i$ and the conditional distribution of $Y_i$ given $R_i$. As our interest lies only in moments of $Y_i$, we use the pattern-mixture model approach and conditional expectation properties to write

$$\mu = E(Y_i) = E[E(Y_i \mid R_i)] = \gamma_1 \mu_{(1)} + \gamma_0 \mu_{(0)}, \tag{1}$$

$$\sigma^2 = \mathrm{Var}(Y_i) = E[\mathrm{Var}(Y_i \mid R_i)] + \mathrm{Var}[E(Y_i \mid R_i)] = \gamma_1 \sigma_{(1)}^2 + \gamma_0 \sigma_{(0)}^2 + \gamma_1 (\mu_{(1)} - \mu)^2 + \gamma_0 (\mu_{(0)} - \mu)^2, \tag{2}$$

where $\gamma_r = P(R_i = r)$, $\mu_{(r)} = E(Y_i \mid R_i = r)$ and $\sigma_{(r)}^2 = \mathrm{Var}(Y_i \mid R_i = r)$, for $r = 0, 1$. We can estimate $\gamma_1$ ($\gamma_0 = 1 - \gamma_1$), $\mu_{(1)}$ and $\sigma_{(1)}$ by their sample counterparts, i.e., the proportion of observed units, $\hat{\gamma}_1$, and the sample mean and standard deviation of the observed measurements, $\hat{\mu}_{(1)}$ and $\hat{\sigma}_{(1)}$. However, as $\omega = \mu_{(0)}$ or $\omega = (\mu_{(0)}, \sigma_{(0)})$ is also needed to estimate $\mu$ or $\sigma$, respectively, and neither $\mu_{(0)}$ nor $\sigma_{(0)}$ is identified from the observed data, it is useful to recall that statistical uncertainty is a combination of statistical imprecision and statistical ignorance (Molenberghs et al. 2001; Kenward et al. 2001). Statistical imprecision is caused by not observing the entire population, while statistical ignorance is due to deficiencies in the observation process, e.g., when some responses are missing, misclassified, and/or measured with error. When the sample size tends to infinity, the magnitude of statistical imprecision decreases to zero, but that of statistical ignorance may not change. In our case, statistical ignorance is related to the mean and standard deviation of the non-observed units.

Setting a value $\omega_S$ for $\omega$, we may compute $\hat{\gamma}_1$, $\hat{\mu}_{(1)}$ and $\hat{\sigma}_{(1)}$ and, using (1) and (2), obtain an unbiased estimate $\hat{\mu}(\omega_S)$ of $\mu(\omega_S)$ and a consistent estimate $\hat{\sigma}(\omega_S)$ of $\sigma(\omega_S)$. As this holds for any $\omega_S$, we may perform a sensitivity analysis, obtaining estimates and confidence regions for the target over a set $\Omega_\omega$ that is expected to contain the true value of the sensitivity parameter $\omega$. The range of estimates provides an Honestly Estimated Ignorance Region (HEIR) and the union of the $100(1 - \alpha)\%$ confidence regions obtained for different values of $\omega$ provides a $100(1 - \alpha)\%$ Estimated Uncertainty Region (EURO). In the same way that standard errors and confidence regions quantify statistical imprecision, ignorance regions measure statistical ignorance and uncertainty regions assess statistical uncertainty. Vansteelandt et al. (2006) consider a formal approach to the problem and provide appropriate definitions of consistency and coverage for these regions.
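As a minimal sketch of how relations (1) and (2) turn a chosen value of the sensitivity parameter into point estimates, the Python snippet below plugs the Table 1 summaries for public colleges into both formulas. The function names (`mu_hat`, `sigma_hat`, `heir_mu`) and the range [850; 945.3] for the unobserved mean are illustrative assumptions, not part of the paper. The pair μ(0) = 870, σ(0) = 134.3 is the one discussed later in this section, where the resulting estimate of σ is reported as 126.4.

```python
import math


def mu_hat(gamma1, mu1, mu0):
    """Relation (1): mixture mean for a fixed value of the unobserved mean."""
    return gamma1 * mu1 + (1.0 - gamma1) * mu0


def sigma_hat(gamma1, mu1, sd1, mu0, sd0):
    """Relation (2): mixture standard deviation for fixed (mu0, sd0)."""
    gamma0 = 1.0 - gamma1
    mu = mu_hat(gamma1, mu1, mu0)
    within = gamma1 * sd1 ** 2 + gamma0 * sd0 ** 2
    between = gamma1 * (mu1 - mu) ** 2 + gamma0 * (mu0 - mu) ** 2
    return math.sqrt(within + between)


def heir_mu(gamma1, mu1, mu0_low, mu0_high):
    """Range of mean estimates over an interval of mu0 values.

    The estimate in (1) is increasing in mu0, so the two endpoints suffice.
    """
    return mu_hat(gamma1, mu1, mu0_low), mu_hat(gamma1, mu1, mu0_high)


# Public colleges, Table 1: 251 observed and 219 missing CSAT values.
n_obs, n_mis = 251, 219
gamma1 = n_obs / (n_obs + n_mis)
mu1, sd1 = 945.3, 107.5

sigma_870 = sigma_hat(gamma1, mu1, sd1, mu0=870.0, sd0=134.3)
low, high = heir_mu(gamma1, mu1, 850.0, 945.3)  # hypothetical range for mu0
print(round(sigma_870, 1), round(low, 1), round(high, 1))
```

Because only summary statistics enter (1) and (2), the same functions can be reused for private colleges by swapping in the second row of Table 1.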
They show how to construct EUROs with uncertainty level $100(1 - \alpha)\%$ for a scalar parameter $\pi$ according to the different definitions of uncertainty regions: (i) strong EUROs cover $\pi(\omega)$ simultaneously for all $\omega \in \Omega_\omega$ with at least $100(1 - \alpha)\%$ probability, (ii) pointwise EUROs cover $\pi(\omega)$ uniformly over $\omega \in \Omega_\omega$ with at least $100(1 - \alpha)\%$ probability, and (iii) weak EUROs have an expected overlap with the ignorance region of at least $100(1 - \alpha)\%$. Strong EUROs are conservative pointwise EUROs, which, in turn, are conservative weak EUROs. The choice among the three versions of EUROs depends on which is the more appropriate definition for the uncertainty region and on the desired degree of conservativeness. We summarize the algorithms to construct the three EUROs in Appendix A.

For categorical missing data, the set $\Omega_\omega$ may cover an in-depth grid of the whole parameter space of $\omega$, but for continuous data, this strategy is clearly not feasible; for instance, note from (1) that $\hat{\mu}(\mu_{(0)}) \to \infty$ as $\mu_{(0)} \to \infty$ irrespective of $0 < \hat{\gamma}_1 < 1$ and $\hat{\mu}_{(1)}$. Therefore, as it is not possible to perform the analysis without a careful evaluation of $\Omega_\omega$, the data analyst should look for alternative parameterizations for the sensitivity parameters and select the one for which the elicitation task is more easily accomplished. For example, in lieu of using $\mu_{(0)}$ as sensitivity parameter, one may prefer to use $\alpha$, $\beta$, or $p$, where $\mu_{(0)} = \alpha + \mu_{(1)}$, $\mu_{(0)} = \beta \mu_{(1)}$, chiefly for positive variables, and $\mu_{(0)} = F_{(1)}^{-1}(p)$, the $p$th quantile of the theoretical distribution of the observed units. The variance of the estimator of $\mu$ depends upon which sensitivity parameter is used, e.g., for the first three cases, we have

$$\mathrm{Var}[\hat{\mu}(\mu_{(0)})] = \frac{\gamma_1 \sigma_{(1)}^2}{n} + \frac{\gamma_1 (1 - \gamma_1)(\mu_{(1)} - \mu_{(0)})^2}{n}, \tag{3}$$

$$\mathrm{Var}[\hat{\mu}(\alpha)] = \sigma_{(1)}^2 E\left(\frac{1}{n_1}\right) + \frac{\gamma_1 (1 - \gamma_1) \alpha^2}{n}, \tag{4}$$

$$\mathrm{Var}[\hat{\mu}(\beta)] = \sigma_{(1)}^2 \left[\beta^2 E\left(\frac{1}{n_1}\right) + \frac{2\beta(1 - \beta)}{n} + \frac{\gamma_1 (1 - \beta)^2}{n}\right] + \frac{\gamma_1 (1 - \gamma_1) \mu_{(1)}^2 (1 - \beta)^2}{n}, \tag{5}$$

where $n_1 = \sum_{i=1}^{n} R_i$ denotes the number of observed units. Since no sensible analysis can be accomplished if all outcomes are missing, i.e., if $n_1 = 0$, we could have computed (3)–(5) assuming that $n_1$ follows a positive binomial distribution (Stephan 1945) instead of a binomial distribution with parameters $n$ and $\gamma_1$. However, the changes required for such purposes generate cumbersome formulae and improve accuracy only when $n\gamma_1$ is small. Simple approximations for the first negative moment of the positive binomial distribution are discussed, for example, by Grab and Savage (1954) and Mendenhall and Lehman (1960). We adopt $E(1/n_1) \cong 1/(n\gamma_1 + 1 - \gamma_1)$ in practice; this is usually accurate to at least two decimal places if $n\gamma_1 > 10$ (Grab and Savage 1954). Nevertheless, comparing (3)–(5) is easier when using the cruder approximation $1/(n\gamma_1)$, in which case the following relationships hold for large $n$:

$$\mathrm{Var}[\hat{\mu}(\mu_{(0)})] < \mathrm{Var}[\hat{\mu}(\alpha)] \le \mathrm{Var}[\hat{\mu}(\beta)], \quad \text{if } \beta \le \beta_1 \text{ or } \beta \ge 1, \tag{6}$$

$$\mathrm{Var}[\hat{\mu}(\mu_{(0)})] \le \mathrm{Var}[\hat{\mu}(\beta)] < \mathrm{Var}[\hat{\mu}(\alpha)], \quad \text{if } \beta_1 < \beta \le \beta_2 \text{ or } \beta_3 \le \beta < 1, \tag{7}$$

$$\mathrm{Var}[\hat{\mu}(\beta)] < \mathrm{Var}[\hat{\mu}(\mu_{(0)})] < \mathrm{Var}[\hat{\mu}(\alpha)], \quad \text{if } \beta_2 < \beta < \beta_3, \tag{8}$$

where $\beta_1 = (-1 - \gamma_1)/(1 - \gamma_1)$, $\beta_2 = (-\gamma_1^4 - \gamma_1)/(1 - \gamma_1)$ and $\beta_3 = (\gamma_1^4 - \gamma_1)/(1 - \gamma_1)$. As $\beta_1 < \beta_2 < \beta_3 < 0$ and $\beta$ will generally be positive, (8) will hardly occur in practice. Using expressions for the asymptotic variances and covariances of order statistics (Sen et al. 2009, p. 223), we obtain

$$\mathrm{Var}[\hat{\mu}(p)] \cong \frac{\gamma_1 \sigma_{(1)}^2}{n} + \left[\frac{\gamma_1 - 2}{n} + E\left(\frac{1}{n_1}\right)\right] \frac{p(1 - p)}{\{f_{(1)}[F_{(1)}^{-1}(p)]\}^2} + \frac{\gamma_1 (1 - \gamma_1)[\mu_{(1)} - F_{(1)}^{-1}(p)]^2}{n} + \frac{2}{n}\left[E\left(\frac{1}{n_1}\right) - \frac{1}{n}\right] \times \left\{\sum_{j=1}^{k} \frac{p_j (1 - p)}{f_{(1)}[F_{(1)}^{-1}(p_j)]\, f_{(1)}[F_{(1)}^{-1}(p)]} + \sum_{j=k+1}^{n_1} \frac{p (1 - p_j)}{f_{(1)}[F_{(1)}^{-1}(p)]\, f_{(1)}[F_{(1)}^{-1}(p_j)]}\right\}, \tag{9}$$

where $f_{(1)}$ denotes the density of the observed units, $p_k < p < p_{k+1}$ and $p_j = j/n_1 + o(n_1^{-1/2})$, $j = 1, \ldots, n_1$, such that $0 < p_1 < p_2 < \cdots < p_{n_1} < 1$, e.g., $p_j$ may be equal to $(j - 0.5)/n_1$ or to one of the other three definitions discussed by Hyndman and Fan (1996) that satisfy their Property 5; we also assume that $f_{(1)}$ and $F_{(1)}$ are continuous at $F_{(1)}^{-1}(p)$ and at $F_{(1)}^{-1}(p_j)$, $j = 1, \ldots, n_1$, and that $f_{(1)}[F_{(1)}^{-1}(p)] > 0$ and $f_{(1)}[F_{(1)}^{-1}(p_j)] > 0$, $j = 1, \ldots, n_1$.

Figure 1 portrays estimates of the square root of (3)–(5) and (9) for the data in Table 1, obtained by replacing the parameters in the formulae by their sample analogues; $f_{(1)}$ in (9) was replaced by a Gaussian kernel density estimate (Silverman 1986). The four horizontal axes indicate equivalences among the four sensitivity parameters for estimating $\mu$; for instance, $p = 0.25$, $\beta \cong 0.92$, $\alpha \cong -75$ and $\mu_{(0)} = 870$ lead to the same $\hat{\mu}$ for public colleges, but to different estimates of its standard error. The standard errors obtained when $\mu_{(0)}$ is chosen as the sensitivity parameter are smaller than the corresponding ones based on the other sensitivity parameters: because $\alpha$, $\beta$ and $p$ relate the unobserved mean either to the mean or to a quantile of the observed distribution, these approaches include more uncertainty about $\mu_{(0)}$ than when assuming that $\mu_{(0)}$ is known. Hence, when there is a plausible guess for $\Omega_{\mu_{(0)}}$, $\mu_{(0)}$ has an advantage over the other sensitivity parameters because it generates a more precise estimator for $\mu$. The parameter spaces of $\alpha$ and $\beta$ are unbounded (like that of $\mu_{(0)}$) and the precisions of their estimates are not so different, suggesting that the choice between these two sensitivity parameters may be based exclusively on ease of interpretation. The sensitivity parameter $p$, on the other hand, has a bounded parameter space and the corresponding analysis can only provide reasonable answers if we believe that the missingness mechanism is not so "extreme" that the mean of the unobserved distribution would be smaller than the minimum (or larger than the maximum) observed value.

Fig. 1 Estimates of standard errors for $\hat{\mu}$ of CSAT using $\mu_{(0)}$, $\alpha$, $\beta$ or $p$ as sensitivity parameter for (a) public and (b) private colleges

Looking at the estimation of $\sigma$ in (2), in addition to employing $\sigma_{(0)}$ as a sensitivity parameter, one can alternatively work with $\lambda$, where $\sigma_{(0)} = \lambda \sigma_{(1)}$. Whether we use $(\mu_{(0)}, \sigma_{(0)})$ or any other parameterization, a simple way to obtain an estimate of the variance of $\hat{\sigma}$ is to employ the non-parametric bootstrap¹ (Efron and Gong 1983). As in the previous discussion, it is expected that $\mathrm{Var}[\hat{\sigma}(\sigma_{(0)})] < \mathrm{Var}[\hat{\sigma}(\lambda)]$ when $\hat{\sigma}(\sigma_{(0)}) = \hat{\sigma}(\lambda)$, for any of the parameterizations employed for $\mu_{(0)}$. For example, setting $\sigma_{(0)} = 134.3$ or $\lambda = 1.25$ for public colleges, both with $\mu_{(0)} = 870$, $\alpha \cong -75$, $\beta \cong 0.92$ or $p = 0.25$, we obtain the same estimate for $\sigma$ (126.4), but the estimates of the standard errors of $\hat{\sigma}$ are 2.6, 5.5, 2.3, 5.2, 2.3, 5.1, 2.9, 5.8, respectively, when employing $(\mu_{(0)}, \sigma_{(0)})$, $(\mu_{(0)}, \lambda)$, ..., $(p, \lambda)$. Although we do not have an explanation for why the standard errors obtained employing $\mu_{(0)}$ are slightly larger than the corresponding ones obtained using $\alpha$ and $\beta$, the main conclusion here is that the parameterization of the sensitivity parameter for $\mu_{(0)}$ seems to have less impact on the standard error of $\hat{\sigma}$ than the parameterization of $\sigma_{(0)}$.

¹ Wherever the bootstrap has been employed in the paper to estimate standard errors, 2,000 replicates of the statistic were generated.

One way to assess question (i) of Sect. 2 is by means of the parametric functions $\mu_{(PRI)} - \mu_{(PUB)}$ and $\sigma_{(PRI)}/\sigma_{(PUB)}$. As we do not have the opportunity to obtain expert information, we prefer to use the parameterization $(p, \lambda)$, for which our guesses are that (1) the mean of CSAT for both public and private colleges that did not report the CSAT might be between the 20% and 50% quantiles of the observed distributions of CSAT, and (2) the standard deviation of CSAT for colleges that did not report the CSAT might range from 0.80 to 1.25 times the corresponding standard deviation for colleges that reported it. We first consider sensitivity parameters individually for each level of college administration, $\Omega_{p(PUB)} = \Omega_{p(PRI)} = [0.2; 0.5]$ and $\Omega_{\lambda(PUB)} = \Omega_{\lambda(PRI)} = [0.80; 1.25]$, and then simplify to the case where the sensitivity parameters are equal for both levels of college administration, that is, $p = p_{(PUB)} = p_{(PRI)}$ and $\lambda = \lambda_{(PUB)} = \lambda_{(PRI)}$, and consequently, $\Omega_p = [0.2; 0.5]$ and $\Omega_\lambda = [0.80; 1.25]$.

Estimates and confidence intervals for $\mu_{(PRI)} - \mu_{(PUB)}$ are depicted in Fig. 2. Interestingly, as the empirical distribution function is a step function, $\hat{\mu}_{(PRI)} - \hat{\mu}_{(PUB)}$ is a monotone function when $p_{(PUB)}$ and $p_{(PRI)}$ are employed, but it is a non-monotone function when only $p$ is used. Had we employed $\mu_{(0)}$, $\alpha$ or $\beta$, this lack of monotonicity would not have been observed. Estimated intervals of ignorance and uncertainty for both parametric functions of interest are displayed in Table 3. If we believe that the sensitivity parameters are equal for both levels of college administration, we would conclude that the mean and the standard deviation of the private college CSATs are greater than the corresponding ones from the public colleges. This is in agreement with the commonly employed analysis of the complete data, which is valid under the assumption that $\mu_{(0)} = \mu_{(1)}$ and $\sigma_{(0)} = \sigma_{(1)}$ for each level of college administration, providing estimates and 95% confidence intervals for $\mu_{(PRI)} - \mu_{(PUB)}$ and $\sigma_{(PRI)}/\sigma_{(PUB)}$ equal to, respectively, 33.5 [16.2; 50.8] and 1.20 [1.07; 1.34]. However, as we do not have information that supports either of these strong assumptions, we rely on the results that employ one set of sensitivity parameters for each level of college administration, although they do not allow us to conclude whether there is a difference between the means or the standard deviations of public and private college CSATs. Reductions of the number of sensitivity parameters have to be conducted carefully so as not to miss the true underlying model; see also the discussion of Poleto et al. (2010) in the incomplete categorical data setting.
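The ordering of the variances in (3), (4) and (5) can be checked numerically. The sketch below evaluates the three formulas with sample plug-ins for public colleges (Table 1), using the approximation E(1/n1) ≅ 1/(nγ1 + 1 − γ1) quoted above. The function names are illustrative assumptions, and the resulting standard errors are a plug-in calculation, not values read from Fig. 1.

```python
import math


def inv_n1(n, g1):
    """Approximation E(1/n1) ~= 1 / (n * g1 + 1 - g1) quoted in the text."""
    return 1.0 / (n * g1 + 1.0 - g1)


def var_mu0(n, g1, mu1, sd1, mu0):
    """Formula (3): sensitivity parameter is the unobserved mean itself."""
    return g1 * sd1 ** 2 / n + g1 * (1 - g1) * (mu1 - mu0) ** 2 / n


def var_alpha(n, g1, sd1, alpha):
    """Formula (4): unobserved mean = alpha + observed mean."""
    return sd1 ** 2 * inv_n1(n, g1) + g1 * (1 - g1) * alpha ** 2 / n


def var_beta(n, g1, mu1, sd1, beta):
    """Formula (5): unobserved mean = beta * observed mean."""
    bracket = (beta ** 2 * inv_n1(n, g1)
               + 2 * beta * (1 - beta) / n
               + g1 * (1 - beta) ** 2 / n)
    return sd1 ** 2 * bracket + g1 * (1 - g1) * mu1 ** 2 * (1 - beta) ** 2 / n


# Public colleges: n = 470 colleges, 251 with CSAT observed.
n, n1 = 470, 251
g1 = n1 / n
mu1, sd1, mu0 = 945.3, 107.5, 870.0
alpha = mu0 - mu1   # about -75, same point estimate of the mean
beta = mu0 / mu1    # about 0.92, same point estimate of the mean

se = {
    "mu0": math.sqrt(var_mu0(n, g1, mu1, sd1, mu0)),
    "alpha": math.sqrt(var_alpha(n, g1, sd1, alpha)),
    "beta": math.sqrt(var_beta(n, g1, mu1, sd1, beta)),
}
for name, value in se.items():
    print(name, round(value, 2))
```

With β close to 0.92, which lies between β3 and 1, the ordering of the three values agrees with relation (7).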

Fig. 2 Estimates and confidence intervals for $\mu_{(PRI)} - \mu_{(PUB)}$ with (a) one sensitivity parameter for each level of college administration, $p_{(PUB)}$ and $p_{(PRI)}$, and (b) one common sensitivity parameter, $p$, for both public and private colleges

Table 3 HEIRs and 95% EUROs for $\mu_{(PRI)} - \mu_{(PUB)}$ and $\sigma_{(PRI)}/\sigma_{(PUB)}$ of CSAT

Parameter            Sens. par.(a)   HEIR            Weak EURO        Pointwise EURO   Strong EURO
μ(PRI) − μ(PUB)      individual      [−6.1; 72.9]    [−10.4; 77.2]    [−21.6; 88.2]    [−24.6; 91.2]
                     common          [26.5; 41.0]    [13.5; 53.7]     [10.8; 56.2]     [7.9; 59.1]
σ(PRI)/σ(PUB)        individual      [0.93; 1.54]    [0.91; 1.58]     [0.84; 1.69]     [0.83; 1.71]
                     common          [1.16; 1.24]    [1.06; 1.34]     [1.05; 1.35]     [1.03; 1.37]

(a) Individual and common sensitivity parameters for the college administrations

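4 Multivariate case

Let $\mathbf{Y}_i = (Y_{i1}, \ldots, Y_{iJ})'$, where $Y_{ij}$ denotes the $j$th response of the $i$th study unit, $i = 1, \ldots, n$, $j = 1, \ldots, J$. In addition, define a vector of response indicators $\mathbf{R}_i = (R_{i1}, \ldots, R_{iJ})'$ with $R_{ij} = 1$ if $Y_{ij}$ is observed and $R_{ij} = 0$ otherwise. Relations (1) and (2) are then the univariate versions of

$$\boldsymbol{\mu} = \sum_{\mathbf{r} \in \mathcal{R}} \gamma_{\mathbf{r}} \boldsymbol{\mu}_{(\mathbf{r})}, \tag{10}$$

$$\boldsymbol{\Sigma} = \sum_{\mathbf{r} \in \mathcal{R}} \gamma_{\mathbf{r}} \boldsymbol{\Sigma}_{(\mathbf{r})} + \sum_{\mathbf{r} \in \mathcal{R}} \gamma_{\mathbf{r}} (\boldsymbol{\mu}_{(\mathbf{r})} - \boldsymbol{\mu})(\boldsymbol{\mu}_{(\mathbf{r})} - \boldsymbol{\mu})', \tag{11}$$

where $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_J)' = E(\mathbf{Y}_i)$, $\boldsymbol{\Sigma} = \mathrm{Cov}(\mathbf{Y}_i)$, $\gamma_{\mathbf{r}} = P(\mathbf{R}_i = \mathbf{r})$, $\mathcal{R}$ contains all observed values of $\mathbf{r} = (r_1, \ldots, r_J)'$, $\boldsymbol{\mu}_{(\mathbf{r})} = (\mu_{1(\mathbf{r})}, \ldots, \mu_{J(\mathbf{r})})' = E(\mathbf{Y}_i \mid \mathbf{R}_i = \mathbf{r})$ and $\boldsymbol{\Sigma}_{(\mathbf{r})} = \mathrm{Cov}(\mathbf{Y}_i \mid \mathbf{R}_i = \mathbf{r})$. Often, we prefer to work with correlations, $\psi_{jk(\mathbf{r})} = \mathrm{Corr}(Y_{ij}, Y_{ik} \mid \mathbf{R}_i = \mathbf{r})$, instead of covariances, $\sigma_{jk(\mathbf{r})} = \mathrm{Cov}(Y_{ij}, Y_{ik} \mid \mathbf{R}_i = \mathbf{r})$, $j \ne k$, and therefore we let $\boldsymbol{\Sigma}_{(\mathbf{r})} = \mathbf{D}_{\boldsymbol{\sigma}(\mathbf{r})} \boldsymbol{\Psi}_{(\mathbf{r})} \mathbf{D}_{\boldsymbol{\sigma}(\mathbf{r})}$, where $\mathbf{D}_{\boldsymbol{\sigma}(\mathbf{r})}$ denotes a diagonal matrix with the elements of $\boldsymbol{\sigma}_{(\mathbf{r})}$ along the main diagonal, $\boldsymbol{\sigma}_{(\mathbf{r})} = (\sigma_{j(\mathbf{r})}, j = 1, \ldots, J)'$, $\sigma_{j(\mathbf{r})}^2 = \mathrm{Var}(Y_{ij} \mid \mathbf{R}_i = \mathbf{r})$ and $\boldsymbol{\Psi}_{(\mathbf{r})} = \mathrm{Corr}(\mathbf{Y}_i \mid \mathbf{R}_i = \mathbf{r})$; the corresponding definitions for the unconditional variances, covariances and correlations follow analogously.

In the tetravariate version of Table 2, there are 16 missingness patterns and 136 non-identifiable parameters for estimating $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ using (10) and (11); among these sensitivity parameters, there are 32 means, 32 standard deviations and 72 correlations, as indicated by the non-observed values in Table 2. For $J = 10$ variables subject to missingness, there are $2^J = 1{,}024$ potential missingness patterns that generate 44,800 non-identifiable parameters: $J \times 2^{J-1} = 5{,}120$ means, 5,120 standard deviations and $\sum_{j=0}^{J-1} \binom{J}{j} \left[\binom{J}{2} - \binom{j}{2}\right] = 34{,}560$ correlations. The challenge here is not only that the number of sensitivity parameters increases exponentially depending on the missingness patterns and on the number of variables, but also that there are additional options for the parameterization. For instance, in the tetravariate case, instead of using $\mu_{1(0,r_2,r_3,r_4)}$, $r_2, r_3, r_4 = 0, 1$, as sensitivity parameters to estimate $\mu_1$, we may prefer to employ functions of these parameters along with the identifiable ones, namely, $\mu_{1(1,r_2,r_3,r_4)}$, $r_2, r_3, r_4 = 0, 1$, or yet to use other options, such as quantiles of the distribution of $Y_{i1}$ conditionally on $\mathbf{R}_i = (1, r_2, r_3, r_4)'$, for some values of $r_2, r_3, r_4$. Another alternative is to use all available data for the $j$th variable, that is, the mean $\mu_{j(1)}$ or the quantile of the distribution of $Y_{ij}$ conditionally on $R_{ij} = 1$, $F_{j(1)}$, e.g., $\mu_{1(0,r_2,r_3,r_4)} = \alpha_{1(0,r_2,r_3,r_4)} + \mu_{1(1)}$, $\mu_{1(0,r_2,r_3,r_4)} = \beta_{1(0,r_2,r_3,r_4)} \mu_{1(1)}$, or $\mu_{1(0,r_2,r_3,r_4)} = F_{1(1)}^{-1}(p_{1(0,r_2,r_3,r_4)})$.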
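The parameter counts quoted above can be reproduced by brute-force enumeration of the missingness patterns. The following sketch assumes that all 2^J patterns occur (as they do in Table 2) and counts, per pattern, one non-identifiable mean and standard deviation for each missing variable and one non-identifiable correlation for each pair of variables not jointly observed. The function name `count_nonidentifiable` is an illustrative assumption.

```python
from itertools import product


def count_nonidentifiable(J):
    """Count non-identifiable means, SDs and correlations over all 2**J patterns."""
    means = sds = corrs = 0
    all_pairs = J * (J - 1) // 2
    for r in product((0, 1), repeat=J):   # r[j] = 1 if variable j is observed
        observed = sum(r)
        missing = J - observed
        means += missing                  # one unknown mean per missing variable
        sds += missing                    # one unknown SD per missing variable
        # A correlation is identifiable only if both variables are observed.
        corrs += all_pairs - observed * (observed - 1) // 2
    return means, sds, corrs


for J in (4, 10):
    m, s, c = count_nonidentifiable(J)
    print(J, m, s, c, m + s + c)
```

For J = 4 this gives 32, 32 and 72 (136 in total), and for J = 10 it gives 5,120, 5,120 and 34,560 (44,800 in total), matching the figures in the text.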
Analogously, and as in the previous section, there is also the possibility of considering reparameterizations of the form $\sigma_{1(0,r_2,r_3,r_4)} = \lambda_{1(0,r_2,r_3,r_4)} \sigma_{1(1)}$, where $\sigma_{j(1)}^2 = \mathrm{Var}(Y_{ij} \mid R_{ij} = 1)$.

The lower (upper) bounds of the ignorance interval for each parametric function of interest may be obtained by minimizing (maximizing) the corresponding function over the appropriate sensitivity parameters. In the tetravariate case, for example, when the target is the mean $\mu_j$, the optimization is carried out over only the eight non-identifiable means of $\{\mu_{j(\mathbf{r})}\}$ and, for standard deviations, it may be performed over the eight non-identifiable means of $\{\mu_{j(\mathbf{r})}\}$ and the eight non-identifiable standard deviations of $\{\sigma_{j(\mathbf{r})}\}$ (16 sensitivity parameters). For each correlation, the number of sensitivity parameters to be handled in the optimization increases to 44 (16 means, 16 standard deviations and 12 correlations). So, in practice, all the 136 sensitivity parameters would only be used simultaneously if the target were some specific function of $(\boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\Psi})$.

Because of the large number of alternative parameterizations that may be combined for a set of variables, we computed the variance of $\hat{\boldsymbol{\mu}}$ only for the case where the sensitivity parameters are the non-identifiable elements in $\boldsymbol{\mu}_{(\mathbf{r})}$. In Appendix B, this result as well as (10) and (11) are expressed in a matrix formulation useful for computational implementation. For other parameterizations of the sensitivity parameters, we appeal to the non-parametric bootstrap to obtain estimates for the standard errors of $\hat{\boldsymbol{\mu}}$, $\hat{\boldsymbol{\sigma}}$, $\hat{\boldsymbol{\Psi}}$ and functions thereof.

To evaluate question (ii) of Sect. 2, our conjectures about the non-identifiable parameters are that (1) the non-observed means of GRADRAT and of the other variables might range, respectively, between the 30%–50% and 20%–50% quantiles of the corresponding observed distributions, (2) the non-observed standard deviations of all variables might be between 0.80 and 1.25 times the standard deviations of the corresponding observed distributions, and (3) the non-observed linear correlations of (a) RMBRD vs. the other three variables might range from 0.0 to 0.5, (b) GRADRAT vs. CSAT and GRADRAT vs. ACT might be between 0.2 and 0.8, and (c) CSAT vs. ACT might range from 0.5 to 1.0. The underlying reparameterizations employed for the corresponding sensitivity parameters are $\mu_{j(\mathbf{r})} = F_{j(1)}^{-1}(p_{j(\mathbf{r})})$ and $\sigma_{j(\mathbf{r})} = \lambda_{j(\mathbf{r})} \sigma_{j(1)}$, $\forall \mathbf{r}$ such that $r_j = 0$, and the implied assumptions are (1) $\Omega_{p_{j(\mathbf{r})}} = [0.3; 0.5]$ for $j = 1$ and $\Omega_{p_{j(\mathbf{r})}} = [0.2; 0.5]$ for $j = 2, 3, 4$, $\forall \mathbf{r}$ such that $r_j = 0$, (2) $\Omega_{\lambda_{j(\mathbf{r})}} = [0.80; 1.25]$, $\forall j, \mathbf{r}$ such that $r_j = 0$, (3) $\Omega_{\psi_{jk(\mathbf{r})}}$, $\forall \mathbf{r}$ such that $r_j = 0$ and/or $r_k = 0$, is (a) $[0.0; 0.5]$ for $j = 1, 2, 3$, $k = 4$, (b) $[0.2; 0.8]$ for $j = 1$, $k = 2, 3$, and (c) $[0.5; 1.0]$ for $j = 2$, $k = 3$.

In Table 4 we exhibit estimated intervals of ignorance and uncertainty for means, standard deviations and correlations of the four variables. We conclude that each pair of variables is positively linearly correlated; the magnitude of the correlations, however, is difficult to assess, given the ignorance caused by the missing data. As expected, the intervals are wider for pairs of variables that simultaneously had more missing data (see Table 2). Complete and available case analyses also point to a positive association among the variables, but as they do not account for the uncertainty caused by the missing data, their 95% confidence intervals (not shown) are much narrower than any of the EUROs.

Table 4 HEIRs and 95% EUROs for means, standard deviations (SDs) and correlations of GRADRAT (G), CSAT (C), ACT (A) and RMBRD (R)

Variable    Parameter   HEIR              Weak EURO         Pointwise EURO    Strong EURO
GRADRAT     Mean        [59.6; 60.3]      [58.8; 61.1]      [58.7; 61.2]      [58.5; 61.4]
            SD          [18.7; 19.3]      [18.3; 19.7]      [18.1; 19.9]      [18.0; 20.0]
CSAT        Mean        [926.7; 963.6]    [924.4; 965.8]    [918.9; 971.1]    [917.4; 972.5]
            SD          [114.7; 146.0]    [112.8; 148.3]    [108.7; 153.3]    [107.6; 154.7]
ACT         Mean        [21.16; 22.07]    [21.17; 22.06]    [21.08; 22.15]    [21.06; 22.16]
            SD          [2.43; 2.95]      [2.38; 3.01]      [2.31; 3.10]      [2.28; 3.13]
RMBRD       Mean        [3.75; 4.09]      [3.73; 4.11]      [3.69; 4.15]      [3.68; 4.17]
            SD          [1.09; 1.38]      [1.08; 1.40]      [1.04; 1.44]      [1.03; 1.46]
G×C         Corr.       [0.37; 0.69]      [0.38; 0.69]      [0.35; 0.71]      [0.34; 0.72]
G×A         Corr.       [0.31; 0.66]      [0.31; 0.66]      [0.28; 0.69]      [0.28; 0.69]
G×R         Corr.       [0.24; 0.52]      [0.24; 0.52]      [0.22; 0.55]      [0.21; 0.55]
C×A         Corr.       [0.59; 0.95]      [0.60; 0.95]      [0.57; 0.97]      [0.57; 0.97]
C×R         Corr.       [0.11; 0.55]      [0.12; 0.54]      [0.08; 0.57]      [0.07; 0.58]
A×R         Corr.       [0.05; 0.46]      [0.06; 0.46]      [0.02; 0.49]      [0.02; 0.50]
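Relations (10) and (11) of this section, together with the decomposition of each pattern covariance into standard deviations and correlations, can be sketched in a few lines of Python. The code below is a minimal pure-Python illustration under stated assumptions: the function names are hypothetical, the inputs are per-pattern probabilities, mean vectors and covariance matrices with any sensitivity parameters already fixed, and the two-pattern bivariate numbers are toy values, not taken from the colleges data.

```python
import math


def cov_from_sd_corr(sds, corr):
    """Pattern covariance built as D * Psi * D from SDs and a correlation matrix."""
    J = len(sds)
    return [[sds[j] * corr[j][k] * sds[k] for k in range(J)] for j in range(J)]


def mixture_moments(gammas, means, covs):
    """Relations (10) and (11): overall mean vector and covariance matrix."""
    J = len(means[0])
    mu = [sum(g * m[j] for g, m in zip(gammas, means)) for j in range(J)]
    sigma = [[sum(g * (c[j][k] + (m[j] - mu[j]) * (m[k] - mu[k]))
                  for g, m, c in zip(gammas, means, covs))
              for k in range(J)]
             for j in range(J)]
    return mu, sigma


def correlation(sigma, j, k):
    """Overall correlation between variables j and k."""
    return sigma[j][k] / math.sqrt(sigma[j][j] * sigma[k][k])


# Toy example: two equally likely patterns, uncorrelated within each pattern.
identity = [[1.0, 0.0], [0.0, 1.0]]
gammas = [0.5, 0.5]
means = [[0.0, 0.0], [2.0, 2.0]]
covs = [cov_from_sd_corr([1.0, 1.0], identity)] * 2

mu, sigma = mixture_moments(gammas, means, covs)
print(mu, sigma, correlation(sigma, 0, 1))
```

In the toy example the variables are uncorrelated within each pattern, yet the overall correlation is positive because the pattern means differ. This is why non-identifiable means and standard deviations, and not only non-identifiable correlations, enter the optimization when the target is a correlation.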

5 Assessment of the uncertainty intervals

To construct EUROs, Vansteelandt et al. (2006) assume that the values of the sensitivity parameters that correspond to the lower and upper bounds of the ignorance interval are independent of the observed data. This assumption is satisfied when the target is the mean, but it may fail for standard deviations and correlations. Consider, for example, the target $\sigma$ in the univariate setting of Sect. 3, and assume that $\Omega_{\sigma_{(0)}} = [\sigma_{(0)}^L; \sigma_{(0)}^U]$ and $\Omega_{\mu_{(0)}} = [\mu_{(0)}^L; \mu_{(0)}^U]$ are specified. After replacing the identifiable parameters in (2) by their sample counterparts, we note that $\sigma_{(0)}^L$ and $\sigma_{(0)}^U$ are the values of $\sigma_{(0)}$ in the set $\Omega_{\sigma_{(0)}}$ that, respectively, minimize and maximize $\hat{\sigma}(\mu_{(0)}, \sigma_{(0)})$ irrespective of the data and of $\mu_{(0)}$. However, as

$$\arg\min_{\mu_{(0)} \in \Omega_{\mu_{(0)}}} \hat{\sigma}(\mu_{(0)}, \sigma_{(0)}) = \begin{cases} \mu_{(0)}^L, & \text{if } \hat{\mu}_{(1)} < \mu_{(0)}^L, \\ \mu_{(0)}^U, & \text{if } \hat{\mu}_{(1)} > \mu_{(0)}^U, \\ \hat{\mu}_{(1)}, & \text{otherwise}, \end{cases}$$

$$\arg\max_{\mu_{(0)} \in \Omega_{\mu_{(0)}}} \hat{\sigma}(\mu_{(0)}, \sigma_{(0)}) = \begin{cases} \mu_{(0)}^U, & \text{if } (\hat{\mu}_{(1)} - \mu_{(0)}^U)^2 > (\hat{\mu}_{(1)} - \mu_{(0)}^L)^2, \\ \mu_{(0)}^L, & \text{otherwise} \end{cases}$$

clearly depend on the data through $\hat{\mu}_{(1)}$, the assumption is violated. We note, though, that for this specific case, with only two missingness patterns, the assumption is satisfied for standard deviations if we switch the sensitivity parameter from $\mu_{(0)}$ to $\alpha$ or $\beta$. For the difference of means and the ratio of standard deviations considered in Sect. 3, the assumption is satisfied when there is one set of sensitivity parameters for each level of college administration and, additionally for the ratio of standard deviations, if the parameterization with $\alpha$ or $\beta$ is employed; otherwise, the assumption fails.

Vansteelandt et al. (2006) present another example wherein the assumption fails and, therefore, compute bootstrap estimates to assess the coverage probability of the uncertainty intervals. Following their ideas, we resample the dataset and repeat the analyses B = 5,000 times.
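The data dependence of the minimizing and maximizing values of the unobserved mean can be illustrated with a short sketch of the two case distinctions given earlier in this section. The function names and the interval [900; 960] are hypothetical choices made only for this illustration; 945.3 is the observed CSAT mean for public colleges from Table 1, and 925.0 stands in for the observed mean in an imagined bootstrap replicate.

```python
def argmin_mu0(mu1_hat, low, high):
    """Value of mu0 in [low, high] minimizing the estimated SD.

    The between-pattern term in (2) vanishes when mu0 equals the observed
    mean, so the minimizer is the point of the interval closest to mu1_hat.
    """
    if mu1_hat < low:
        return low
    if mu1_hat > high:
        return high
    return mu1_hat


def argmax_mu0(mu1_hat, low, high):
    """Value of mu0 in [low, high] maximizing the estimated SD.

    The maximizer is the endpoint farthest from mu1_hat.
    """
    if (mu1_hat - high) ** 2 > (mu1_hat - low) ** 2:
        return high
    return low


low, high = 900.0, 960.0            # hypothetical range for mu0
for mu1_hat in (945.3, 925.0):      # original sample vs. imagined replicate
    print(mu1_hat, argmin_mu0(mu1_hat, low, high), argmax_mu0(mu1_hat, low, high))
```

With the same interval, the two observed means select different optimizing values, which is the sense in which the bounds of the ignorance interval depend on the data.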
The estimate of coverage for the strong EURO is the proportion of the B strong EUROs that contain the HEIR of the original analysis; for the pointwise EURO, it is the minimum between the proportion of the B pointwise EUROs that contain the lower bound of the HEIR of the original analysis and the corresponding proportion that contain the upper bound; finally, the estimate of coverage for the weak EURO is the mean of the B lengths of the intersections between the weak EUROs and the HEIR of the original analysis divided by the length of this HEIR. We also evaluate the coverages for the case where the standard errors and the uncertainty intervals are estimated on transformed scales, where we expect that the asymptotic normality, assumed for the construction of the EUROs, would be a good approximation; afterwards, the EUROs are back-transformed to the original scales. We considered the logit function for the mean of GRADRAT, the logarithm for the mean of RMBRD as well as for all standard deviations or the ratio of standard deviations, and Fisher's z-transformation for all correlations, i.e., $0.5 \log[(1 + \psi_{jk})/(1 - \psi_{jk})]$.² We display the estimates of the coverage of the EUROs for the multivariate and the univariate analyses considered above in Tables 5 and 6. For the multivariate analysis (Table 5), the transformed scales do not bring the coverage of the EUROs closer to the nominal 95% level, and in some cases the results are even worse than the ones obtained on the

² These EUROs are not shown because they differ from the results presented for the original scales in Tables 3 and 4 by zero or only one unit in the last digit and, hence, do not alter any conclusions.


Table 5 Bootstrap estimates of coverage (×100) for 95% EUROs of means, standard deviations (SDs) and correlations of GRADRAT (G), CSAT (C), ACT (A) and RMBRD (R) for the analyses of Table 4

Variable   Parameter   Weak EURO              Pointwise EURO         Strong EURO
                       original   transf.^a   original   transf.^a   original   transf.^a
GRADRAT    Mean        93.9       94.3        93.6       94.1        93.5       93.9
GRADRAT    SD          94.8       94.7        94.4       94.6        94.7       94.4
CSAT       Mean        (see note b)
CSAT       SD          (see note b)
ACT        Mean        (see note b)
ACT        SD          95.0       94.7        93.6       93.5        94.7       93.8
RMBRD      Mean        93.6       93.7        88.6       88.5        89.8       89.8
RMBRD      SD          94.7       94.6        92.8       93.5        94.0       93.8
G×C        Corr.       95.8       95.9        96.4       95.4        96.8       95.8
G×A        Corr.       93.7       94.0        89.7       90.7        91.7       90.8
G×R        Corr.       94.9       95.1        94.3       93.8        94.4       94.4
C×A        Corr.       95.1       96.4        86.4       82.0        90.8       86.5
C×R        Corr.       95.5       95.7        95.8       95.8        95.8       95.8
A×R        Corr.       95.9       96.1        94.2       94.0        96.4       96.4

^a Transformed scales: logit for the mean of GRADRAT, logarithm for the mean of RMBRD and all SDs, and Fisher's z-transformation for all correlations
^b The cell alignment of these three rows is not recoverable from the source; their entries appear there in the order 95.4, 95.0, 95.1, 96.0, 95.0, 95.6, 95.3, 94.4, 94.2, 95.7, 95.4, 96.8

Table 6 Bootstrap estimates of coverage (×100) for 95% EUROs of μ(PRI) − μ(PUB) and σ(PRI)/σ(PUB) of CSAT for the analyses of Table 3

Parameter         Sens. Par.^a   Scale       Weak EURO   Pointwise EURO   Strong EURO
μ(PRI) − μ(PUB)   individual     original    94.5        93.2             92.7
                  common         original    95.0        94.3             93.5
σ(PRI)/σ(PUB)     individual     original    92.6        82.5             85.0
                  individual     logarithm   94.7        94.3             94.3
                  common         original    94.4        94.2             94.5
                  common         logarithm   94.7        94.2             94.6

^a Individual and common sensitivity parameters for the college administrations

original scale. On the other hand, the coverage for the EUROs of the ratio of standard deviations (Table 6) is greatly improved on the logarithm scale when no reduction in the number of sensitivity parameters is considered. With the exception of the pointwise and strong EUROs for the mean of RMBRD, for the correlations of GRADRAT and CSAT vs. ACT, and for the ratio of standard deviations, the coverages are not far from 95%. Weak EUROs are in general much closer to the nominal level than the other two alternatives. It is interesting to note that, although Vansteelandt et al. (2006) state that weak EUROs, contrary to pointwise and strong EUROs, cannot generally be estimated on a monotonically transformed scale while retaining the original coverage level, for the current analyses, the coverages of the weak EUROs are, in general, closer to the nominal level in the cases for which the transformed scale also improves the coverage of the strong and/or the pointwise EUROs.

Table 7 Bootstrap estimates of coverage (×100) for 95% EUROs of μ(PRI) − μ(PUB) and σ(PRI)/σ(PUB) of CSAT

Parameter         Sens. Par.^a    Scale       Weak EURO   Pointwise EURO   Strong EURO
μ(PRI) − μ(PUB)   μ(0)            original    95.0        95.1             95.3
                  α               original    95.0        94.6             94.9
                  β               original    95.0        94.6             94.8
                  p               original    94.5        93.2             92.7
σ(PRI)/σ(PUB)     (μ(0), σ(0))    original    93.7        83.1             84.8
                                  logarithm   94.9        94.4             94.8
                  (μ(0), λ)       original    92.9        82.7             85.1
                                  logarithm   94.8        94.6             94.7
                  (α, σ(0))       original    93.7        81.5             84.2
                                  logarithm   94.8        94.1             94.2
                  (α, λ)          original    92.9        82.6             84.7
                                  logarithm   94.8        94.5             94.6
                  (β, σ(0))       original    93.7        81.5             84.1
                                  logarithm   94.8        94.1             94.2
                  (β, λ)          original    92.9        82.7             84.7
                                  logarithm   94.8        94.6             94.6
                  (p, σ(0))       original    93.2        81.4             83.7
                                  logarithm   94.5        94.2             93.7
                  (p, λ)          original    92.6        82.5             85.0
                                  logarithm   94.7        94.3             94.3

^a One of these sensitivity parameters for each level of college administration

In Table 7 we show the estimates of the coverages for the EUROs corresponding to the univariate analysis, varying the parameterizations for the case with one set of sensitivity parameters for each level of college administration. The ranges of the sensitivity parameters are chosen in such a way that the HEIRs match the corresponding ones of Table 3. When comparing the results in Tables 5, 6 and 7, there is no clear evidence that the coverages are closer to the nominal level when the values of the sensitivity parameters that correspond to the lower and upper bounds of the ignorance interval are independent of the observed data. Indeed, the results may be more influenced by the quality of the normal approximation for the distributions of the estimators, which in turn may depend upon the choice of the sensitivity parameter. For both parametric functions in Table 7, the coverages of all the EUROs are the closest to the nominal level for μ(0), the farthest from 95% for p, and very similar for α and β. For σ(PRI)/σ(PUB), the coverages are slightly closer to the nominal level when σ(0) is chosen in lieu of λ, for weak EUROs when either μ(0), α, β or p is employed, and also for the pointwise EURO obtained under the original scale with μ(0); the roles of σ(0) and λ are reversed for all strong EUROs and the other pointwise EUROs.
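The three bootstrap coverage estimates defined earlier in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names are hypothetical, intervals are represented as (lower, upper) tuples, and the list `euros` stands for the B intervals obtained from the bootstrap resamples.

```python
def strong_coverage(euros, heir):
    """Proportion of EUROs that contain the whole HEIR of the original analysis."""
    lo, hi = heir
    return sum(l <= lo and hi <= u for l, u in euros) / len(euros)


def pointwise_coverage(euros, heir):
    """Minimum of the proportions of EUROs covering each bound of the HEIR."""
    lo, hi = heir
    cover_lower = sum(l <= lo <= u for l, u in euros) / len(euros)
    cover_upper = sum(l <= hi <= u for l, u in euros) / len(euros)
    return min(cover_lower, cover_upper)


def weak_coverage(euros, heir):
    """Mean length of the intersection EURO ∩ HEIR divided by the HEIR length."""
    lo, hi = heir
    overlaps = [max(0.0, min(u, hi) - max(l, lo)) for l, u in euros]
    return sum(overlaps) / len(euros) / (hi - lo)


if __name__ == "__main__":
    # Hypothetical intervals standing in for B = 4 bootstrap EUROs.
    euros = [(0.0, 10.0), (2.0, 10.0), (0.0, 7.0), (3.0, 6.0)]
    heir = (1.0, 8.0)
    print(strong_coverage(euros, heir),
          pointwise_coverage(euros, heir),
          weak_coverage(euros, heir))
```

In an actual assessment each function would be applied to the EUROs of the matching type (strong, pointwise or weak), since the three versions use different critical values.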

6 Discussion

Selection models and pattern-mixture models are likely the most common frameworks for incomplete data modelling. In the univariate case, Scharfstein et al. (2003) showed that the assumption

$$
\operatorname{logit} P(R_i = 0 \mid Y_i = y) = \text{constant} + q(y), \tag{12}
$$

where $q$ is the so-called selection-bias function, under the selection model is equivalent to the restriction

$$
f_{(0)}(y) = f_{(1)}(y)\, \frac{\exp[q(y)]}{\int_{-\infty}^{\infty} \exp[q(s)]\, f_{(1)}(s)\, ds}, \quad \forall y, \tag{13}
$$

under the pattern-mixture model. Beyond the direct interpretation of these expressions, they also noted that, if $q(y) = \delta \log(y)$, for example, it follows from (12) that $\exp(\delta)$ is the odds ratio of missingness between subjects who differ by one unit of $\log(y)$; from (13), we may conclude that $\delta > 0$ ($< 0$) indicates that the distribution of $Y_i$ for the missing outcomes is more (less) heavily weighted towards large values of $Y_i$ than the distribution of $Y_i$ for the observed outcomes. These insights are fundamental to carry out a sensitivity analysis. Nevertheless, the functional form of $q(y)$ as well as the range of values to be considered for $\delta$ are hard to assess.

When the target of inference is the mean, the standard deviation, the correlation or some function thereof, we may employ their non-identifiable counterparts as sensitivity parameters. These sensitivity parameters are easier to elicit than the selection-bias functions because the former are directly related to the parameters of interest. However, there are connections between both strategies; for example, for a specified $q(y)$, we can use (13) to compute the corresponding results for $\mu_{(0)}$, $\alpha$, $\beta$, $p$, $\sigma_{(0)}$, and $\lambda$ considered in Sect. 3.

Some of these ideas on parameterization were considered previously in the literature. For instance, Rubin (1977) uses (1) to develop a Bayesian solution assuming normality and Daniels and Hogan (2000) consider a pattern-mixture model of multivariate normal distributions wherein the model identification is accomplished through $\mathbf{b}_{(d)} = \boldsymbol{\mu}_{(d)} - \boldsymbol{\mu}_{(d+1)}$ and $\mathbf{C}_{(d)} = \boldsymbol{\Sigma}_{(d)}^{1/2} \boldsymbol{\Sigma}_{(d+1)}^{-1/2}$, $d = 1, \ldots, J$, where $d = 1 + \sum_{j=1}^{J} r_j$ is the drop-out indicator and $\mathbf{b}_{(d)}$ and $\mathbf{C}_{(d)}$ are, respectively, pre-specified vectors and matrices. We extend these results by dropping the normality assumption and allowing a greater flexibility for the identification of the model.
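As an aside on the connection between (13) and the sensitivity parameters mentioned above, the following Python sketch applies an empirical version of (13) with q(y) = δ log(y): the unknown density of the observed outcomes is replaced by the empirical distribution of a sample, so that (13) reduces to reweighting the observed values by exp[q(y)]. The function names and data are hypothetical. Only μ(0), σ(0), λ (taken here as the ratio σ(0)/σ(1)) and p (taken as the observed distribution function evaluated at μ(0)) are computed; α and β are left out because their exact definitions are given in Sect. 3 and are not reproduced in this section.

```python
import math


def tilted_moments(y_obs, delta):
    """Mean and SD of the tilted distribution implied by (13), q(y) = delta*log(y).

    Assumption: f_(1) is replaced by the empirical distribution of y_obs,
    whose values must be positive.
    """
    weights = [math.exp(delta * math.log(y)) for y in y_obs]
    total = sum(weights)
    w = [wi / total for wi in weights]
    mu0 = sum(wi * yi for wi, yi in zip(w, y_obs))
    var0 = sum(wi * (yi - mu0) ** 2 for wi, yi in zip(w, y_obs))
    return mu0, math.sqrt(var0)


def implied_sensitivity(y_obs, delta):
    """Values of mu0, sigma0, lambda and p implied by a given delta."""
    n = len(y_obs)
    mu1 = sum(y_obs) / n
    # Divisor n (not n - 1) so that delta = 0 gives lambda = 1.
    sd1 = math.sqrt(sum((yi - mu1) ** 2 for yi in y_obs) / n)
    mu0, sd0 = tilted_moments(y_obs, delta)
    p = sum(yi <= mu0 for yi in y_obs) / n  # empirical F_(1) at mu0
    return {"mu0": mu0, "sigma0": sd0, "lambda": sd0 / sd1, "p": p}


if __name__ == "__main__":
    # Hypothetical observed scores (not data from the paper).
    scores = [800.0, 900.0, 1000.0, 1100.0, 1200.0]
    for delta in (-1.0, 0.0, 1.0):
        print(delta, implied_sensitivity(scores, delta))
```

Scanning a range of δ values in this way gives the corresponding ranges of the more directly interpretable sensitivity parameters, which is one way to compare the two elicitation strategies.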
First, we not only consider absolute differences of means of missingness patterns but also relative differences and the possibility of relating non-observed means to quantiles of observed distributions. Second, we replace the hard-to-elicit functions of covariance matrices by relative differences of standard deviations and by correlations. Third, in the multivariate case, we show that a non-identifiable mean (or standard deviation) may be related to an


identifiable one not only in cases with a single specific missingness pattern but also in cases with sets of missingness patterns wherein the corresponding variable is observed. With these alternatives, it is easier to extract information from experts or from historical data and, consequently, to produce meaningful and more plausible sensitivity analyses. When the interest lies only in functions of the means, an advantage of our approach is that we do not need to specify ranges for the sensitivity parameters of standard deviations (and correlations), in contrast to Daniels and Hogan (2000), who have to identify the adopted multivariate normal distributions and, as a consequence, show that their posterior standard deviations of the functions of the means are influenced by the choice of these sensitivity parameters. However, a drawback of our approach is that there is no way to set the sensitivity parameters to a value that corresponds to the MAR assumption.

Acknowledgement The authors are grateful to an associate editor and a referee for their enlightening and constructive comments.

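Appendix A: Estimated uncertainty regions

For a scalar parameter $\pi$ of interest, the HEIR obtained by setting $\omega$ equal to $\omega_l$ and $\omega_u$ is denoted by $\widehat{ir}(\pi, \Omega_{\omega}) = [\hat{\pi}(\omega_l), \hat{\pi}(\omega_u)] = [\hat{\pi}_l, \hat{\pi}_u]$. Vansteelandt et al. (2006) provided algorithms for constructing the three versions of EUROs defined in Sect. 3, all with the usual form

$$
[\hat{\pi}_l - c_{\alpha^*/2}\, \mathrm{se}(\hat{\pi}_l),\; \hat{\pi}_u + c_{\alpha^*/2}\, \mathrm{se}(\hat{\pi}_u)],
$$

where $\hat{\pi}_l$ and $\hat{\pi}_u$ are obtained from consistent and asymptotically normal estimators of $\pi_l$ and $\pi_u$, and $\mathrm{se}(\hat{\pi}_l)$ and $\mathrm{se}(\hat{\pi}_u)$ are obtained from consistent estimators of the standard errors of $\hat{\pi}_l$ and $\hat{\pi}_u$. For strong EUROs, the critical value $c_{\alpha^*/2}$ is the $100(1 - \alpha/2)\%$ percentile of the standard normal distribution. For pointwise EUROs, $c_{\alpha^*/2}$ is the solution of

$$
\min\left\{ \Phi(c_{\alpha^*/2}) - \Phi\!\left(-c_{\alpha^*/2} - \frac{\hat{\pi}_u - \hat{\pi}_l}{\mathrm{se}(\hat{\pi}_u)}\right),\; \Phi\!\left(c_{\alpha^*/2} + \frac{\hat{\pi}_u - \hat{\pi}_l}{\mathrm{se}(\hat{\pi}_l)}\right) - \Phi(-c_{\alpha^*/2}) \right\} = 1 - \alpha,
$$

where $\Phi$ denotes the standard normal cumulative distribution function. For weak EUROs, $c_{\alpha^*/2}$ is the solution of

$$
\alpha = \frac{\mathrm{se}(\hat{\pi}_l) + \mathrm{se}(\hat{\pi}_u)}{\hat{\pi}_u - \hat{\pi}_l} \int_{0}^{+\infty} z\, \varphi(z + c_{\alpha^*/2})\, dz + \varepsilon,
$$

where $\varphi$ is the standard normal density function and $\varepsilon$ is the correction term

$$
\begin{aligned}
\varepsilon ={}& \int_{(\hat{\pi}_u - \hat{\pi}_l)/\mathrm{se}(\hat{\pi}_u)}^{+\infty} \varphi(z + c_{\alpha^*/2})\, dz - \frac{\mathrm{se}(\hat{\pi}_u)}{\hat{\pi}_u - \hat{\pi}_l} \int_{(\hat{\pi}_u - \hat{\pi}_l)/\mathrm{se}(\hat{\pi}_u)}^{+\infty} z\, \varphi(z + c_{\alpha^*/2})\, dz \\
&+ \int_{(\hat{\pi}_u - \hat{\pi}_l)/\mathrm{se}(\hat{\pi}_l)}^{+\infty} \varphi(z + c_{\alpha^*/2})\, dz - \frac{\mathrm{se}(\hat{\pi}_l)}{\hat{\pi}_u - \hat{\pi}_l} \int_{(\hat{\pi}_u - \hat{\pi}_l)/\mathrm{se}(\hat{\pi}_l)}^{+\infty} z\, \varphi(z + c_{\alpha^*/2})\, dz
\end{aligned}
$$

that may be set equal to zero unless the sample size is small and/or there is little ignorance about $\pi$. When there is much ignorance about $\pi$ and the sample size is large, pointwise EUROs approach strong EUROs.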


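Appendix B: Matrix expressions

With some algebra, (10) and (11) may be conveniently rewritten as

$$
\boldsymbol{\mu} = \boldsymbol{\gamma}^{*} \boldsymbol{\mu}_{(*)}, \qquad
\boldsymbol{\Sigma} = \boldsymbol{\gamma}^{*} \boldsymbol{\Sigma}_{(*)} (\mathbf{1}_P \otimes \mathbf{I}_J) + \boldsymbol{\gamma}^{*} \mathbf{S}_{(*)} (\mathbf{1}_P \otimes \mathbf{I}_J),
$$

where $\boldsymbol{\gamma}^{*} = \boldsymbol{\gamma}' \otimes \mathbf{I}_J$, $\boldsymbol{\gamma} = (\gamma_{\mathbf{r}}, \mathbf{r} \in R)'$, $\otimes$ denotes the Kronecker product, $\mathbf{I}_J$ represents an identity matrix of order $J$, $\boldsymbol{\mu}_{(*)} = (\boldsymbol{\mu}_{(\mathbf{r})}', \mathbf{r} \in R)'$, $\boldsymbol{\Sigma}_{(*)}$ and $\mathbf{S}_{(*)}$ are block diagonal matrices with blocks $\boldsymbol{\Sigma}_{(\mathbf{r})}$ and $\mathbf{S}_{(\mathbf{r})} = (\boldsymbol{\mu}_{(\mathbf{r})} - \boldsymbol{\mu})(\boldsymbol{\mu}_{(\mathbf{r})} - \boldsymbol{\mu})'$, $\mathbf{r} \in R$, respectively, $\mathbf{1}_P$ denotes a $P \times 1$ vector with all elements equal to 1 and $P$ represents the number of missingness patterns, i.e., the cardinality of $R$. When employing the nonidentifiable components of $\boldsymbol{\mu}_{(\mathbf{r})}$, stacked in the vector $\boldsymbol{\mu}^{NI}$, as sensitivity parameters, the covariance matrix of $\hat{\boldsymbol{\mu}}$ is specified as

$$
\operatorname{Cov}(\hat{\boldsymbol{\mu}} \mid \boldsymbol{\mu}^{NI}) = \frac{1}{n}\, \boldsymbol{\gamma}^{*} \boldsymbol{\Sigma}^{I}_{(*)} (\mathbf{1}_P \otimes \mathbf{I}_J) + \frac{1}{n}\, \boldsymbol{\mu}_D' \left[ (\mathbf{D}_{\boldsymbol{\gamma}} - \boldsymbol{\gamma}\boldsymbol{\gamma}') \otimes \mathbf{1}_J \mathbf{1}_J' \right] \boldsymbol{\mu}_D,
$$

where $\boldsymbol{\mu}_D = (\mathbf{D}_{\boldsymbol{\mu}_{(\mathbf{r})}}, \mathbf{r} \in R)'$ and $\boldsymbol{\Sigma}^{I}_{(*)}$ is obtained from $\boldsymbol{\Sigma}_{(*)}$ by replacing the nonidentifiable parameters by 0.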
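As a check on the first two matrix expressions, it may help to write them out in the simplest case, a single variable (J = 1) and two missingness patterns (P = 2). The subscripts (1) and (0) for the observed and missing patterns follow the univariate notation of Sect. 5; writing $\gamma_{(1)}$ and $\gamma_{(0)}$ for the pattern proportions, with $\gamma_{(1)} + \gamma_{(0)} = 1$, is a labeling assumed here for illustration.

```latex
% J = 1, P = 2: gamma* = (gamma_(1), gamma_(0)), 1_P (x) I_J = (1, 1)'
\begin{aligned}
\mu &= \gamma_{(1)} \mu_{(1)} + \gamma_{(0)} \mu_{(0)}, \\
\sigma^2 &= \gamma_{(1)} \sigma_{(1)}^2 + \gamma_{(0)} \sigma_{(0)}^2
          + \gamma_{(1)} (\mu_{(1)} - \mu)^2 + \gamma_{(0)} (\mu_{(0)} - \mu)^2 .
\end{aligned}

% Since mu_(1) - mu = gamma_(0) (mu_(1) - mu_(0)) and
%       mu_(0) - mu = -gamma_(1) (mu_(1) - mu_(0)):
\begin{aligned}
\gamma_{(1)} (\mu_{(1)} - \mu)^2 + \gamma_{(0)} (\mu_{(0)} - \mu)^2
  &= \gamma_{(1)} \gamma_{(0)} (\gamma_{(0)} + \gamma_{(1)}) (\mu_{(1)} - \mu_{(0)})^2 \\
  &= \gamma_{(1)} \gamma_{(0)} (\mu_{(1)} - \mu_{(0)})^2 ,
\end{aligned}

% so that
\sigma^2 = \gamma_{(1)} \sigma_{(1)}^2 + \gamma_{(0)} \sigma_{(0)}^2
         + \gamma_{(1)} \gamma_{(0)} (\mu_{(1)} - \mu_{(0)})^2 .
```

The first term of Σ thus collects the within-pattern variances and the second term the between-pattern spread of the means, which is why a non-observed mean affects the overall standard deviation as well as the overall mean.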

References

Allison PD (2001) Missing data. Sage, Thousand Oaks
Daniels MJ, Hogan JW (2000) Reparameterizing the pattern mixture model for sensitivity analyses under informative dropout. Biometrics 56:1241–1248
Daniels MJ, Hogan JW (2007) Missing data in longitudinal studies: strategies for Bayesian modeling and sensitivity analysis. Chapman & Hall, London
Efron B, Gong G (1983) A leisurely look at the bootstrap, the jackknife and cross-validation. Am Stat 37:36–48
Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G (2008) Longitudinal data analysis. Chapman & Hall, Boca Raton
Glynn RJ, Laird NM, Rubin DB (1986) Selection modeling versus mixture modeling with nonignorable nonresponse (with discussion). In: Wainer H (ed) Drawing inferences from self-selected samples. Erlbaum, Mahwah, pp 115–151
Grab EL, Savage IR (1954) Tables of the expected value of 1/X for positive Bernoulli and Poisson variables. J Am Stat Assoc 49:169–177
Hyndman RJ, Fan Y (1996) Sample quantiles in statistical packages. Am Stat 50:361–365
Kenward MG, Goetghebeur E, Molenberghs G (2001) Sensitivity analysis for incomplete categorical data. Stat Model 1:31–48
Little RJA (1994) A class of pattern-mixture models for normal incomplete data. Biometrika 81:471–483
Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley, New York
Little RJA, Wang YX (1996) Pattern-mixture models for multivariate incomplete data with covariates. Biometrics 52:98–111
Mendenhall W, Lehman JEH (1960) An approximation to the negative moments of the positive binomial useful in life testing. Technometrics 2:227–242
Molenberghs G, Kenward MG (2007) Missing data in clinical studies. Wiley, New York
Molenberghs G, Kenward MG, Goetghebeur E (2001) Sensitivity analysis for incomplete contingency tables: the Slovenian plebiscite case. Appl Stat 50:15–29
Poleto FZ, Paulino CD, Molenberghs G, Singer JM (2010) Inferential implications of over-parameterization: a case study in incomplete categorical data. Tech rep, RT-MAE-2010-04, Instituto de Matemática e Estatística, Universidade de São Paulo
Rotnitzky A, Robins JM, Scharfstein DO (1998) Semiparametric regression for repeated outcomes with nonignorable nonresponse. J Am Stat Assoc 93:1321–1339
Rotnitzky A, Scharfstein D, Su TL, Robins JM (2001) Methods for conducting sensitivity analysis of trials with potentially nonignorable competing causes of censoring. Biometrics 57:103–113
Rubin DB (1976) Inference and missing data. Biometrika 63:581–592
Rubin DB (1977) Formalizing subjective notions about the effect of nonrespondents in sample surveys. J Am Stat Assoc 72:538–543
Scharfstein DO, Rotnitzky A, Robins JM (1999) Adjusting for nonignorable drop-out using semiparametric nonresponse models (with discussion). J Am Stat Assoc 94:1096–1146
Scharfstein DO, Daniels MJ, Robins JM (2003) Incorporating prior beliefs about selection bias into the analysis of randomized trials with missing outcomes. Biostatistics 4:495–512
Sen PK, Singer JM, Pedroso de Lima AC (2009) From finite sample to asymptotic methods in statistics. Cambridge University Press, Cambridge
Silverman BW (1986) Density estimation, 2nd edn. Chapman & Hall, London
Stephan FF (1945) The expected value and variance of the reciprocal and other negative powers of a positive Bernoullian variate. Ann Math Stat 16:50–61
Vansteelandt S, Goetghebeur E, Kenward MG, Molenberghs G (2006) Ignorance and uncertainty regions as inferential tools in a sensitivity analysis. Stat Sin 16:953–979
