A sensitivity result for semidefinite programs

June 1, 2017 | Autor: Roland Freund | Categoria: Applied Mathematics, Business and Management, Sensitivity, Numerical Analysis and Computational Mathematics, Linear Equations

Share Embed

Denunciar este link

Descrição do Produto

Available online at www.sciencedirect.com

Operations Research Letters 32 (2004) 126 – 132

Operations Research Letters www.elsevier.com/locate/dsw

A sensitivity result for semide#nite programs Roland W. Freunda , Florian Jarreb;∗ a Bell

Laboratories, Lucent Technologies Room 2C-525, 700 Mountain Avenue, Murray Hill, NJ 07974-0636, USA b Institut f* ur Mathematik, Universit*at D*usseldorf, Universit*atsstra.e 1, D-40225 D*usseldorf, Germany Received 6 February 2003; received in revised form 5 May 2003; accepted 7 May 2003

Abstract We study the sensitivity of solutions of linear semide#nite programs under small arbitrary perturbations of the data. We present an elementary and self-contained proof of the di/erentiability of the solutions as functions of the perturbations, and we characterize the derivative as the solution of a system of linear equations. c 2003 Elsevier B.V. All rights reserved. Keywords: Linear semide#nite program; Sensitivity; Strict complementarity

1. Introduction We consider the perturbation of unique and strictly complementary optimal solutions of linear semidefinite programs when the data is subject to small arbitrary changes. Sensitivity results of this type have been derived for more general nonlinear programs with cone constraints. For such programs, Robinson [7, Theorem 2.3] established an upper semicontinuity property of the set of stationary points under certain regularity assumptions. In this paper, by exploiting the special structure of the semide#nite cone, we strengthen this result to a di/erentiability property, and we characterize the derivatives as solutions of a system of linear equations. Such a di/erentiability result was obtained earlier by Shapiro [8] in a more general setting and with a di/erent characterization of

∗

Corresponding author. E-mail addresses: [email protected] (R.W. Freund), [email protected] (F. Jarre).

the derivative, namely as the solution of a quadratic program. We also present some examples that illustrate why this di/erentiability result does not hold for more general convex programs in conic form. Our result complements earlier results in [3,10–12] where perturbations of the objective function or of the right-hand side of a semide#nite program have been linked to optimal partitions. Further related results include the paper [4], where a local Lipschitz continuity property of unique and strictly complementary solutions of linear semide#nite programs is derived. The work of Pataki [5] analyzes the sensitivity of the optimal solutions of certain cone programs, which include semide#nite programs as a special case. In [1,8,9], the di/erentiability of certain solutions of nonlinear semide#nite programs is established under the assumption that a transversality condition and a second-order suBcient condition are satis#ed. Furthermore, the derivative is characterized as the solution of a quadratic program. This result also applies to the class of linear semide#nite programs considered in Section 3 below.

c 2003 Elsevier B.V. All rights reserved. 0167-6377/$ - see front matter doi:10.1016/S0167-6377(03)00069-5

R.W. Freund, F. Jarre / Operations Research Letters 32 (2004) 126 – 132

In this paper, we present a brief and self-contained proof of di/erentiability of unique and strictly complementary solutions of linear semide#nite programs, and we derive a nonsingular system of linear equations for the derivatives. This characterization of the derivatives as solutions of a system of linear equations seems to be new. Computing the derivatives via simple linear system solves may be useful in the context of new approaches for solving more complicated nonlinear semide#nite programs by a sequence of linear semide#nite programs that are subject to small changes in the data from one iteration to the next. Such an approach will be described in the forthcoming report [2]. The remainder of this short note is as follows. In Section 2, we brieEy introduce some notation and recall a known result for linear semide#nite programs. In Section 3 we present and prove our perturbation theorem. In Section 4 we discuss two examples. 2. Notation and a known result The following notation has become standard in the literature on linear semide#nite programs. The space of real symmetric n × n-matrices is denoted by Sn . The notation X ¡ 0 (X 0) is used to indicate that X ∈ Sn is positive semide#nite (positive de#nite). The standard scalar product on the space of n × n-matrices is given by n Ci; j Xi; j : C; X := C • X := trace(C T X ) = i; j=1

For given matrices A(i) ∈ Sn , i = 1; 2; : : : ; m, we de#ne a linear map A : Sn → Rm by   (1) A •X     .. A(X ) :=   ; X ∈ Sn : .   A

(m)

•X

The adjoint operator A∗ : Rm → Sn , which satis#es A∗ (y); X = yT A(X ) is given by m yi A(i) ; A∗ (y) = i=1

for all

y ∈ Rm :

X ∈ S n ; y ∈ Rm ;

127

With these de#nitions, the standard pair of primal and dual linear semide#nite programs can now be stated as follows: (P)

minimize

C •X

subject to

A(X ) = b;

maximize

bT y

subject to

A∗ (y) + S = C;

X ¡ 0:

and (D)

S ¡ 0:

Note that the data for (P) and (D) are the linear map A, the vector b ∈ Rm , and the matrix C ∈ Sn . If there exists a matrix X 0 that is feasible for (P), then we call X strictly feasible for (P) and say that (P) satis#es Slater’s condition. Likewise, if there exists a matrix S 0 and a vector y such that y, S is feasible for (D), we call (D) strictly feasible. The following result on the pair of semide#nite programs (P) and (D) is well known; see, e.g., [6,10]. If problem (P) or (D) satis#es Slater’s condition, then the optimal values of (P) and (D) coincide. Furthermore, if in addition both problems (P) and (D) are feasible, then optimal solutions X opt and yopt , S opt of both problems exist and satisfy the complementarity condition X opt S opt = 0: Conversely, if X and y, S are feasible points for (P) and (D), respectively, and if XS = 0, then X is an optimal solution of (P) and y, S is an optimal solution of (D).

3. A perturbation theorem In this section, we present our result on the perturbation of strictly complementary solutions of pairs of linear semide#nite programs of the form (P) and (D). Theorem 1. Let a linear operator A : Sn → Rm , a vector b ∈ Rm , and a matrix C ∈ Sn be the data of a pair (P) and (D) of primal and dual linear semidefinite programs. Assume that the programs (P) and (D) satisfy Slater’s condition, and that XL ∈ Sn and yL ∈ Rm , SL ∈ Sn are unique and strictly

128

R.W. Freund, F. Jarre / Operations Research Letters 32 (2004) 126 – 132

complementary solutions of (P) and (D), that is, L + SL = C; A (y)

A(XL ) = b; XL ¡ 0;

∗

SL ¡ 0;

Remark 4. As indicated in Example 2 below, Theorem 1 contrasts recent sensitivity results in cone programming, see e.g. part (2) of Theorem 3.5.1 in [5].

XL SL = 0;

XL + SL 0:

(1)

If the data of (P) and (D) is changed by suAciently small perturbations MA, Mb, and MC, then the optimal solutions of the perturbed semideBnite programs are diCerentiable functions of the perturbations. Furthermore, the derivatives X˙ := DA; b; C XL [MA; Mb; MC]; y˙ := DA; b; C y[MA; L Mb; MC]; and L Mb; MC]; S˙ := DA; b; C S[MA; of the solution X , y, S at XL , y, L SL satisfy A(X˙ ) = Mb − MA(XL );

Proof. We #rst observe that the uniqueness of XL and yL implies that rank A = m. Indeed, assume that rank A∗ = rank A ¡ m. Then, there exists a vector My ∈ Rm , My = 0, such that A∗ (My) = 0. If bT My = 0, then yL + My is also an optimal solution of (D) in contradiction to the uniqueness of y. L If bT My = 0, then (D) does not have a #nite optimal solution, which is again a contradiction. In particular, it follows that the dimension m of the variables y ∈ Rm of the dual problem (D) cannot be larger than the dimension n(n + 1)=2 of the variables X ∈ Sn of the primal problem (P). Slater’s condition states that ∃y; X 0; S 0 :

˙ + S˙ = MC − MA∗ (y); L A∗ (y) X˙ SL + XL S˙ = 0:

Next, we present the proof of Theorem 1.

(2)

Remark 1. In [1, Chapter 5.3] and [8,9], more general sensitivity results for locally unique solutions of nonlinear, possibly nonconvex, semide#nite programs are derived under a transversality condition, a rank condition, strict complementarity, and a second-order suBcient condition. It is shown that derivatives can be characterized as optimal solutions of certain quadratic programs. These results also apply to the class of linear semide#nite programs considered here. The present paper, however, provides a simpler characterization of the derivatives as solutions of nonsingular systems of linear equations, and a self-contained direct proof. Remark 2. A related sensitivity result for linear semide#nite programs for a more restricted class of perturbations than the ones considered in Theorem 1, but also under weaker assumptions, is given in [11]. Remark 3. We believe that the simple result of Theorem 1 has not been explicitly stated before in the literature. In particular, we are not aware of any earlier results that characterize the derivatives via the solution of a nonsingular system of linear equations, as stated in Corollary 1 below

A(X ) = b; A∗ (y) + S = C:

By continuity and the observation that rank A = m, Slater’s condition is also satis#ed for all suBciently small perturbations of the problem data. Hence, the perturbed problem possesses optimal solutions XL + MX , yL + My, and SL + MS. The optimality conditions of the perturbed problem are given by XL + MX ¡ 0, SL + MS ¡ 0, and (A + MA)(XL + MX ) = b + Mb; (A∗ + MA∗ )(yL + My) + S + MS = C + MC; (XL + MX )(SL + MS) = 0: Subtracting from these equations the #rst three equations of (1) yields (A + MA)(MX ) = Mb − MA(XL ); (A∗ + MA∗ )(My) + MS = MC − MA∗ (y); L (MX )SL + XL (MS) = −(MX )(MS):

(3)

Note that the left-hand side of (3) represents an overdetermined system of linear equations for the unknowns MX , My, and MS. Since the existence of MX , My, and MS follows from Slater’s condition,

R.W. Freund, F. Jarre / Operations Research Letters 32 (2004) 126 – 132

it suBces to verify uniqueness and di/erentiability. Neglecting the second-order terms in (3), we then obtain the result claimed in (2). First, we show that (2) has a unique solution. By (1), XL SL = 0 = SLXL , and thus the matrices XL ¡ 0 and SL ¡ 0 commute. This guarantees that there exists a unitary matrix U and diagonal matrices = diag(1 ; 2 ; : : : ; n ) ¡ 0; = diag(1 ; 2 ; : : : ; n ) ¡ 0;

(4)

such that XL = UU T

and

SL = UU T :

(5)

L Together with (5), By (1), we have XL + SL 0 = XL S. it follows that + 0=. This implies that there exists an integer 0 6 k 6 n such that k of the diagonal entries of are positive, n − k of the diagonal entries of are positive, and the two associated subsets of indices are disjoint. Thus, without loss of generality, we may assume that the diagonal entries in (4) satisfy 1 ; 2 ; : : : ; k ¿ 0

and

k+1 ; k+2 ; : : : ; n ¿ 0: (6)

Next, we de#ne matrices := U T SU; ˙ MS

:= U T X˙ U MX

˜ ˜ : Sn → Rm by A(Z) := and a linear map A A(UZU T ) for all Z ∈ Sn . It then follows that ˜ ∗ (y) = U T A∗ (y)U . With these de#nitions, sysA tem (2) with all three right-hand sides set to zero is equivalent to the following system: ) = 0; ˜ MX A( = 0; ˜ ∗ (y) A ˙ + MS + MS = 0: MX

(7)

For any n × n-matrix Z, let Qup (Z) denote the upper and MS triangular part of Z. By the symmetry of MX and the fact that and are diagonal, it follows that the solution set of (7) does not change if the last set of equations is replaced by + MS) = 0: Qup (MX

(8)

Eq. (8) and the #rst two equations of (7) form a linear system of m + n(n + 1) equations for m + n(n + 1) unknowns, namely the entries of the vector y˙ and of

129

and MS. We now show the symmetric matrices MX that this linear system only has the zero solution. To = 0 and this end, we only need to verify that MX ∗ = 0. Indeed, since A has full column rank m, the MS second equation of (7) then implies that also y˙ = 0. and the structure of , By the symmetry of MX is zero. the trailing (n − k) × (n − k)-block of MX is zero. It Likewise, the leading k × k-block of MS also turns out MX i; j = 0 for all 1 6 i 6 k ¡ j 6 n. Indeed, suppose there were indices 1 6 i 6 k ¡ j 6 n i; j = 0. By (8), we have such that MX i; j + i MS i; j = 0 j MX and together with (6), it would follow that for all such i; j indices 1 6 i 6 k ¡ j 6 n, the matrix entries MS and MX i; j had opposite signs. This, in turn, would imply i; j MS i; j = MX ; MS MX 0¿2 i6k¡j

; A ); y

∗ (y)

MX = − MX ˙ = −A( ˙ = 0; y ˙ = 0; which is a contradiction. Finally, if for some i; j = 0, then 1 6 i; j 6 k there was an entry MX T L L X + U MX U and y, L S would be optimal solutions of the pair of programs (P) and (D) for suBciently i; j only change the small ||, since the entries MX strictly positive eigenvalues of XL for 1 6 i; j 6 k. This contradicts the assumption of the uniqueness of i; j = 0 for k ¡ i; j 6 n. XL . Likewise, it follows that MS We have thus shown that the overdetermined system (2) has a unique solution. In fact, we have shown that the system that is obtained when the last row of (2) is replaced by L + U T XL SU ˙ )=0 Qup (U T X˙ SU also has a unique solution. With U denoting the #xed matrix from (5), we can now apply the implicit function theorem to the system A(X ) − b = 0;

A∗ (y) − C + S = 0;

Qup (U T XSU ) = 0:

(9)

As we have just seen, the linearization of (9) at XL , y, L SL is nonsingular, and hence (9) has a di/erentiable and locally unique solution. Since the solutions of (2) also satisfy (9), the claim of the theorem follows.

130

R.W. Freund, F. Jarre / Operations Research Letters 32 (2004) 126 – 132

In Theorem 1, the derivatives X˙ , y, ˙ and S˙ are characterized via an overdetermined system of linear equations. The preceding proof also shows that these derivatives are given as a solution of a nonsingular system of linear equations. For completeness, this system is stated explicitly in the following corollary. Corollary 1. Let the assumptions of Theorem 1 be satisBed, and let U be an orthogonal matrix that L as in (5). Then, simultaneously diagonalizes XL and S, the derivatives X˙ , y, ˙ and S˙ are given as the unique solution of the following nonsingular system of m + n(n + 1) linear equations for m + n(n + 1) unknowns: A(X˙ ) = Mb − MA(XL ); A∗ (y) ˙ + S˙ = MC − MA∗ (y); L ˙ ) = 0: Qup (U T (X˙ SL + XL S)U

(10)

˙ ) denotes the upper trianHere, Qup (U T (X˙ SL + XL S)U ˙ . gular part of the matrix U T (X˙ SL + XL S)U

4. Discussion The above proof relies on the fact that the optimal solution is assumed to be strictly complementary. The concept of strict complementarity is based on the identi#cation of smooth active constraint manifolds, such as the zero eigenvalues for linear semide#nite programs. For general closed convex cones such constraint manifolds are not as easily de#ned as for the positive orthant or the semide#nite cone. Next, we present an example that illustrates this diBculty. Example 1. We consider the set C ⊂ R2 de#ned by the intersection of the half-planes containing the origin and that are tangent to the unit circle at the coordinates 1 x2 = ± n and x1 = ± 1 − x22 for n ∈ N: 2 (11) Note that C is a closed and bounded convex set.√It is constructed a square with the vertices [0; ± 2]T √ from T and [ ± 2; 0] . The vertices on the x1 -axis are cut o/ by an in#nite number of ‘facet-de#ning’ cuts. None of these facets contains the point xL := [1; 0]T of C. Nevertheless, xL is the unique optimal solution of the

convex program minimize[ − 1; 0]

x1

x2

subject to

x1 x2

∈ C: (12)

By adding the redundant constraint x1 [1; 0] 6 1 with multiplier y0 := 1 x2 and de#ning the multipliers yi = 0 for all constraints in (11), the point xL can be complemented to a ‘formally strictly complementary’ solution. The reason is that none of the constraints in (11) is active at the point x, L that is, satis#ed with equality. However, the di/erentiability result of Theorem 1 does not hold in this situation. There are arbitrarily small perturbations Mc = [0; ]T of the objective vector c := [ − 1; 0]T for which the set of optimal solutions contains more than one point. Furthermore, for those arbitrarily small perturbations that do have unique solutions xL + Mx, the solutions jump in a nondi/erentiable fashion from one vertex of C to the next. Indeed, each time the solution jumps, the norm Mx2 changes by a factor of about 3 2 , and hence Mx2 is not a di/erentiable function of Mc. Similarly, the closed convex cone K := {x ∈ R3 | x3 ¿ 0; (x1 ; x2 )=x3 ∈ C} ∪ {0} does not possess a straightforward generalization of active constraint manifolds. Problem (12) can be rewritten as minimize x1

subject to

x3 = 1;

x ∈ K;

which is of the form (P) with the positive semide#niteness constraint X ¡ 0 replaced by the cone constraint x ∈ K. This example shows that the di/erentiability properties stated in Theorem 1 cannot be generalized in a straightforward fashion to more general convex cone programs. It is well known that unique—and therefore strictly complementary—solutions of a primal-dual pair of linear programs do not change under suBciently small perturbations of the problem data. For linear semidefinite programs, however, the situation is more complicated. In the appendix of [11], it is shown that the optimal partition of a strictly complementary solution of a primal-dual pair of linear semide#nite programs (P) and (D) can indeed change under arbitrarily small

R.W. Freund, F. Jarre / Operations Research Letters 32 (2004) 126 – 132

perturbations of the problem data C or b, even if Slater’s condition is satis#ed for (P) and (D). However, if, in addition, the optimal solution is unique, Theorem 1 of the present paper is applicable. This situation is illustrated in the next example. Example 2. In a small neighborhood of the point xL = [1; 0]T , the set C of Example 1 forms a very close approximation to the unit circle. To contrast the behavior of Example 1 with our sensitivity result, we replace the set C by the unit circle itself. We then obtain the semide#nite program of form (P),   0 0 −1   0 minimize   0 0 •X −1 0 0   1 0 x1   1 x2  subject to X =  0  ; X ¡ 0: x1 x2 1 The feasible solutions are all x1 , x2 in the unit ball of R2 , and the unique optimal solution XL is given by xL1 = 1, xL2 = 0, just as in (12). Moreover, the objective function coincides with the one from (12) up to a factor of 2, which is introduced merely for notational convenience. The dual problem is as follows: maximize y1 + y2 + y3    0 y1 y4 0      subject to   y4 y2 0  4  0 −1 0 0 y3

0

−1



0

 0 :

0

0

Its unique optimal solution is given by yL 1 = yL 3 = −1 and yL 2 = yL 4 =0. The corresponding optimal dual slack is   1 0 −1   0 SL =   0 0  −1 0 1 and is strictly complementary, that is,   2 0 0    XL + SL =   0 1 0  0: 0 0 2

Thus, Theorem  0  MC := −  0 0

1 is applicable. For the perturbation  0 0  0  ; 0

we set 

0  X˙ :=  0 0 and



0

 S˙ :=  

0

0

0





0 −

0

0



   0   y˙ :=    0   −

  ; 0

0

131



 −  : 0

Then, X˙ , y, ˙ and S˙ satisfy all conditions in (2). In particular, in contrast to the case of linear programming, the optimal solution X will change for any = =0. Acknowledgements The authors would like to thank Shuzhong Zhang for a very interesting discussion about the nature of the optimal partition. They are also indebted to an anonymous referee for pointing out further references related to this paper. References [1] F. Bonnans, A. Shapiro, Perturbation Analysis of Optimization Problems, Springer, New York, 2000. [2] R.W. Freund, F. Jarre, Numerical computation of nearby positive real systems in the descriptor case, Numerical Analysis Manuscript, Bell Laboratories, Murray Hill, NJ, 2003, in preparation. [3] D. Goldfarb, K. Scheinberg, On parametric semide#nite programming, Appl. Numer. Math. 29 (1999) 361–377. [4] M.V. Nayakkankuppam, M.L. Overton, Conditioning of semide#nite programs, Math. Programming 85 (1999) 525–540. [5] G. Pataki, The geometry of semide#nite programming, in: H. Wolkowicz, R. Saigal, L. Vandenberghe (Eds.), Handbook of Semide#nite Programming: Theory, Algorithms and Applications, Kluwer Academic Publishers, Boston, 2000, pp. 29–66. [6] S.M. Robinson, First-order conditions for general nonlinear optimization, SIAM J. Appl. Math. 30 (1976) 597–607.

132

R.W. Freund, F. Jarre / Operations Research Letters 32 (2004) 126 – 132

[7] S.M. Robinson, Generalized equations and their solutions, Part II: applications to nonlinear programming, Math. Programming Study 19 (1982) 200–221. [8] A. Shapiro, First- and second-order analysis of nonlinear semide#nite programs, Math. Programming 77 (1997) 301–320. [9] A. Shapiro, M.K.H. Fan, On eigenvalue optimization, SIAM J. Optim. 5 (1995) 552–569. [10] A. Shapiro, K. Scheinberg, Duality and optimality conditions, in: H. Wolkowicz, R. Saigal, L. Vandenberghe (Eds.),

Handbook of Semide#nite Programming: Theory, Algorithms and Applications, Kluwer Academic Publishers, Boston, 2000, pp. 67–110. [11] J.F. Sturm, S. Zhang, On sensitivity of central solutions in semide#nite programming, Math. Programming 90 (2001) 205–227. [12] E.A. Yildirim, An interior-point perspective on sensitivity analysis in linear programming and semide#nite programming, Ph.D. Dissertation, Cornell University, 2001.

Lihat lebih banyak...

A sensitivity result for semidefinite programs

Descrição do Produto

Comentários