Neural network for solving convex quadratic bilevel programming problems


Neural Networks 51 (2014) 17–25


Xing He a, Chuandong Li a,*, Tingwen Huang b, Chaojie Li c

a School of Electronics and Information Engineering, Southwest University, Chongqing 400715, PR China
b Texas A&M University at Qatar, Doha 5825, Qatar
c School of Science, Information Technology and Engineering, University of Ballarat, Mt Helen, VIC 3350, Australia

Article history: Received 2 August 2013; Received in revised form 14 October 2013; Accepted 18 November 2013.

Keywords: Neural network; Convex quadratic bilevel programming problems; Nonautonomous differential inclusions; Nonsmooth analysis.

Abstract: In this paper, using the idea of successive approximation, we propose a neural network to solve convex quadratic bilevel programming problems (CQBPPs), which is modeled by a nonautonomous differential inclusion. Unlike existing neural networks for CQBPPs, the model has the smallest number of state variables and a simple structure. Based on the theory of nonsmooth analysis, differential inclusions and the Lyapunov-like method, the sequence of limit equilibrium points of the proposed neural networks is shown to converge approximately to an optimal solution of the CQBPP under certain conditions. Finally, simulation results on two numerical examples and the portfolio selection problem show the effectiveness and performance of the proposed neural network.

1. Introduction

Bilevel programming problems (BPPs) are hierarchical optimization problems in which the constraint region is implicitly determined by another optimization problem. The BPP can be formulated as follows:

$$
\begin{aligned}
\min_{x}\ & F(x,y)\\
\text{s.t.}\ & H(x,y)\le 0,\\
& \min_{y}\ f(x,y)\\
& \quad\ \text{s.t.}\ h(x,y)\le 0,\\
& x\in X,\ y\in Y,
\end{aligned}
\tag{1}
$$
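To make the hierarchical structure of (1) concrete, the following minimal Python sketch solves a toy one-dimensional instance by brute force: for each candidate upper-level decision x, the lower-level problem is solved first, and its solution is then used to evaluate the upper-level objective. The specific objectives and the grid are illustrative choices, not taken from the paper.

```python
import numpy as np

# Toy instance of (1): the follower solves min_y (y - x)^2 s.t. y >= 0,
# whose solution is y*(x) = max(x, 0); the leader then minimizes
# F(x, y*(x)) = (x - 1)^2 + y*(x) over x in [-2, 2].
def lower_level(x):
    # closed-form solution of the follower's problem for a fixed x
    return max(x, 0.0)

def upper_level_objective(x):
    y = lower_level(x)
    return (x - 1.0) ** 2 + y

xs = np.linspace(-2.0, 2.0, 2001)          # brute-force grid over the leader's decision
values = [upper_level_objective(x) for x in xs]
x_best = xs[int(np.argmin(values))]
print(x_best, lower_level(x_best), min(values))
```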

where $X\subset R^n$, $Y\subset R^m$, and $H$, $h$ are vector-valued functions of dimensions $p$ and $q$, respectively; $F$ and $f$ are real-valued functions. Numerous applications in science and engineering, such as network design, transport system planning, management and economic policy, can be formulated as BPPs. Fernandez-Blanco, Arroyo, and Alguacil (2012) constructed a general bilevel programming framework for alternative market-clearing procedures dependent on market-clearing prices. Using bilevel programming and a swarm




intelligence technique, Zhang, Zhang, Gao, and Lu (2011) presented a competitive strategic bidding optimization problem in electricity markets. Yang, Zhang, He, and Yang (2009) constructed a bilevel programming model for the flow interception problem with customer choice. Many researchers have studied this field in depth, including the theory, algorithms and applications of bilevel programming (Amouzegar, 1999; Dempe, 2002; Etoa, 2010; Luo, Pang, & Ralph, 1996; Teng & Li, 2002; Vicente, Savard, & Júdice, 1994; Wang, Jiao, & Li, 2005). In the past years, a variety of numerical algorithms have been developed for BPPs. However, in many engineering applications, real-time solutions are often needed. For such real-time applications, neural networks based on circuit implementation (Hopfield & Tank, 1985) are more competent. Over the years, neural networks for optimization and their engineering applications have been widely investigated. Tank and Hopfield applied the Hopfield network to solving linear programming problems (Hopfield & Tank, 1985; Tank & Hopfield, 1986), which motivated the development of neural networks for solving linear programming (Liu, Cao, & Chen, 2010; Wang, 1993; Xia, 1996; Xia & Wang, 1995), variational inequalities (Cheng, Hou, & Tan, 2008; Gao, Liao, & Qi, 2005; Hu & Wang, 2006, 2007), nonlinear programming (Bian & Chen, 2012; Forti, Nistri, & Quincampoix, 2004, 2006; Hosseini, Wang, & Mohamma, 2013; Liu, Dang, & Huang, 2013; Liu, Guo, & Wang, 2012; Liu & Wang, 2011, 2013; Xia & Wang, 2004) and so on. These neural networks are essentially governed by a set of dynamic systems characterized by an energy function that combines the objective function and the constraints of the original optimization problem; three techniques, namely penalty functions, Lagrange functions, and primal-dual functions, are commonly used to construct neural networks for solving the optimization problem.

Recently, neural networks for solving BPPs have received attention in the literature (Hu, Gao, Fu, & Lv, 2010; Lan, Wen, Shih, & Lee, 2007; Lv, Chen, & Wan, 2010; Lv, Hu, Wang, & Wan, 2008; Sheng, Lv, & Xu, 1996; Shih, Wen, Lee, Lan, & Hsiao, 2004). Based on the Frank-Wolfe method, Sheng et al. (1996) first proposed a neural network to solve a class of BPPs appearing in the algorithm. Shih et al. (2004) utilized the dynamic behavior of neural networks to solve multiobjective programming and multilevel programming problems. Lv and his colleagues (Hu et al., 2010; Lan et al., 2007; Lv et al., 2010, 2008) presented neural networks for solving the bilevel linear programming problem (BLPP), the convex quadratic BPP and the nonlinear BPP. These neural networks for solving BPPs are based on Lagrange functions. However, due to the use of Lagrange multipliers, the number of state variables is roughly doubled, which enlarges the scale of the network. Recently, some neural networks (Liu et al., 2010, 2013, 2012; Liu & Wang, 2011) for nonlinear optimization have been constructed based on penalty functions, which can be used to reduce the scale of the network. Therefore, there is an urgent and significant need to reduce the scale of neural networks for solving BPPs. In this paper, following the Karush-Kuhn-Tucker optimality conditions (Facchinei, Jiang, & Qi, 1999), we first transform the convex quadratic bilevel programming problem (CQBPP) into a single-level problem. Then an approximately equivalent nonlinear optimization problem is obtained by smoothing the single-level problem. In order to solve this approximate problem effectively, a nonautonomous neural network and neural sub-networks are constructed based on the method of penalty functions and the theory of differential inclusions. Compared with existing neural networks for the CQBPP (Lv et al., 2010), the advantage of our neural networks lies in their simple structure and small number of state variables, and their dynamical behavior and optimization capabilities are analyzed in the framework of nonsmooth analysis (Clarke, 1983) and the theory of differential inclusions (Aubin & Cellina, 1984). It is shown that the sequence of limit equilibrium points of the proposed neural networks can approximately converge to an optimal solution of the CQBPP under certain conditions. Simulation results on numerical examples and the portfolio selection problem show the effectiveness and performance of the neural network for solving the CQBPP.

The remainder of this paper is organized as follows. In the next section, the preliminaries relevant to the CQBPP are introduced. In Section 3, the nonautonomous neural network is derived. The convergence of the proposed neural network is proved in Section 4. In Section 5, neural sub-networks for solving the CQBPP are constructed. Simulation results on two numerical examples and the portfolio selection problem are given in Section 6 to demonstrate the effectiveness and performance of the neural network. Finally, Section 7 concludes this paper.

Notation. Given column vectors $x=(x_1,x_2,\ldots,x_n)^T$ and $y=(y_1,y_2,\ldots,y_n)^T$, $\langle x,y\rangle=x^Ty=\sum_{i=1}^{n}x_iy_i$ is the scalar product of $x$ and $y$, $\|x\|_1=\sum_{i=1}^{n}|x_i|$, $\|x\|_2=\sqrt{\sum_{i=1}^{n}x_i^2}$, $x^2=(x_1^2,x_2^2,\ldots,x_n^2)^T$, $\sqrt{x}=(\sqrt{x_1},\sqrt{x_2},\ldots,\sqrt{x_n})^T$, $1_n=[1,1,\ldots,1]^T$, $R_+=[0,+\infty)$, and $\dot\varepsilon(t)_{2m+n+q}=[\dot\varepsilon(t),\dot\varepsilon(t),\ldots,\dot\varepsilon(t)]^T$ (a vector with $2m+n+q$ identical components).

2. Preliminaries

In this section, some models, assumptions and lemmas about the CQBPP are introduced, which are needed in the following development. If $F$ and $f$ are quadratic functions, and $H$ and $h$ are linear constraints, problem (1) gives rise to the following CQBPP:

$$
\begin{aligned}
(\mathrm{UP})\quad \min_{x\ge 0}\ & F(x,y)=\frac12\,(x^T,y^T)\begin{pmatrix}C_1 & C_3\\ C_3^T & C_2\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}+c_1^Tx+d_1^Ty\\
\text{s.t.}\ & A_1x+B_1y\le b_1\\
(\mathrm{LP})\quad \min_{y\ge 0}\ & f(x,y)=\frac12\,y^TQy+y^TDx+d_2^Ty\\
\text{s.t.}\ & A_2x+B_2y\le b_2
\end{aligned}
\tag{2}
$$

where $c_1\in R^n$, $d_1,d_2\in R^m$, $C_1\in R^{n\times n}$, $Q,C_2\in R^{m\times m}$, $D,C_3^T\in R^{m\times n}$, $A_1\in R^{p\times n}$, $B_1\in R^{p\times m}$, $A_2\in R^{q\times n}$, $B_2\in R^{q\times m}$, $b_1\in R^p$, $b_2\in R^q$. The term (UP) is called the upper level problem and (LP) is called the lower level problem. At the upper level, the decision maker first has to choose a vector $x\in X$ to minimize his objective function $F$; then, under this decision, the lower level decision maker has to select the decision vector $y\in Y$ that minimizes his own objective $f$. Throughout the rest of the paper, we make the following assumptions.

Assumption 1. The constraint region of the above bilevel programming problem, $S=\{(x,y): x\ge 0,\ y\ge 0,\ A_1x+B_1y\le b_1,\ A_2x+B_2y\le b_2\}$, is nonempty and compact.

Assumption 2. $C=\begin{pmatrix}C_1 & C_3\\ C_3^T & C_2\end{pmatrix}$ and $Q$ are positive semi-definite matrices.

From Facchinei et al. (1999), we can reduce the bilevel programming problem to a one-level programming problem by replacing the lower-level problem with its Karush-Kuhn-Tucker (KKT) optimality conditions. The KKT reformulation of the CQBPP is as follows:

$$
\begin{aligned}
\min\ & F(x,y)\\
\text{s.t.}\ & A_1x+B_1y\le b_1\\
& A_2x+B_2y\le b_2\\
& Qy+Dx+d_2+B_2^Tu-\upsilon=0\\
& u^T(b_2-A_2x-B_2y)=0\\
& \upsilon^Ty=0\\
& x,y,u,\upsilon\ge 0
\end{aligned}
\tag{3}
$$

where $u\in R^q$, $\upsilon\in R^m$. Problem (3) is non-convex and nondifferentiable; moreover, the regularity assumptions which are needed for successfully handling smooth optimization problems are never satisfied, so the neural network approach cannot be applied to the problem directly. Dempe (2002) presented a smoothing method for solving BPPs. Following this smoothing method, we propose a neural network approach to solve problem (3) in the next section. Let $\varepsilon\in R_+$ be a parameter. Define the function $\Phi_\varepsilon: R^2\to R$ by

$$
\Phi_\varepsilon(a,b)=\sqrt{a^2+b^2+2\varepsilon}-a-b.
$$

The important property of this function can be stated in the following result.

Lemma 1. For every $\varepsilon>0$, we have $\Phi_\varepsilon(a,b)=0 \Leftrightarrow a>0,\ b>0,\ ab=\varepsilon$.
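As a quick sanity check of Lemma 1, the smoothed complementarity function $\Phi_\varepsilon$ is easy to evaluate numerically. The following short Python sketch (an illustration, not part of the original paper) verifies that pairs with $a>0$, $b>0$ and $ab=\varepsilon$ make $\Phi_\varepsilon$ vanish, while violating either condition gives a nonzero value.

```python
import numpy as np

def phi_eps(a, b, eps):
    # smoothed complementarity function from Section 2:
    # phi_eps(a, b) = sqrt(a^2 + b^2 + 2*eps) - a - b
    return np.sqrt(a**2 + b**2 + 2.0 * eps) - a - b

eps = 1e-2
a = 0.5
b = eps / a                             # a > 0, b > 0 and a*b = eps
print(phi_eps(a, b, eps))               # ~0 (up to rounding), consistent with Lemma 1
print(phi_eps(0.5, 0.3, eps))           # a*b != eps, so the value is nonzero
print(phi_eps(-0.5, eps / -0.5, eps))   # a < 0, also nonzero
```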


Lemma 2. For $x\in R^n$, we have
$$
x\ge 0 \Leftrightarrow \frac12\,x^T(x-|x|)=0.
$$
Moreover, $\frac12 x^T(x-|x|)$ is continuous and convex.

Making use of $\Phi_\varepsilon(a,b)$ and Lemmas 1-2, problem (3) can be approximated by

$$
\begin{aligned}
\min\ & F(x,y)\\
\text{s.t.}\ & \frac12\,(b_1-A_1x-B_1y)^T\big(b_1-A_1x-B_1y-|b_1-A_1x-B_1y|\big)=0\\
& Qy+Dx+d_2+B_2^Tu-\upsilon=0\\
& \sqrt{u_i^2+(b_2-A_2x-B_2y)_i^2+2\varepsilon}-u_i-(b_2-A_2x-B_2y)_i=0,\quad i=1,\ldots,q\\
& \sqrt{\upsilon_j^2+y_j^2+2\varepsilon}-\upsilon_j-y_j=0,\quad j=1,\ldots,m\\
& \frac12\,x^T(x-|x|)=0.
\end{aligned}
\tag{4}
$$

To develop the recurrent neural network for solving problem (4) and to simplify our discussion, we introduce the following notations: $F(x,y,u,\upsilon)=F(x,y)$ and $G(x,y,u,\upsilon)=(G_1^T,G_2^T,G_3^T,G_4^T,G_5^T)^T$, where $G_1=\frac12(b_1-A_1x-B_1y)^T(b_1-A_1x-B_1y-|b_1-A_1x-B_1y|)$, $G_2=Qy+Dx+d_2+B_2^Tu-\upsilon$, $G_3=\sqrt{u^2+(b_2-A_2x-B_2y)^2+2\varepsilon 1_q}-u-(b_2-A_2x-B_2y)$, $G_4=\sqrt{\upsilon^2+y^2+2\varepsilon 1_m}-\upsilon-y$, and $G_5=\frac12 x^T(x-|x|)$. Let $z=(x^T,y^T,u^T,\upsilon^T)^T$; then problem (4) can be rewritten as follows:

$$
\begin{aligned}
\min\ & F(z)=\frac12\,(x^T,y^T)\begin{pmatrix}C_1 & C_3\\ C_3^T & C_2\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}+c_1^Tx+d_1^Ty\\
\text{s.t.}\ & G(z)=0.
\end{aligned}
\tag{5}
$$
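To make the vector mapping $G(z)$ in (5) concrete, the sketch below assembles $G_1,\ldots,G_5$ for given problem data. All array shapes and the data themselves are placeholders chosen only for illustration; the mapping follows the definitions above.

```python
import numpy as np

def G(z, data, eps):
    """Constraint map G(z) = (G1, ..., G5) of problem (5); z = (x, y, u, v)."""
    A1, B1, b1, A2, B2, b2, Q, D, d2 = data
    n, m, q = A1.shape[1], B1.shape[1], A2.shape[0]
    x, y, u, v = np.split(z, [n, n + m, n + m + q])
    r1 = b1 - A1 @ x - B1 @ y                      # slack of the upper-level constraints
    r2 = b2 - A2 @ x - B2 @ y                      # slack of the lower-level constraints
    G1 = 0.5 * r1 @ (r1 - np.abs(r1))              # scalar: r1 >= 0 iff G1 = 0
    G2 = Q @ y + D @ x + d2 + B2.T @ u - v         # stationarity of the lower level
    G3 = np.sqrt(u**2 + r2**2 + 2*eps) - u - r2    # smoothed complementarity between u and r2
    G4 = np.sqrt(v**2 + y**2 + 2*eps) - v - y      # smoothed complementarity between v and y
    G5 = 0.5 * x @ (x - np.abs(x))                 # scalar: x >= 0 iff G5 = 0
    return np.concatenate(([G1], G2, G3, G4, [G5]))
```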

Definition 1. Let $z$ be a feasible point of problem (5). We say that $z$ is a regular point of problem (5) if the gradients $\nabla_1G(z),\ldots,\nabla_{2m+n+q}G(z)$ are linearly independent.

Similar to the main results of Dempe (2002), we can obtain the following result, which gives the relationship between the solutions of problem (2) and those of problem (5).

Lemma 3. Let $\{z_\varepsilon^*\}$ be a sequence of solutions of problem (5). Suppose that the sequence $\{z_\varepsilon^*\}$ converges to some $\bar z$ for $\varepsilon\to 0^+$. If $\bar z$ is a regular point, then $\bar z$ is a Bouligand stationary solution of problem (2).

3. Proposed neural network

In this section, based on an unconstrained optimization problem associated with (5), we construct a neural network modeled by a nonautonomous differential inclusion to solve problem (5). Some definitions and properties concerning set-valued maps and nonsmooth analysis are given in the Appendix. Take $N$ as a lower bound of the optimal value of problem (5), i.e., $N\le F(z^*)$, where $z^*$ is an optimal solution of problem (5). Based on Section 2, we construct an energy function as follows:
$$
E(z,N)=(F(z)-N)^+ + \|G(z)\|_1
\tag{6}
$$
where $(z)^+=\big((z_1)^+,\ldots,(z_{2m+n+q})^+\big)^T$ and $(z_i)^+=\max\{z_i,0\}$. Thus $E(z,N)\ge 0$, and $E(z,N)=0$ if and only if $z=z^*$ and $N=F(z^*)$. It is obvious that $E(z,N)$ is locally Lipschitz and differentiable almost everywhere (a.e.) in the sense of Lebesgue measure. Therefore, $E(z,N)$ is a regular function.

Consider the following unconstrained optimization problem:
$$
\min_{z\in R^{2m+n+q}}\ E(z,N)=(F(z)-N)^+ + \|G(z)\|_1.
\tag{7}
$$

Theorem 1. If $\hat z$ is a solution of problem (7), then $F(\hat z)\le F(z^*)$.

Proof. Since $\hat z$ is a solution of problem (7), we have $E(z^*)\ge E(\hat z)$, and we obtain
$$
(F(z^*)-N)^+ + \|G(z^*)\|_1 \ge (F(\hat z)-N)^+ + \|G(\hat z)\|_1,\qquad
(F(z^*)-N)^+ \ge (F(\hat z)-N)^+ + \|G(\hat z)\|_1 \ge (F(\hat z)-N)^+.
$$
If $F(\hat z)\ge N$, then $F(z^*)-N\ge F(\hat z)-N$, hence $F(\hat z)\le F(z^*)$. If $F(\hat z)\le N$, then $F(\hat z)\le N\le F(z^*)$, so $F(\hat z)\le F(z^*)$ still holds. $\square$

Theorem 2. If $N=F(z^*)$, then a solution of problem (7) is a solution of problem (5) and vice versa.

Proof. On the one hand, for any $z$,
$$
E(z^*)=(F(z^*)-N)^+ + \|G(z^*)\|_1=0\le (F(z)-N)^+ + \|G(z)\|_1=E(z).
$$
Hence $z^*$ is a solution of problem (7). On the other hand, assume $\hat z$ is a solution of problem (7), so that $E(\hat z)\le E(z)$ for any $z$; letting $z=z^*$ gives $E(\hat z)\le E(z^*)=0$, hence $E(\hat z)=0$. Therefore $(F(\hat z)-N)^+=0$ and $\|G(\hat z)\|_1=0$, and so $F(\hat z)\le N=F(z^*)\le F(z)$ for any $z$ satisfying $G(z)=0$. This shows that $\hat z$ is a solution of problem (5). The proof is completed. $\square$

Remark 1. Theorem 1 indicates that if we solve problem (7) and find that its solution satisfies the constraints well enough, then it is an adequate solution of problem (5). Theorem 2 shows that it is very important to choose a suitable value of $N$ when solving problem (7).
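Continuing the illustrative sketch from Section 2, the energy function in (6)-(7) can be evaluated directly from $F$ and $G$. Again, this is only a schematic Python fragment under the same assumed data layout, not code from the paper.

```python
import numpy as np

def E(z, N, data, eps):
    """Energy function E(z, N) = (F(z) - N)^+ + ||G(z)||_1 from (6)-(7)."""
    A1, B1, b1, A2, B2, b2, Q, D, d2, C, c1, d1 = data
    n, m = A1.shape[1], B1.shape[1]
    x, y = z[:n], z[n:n + m]
    xy = np.concatenate([x, y])
    F = 0.5 * xy @ C @ xy + c1 @ x + d1 @ y              # upper-level objective F(z)
    g = G(z, (A1, B1, b1, A2, B2, b2, Q, D, d2), eps)    # constraint map from the previous sketch
    return max(F - N, 0.0) + np.abs(g).sum()
```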

In the following, we construct a recurrent neural network for solving problem (7). Let $\varepsilon(t): R_+\to R_+$ be a decreasing function with $\lim_{t\to+\infty}\varepsilon(t)=0$. For the variable $z$, by simple calculation, the generalized gradient of $E(z,N,\varepsilon(t))$ is obtained as

$$
\partial E(z,N)=
\begin{pmatrix}
\left(\dfrac{\partial F(z)}{\partial x}\right)^{T}K\,h(F(z)-N)+\displaystyle\sum_{i=1}^{3}\left(\dfrac{\partial G_i(z)}{\partial x}\right)^{T}K\,g(G_i)+\left(\dfrac{\partial G_5(z)}{\partial x}\right)^{T}K\,g(G_5)\\[2mm]
\left(\dfrac{\partial F(z)}{\partial y}\right)^{T}K\,h(F(z)-N)+\displaystyle\sum_{i=1}^{4}\left(\dfrac{\partial G_i(z)}{\partial y}\right)^{T}K\,g(G_i)\\[2mm]
\displaystyle\sum_{i=2}^{3}\left(\dfrac{\partial G_i(z)}{\partial u}\right)^{T}K\,g(G_i)\\[2mm]
\left(\dfrac{\partial G_2(z)}{\partial \upsilon}\right)^{T}K\,g(G_2)+\left(\dfrac{\partial G_4(z)}{\partial \upsilon}\right)^{T}K\,g(G_4)
\end{pmatrix}
\tag{8}
$$

where $\partial E(z,N)$ is defined in the Appendix, $g(s)=(g_1(s),g_2(s),\ldots,g_q(s))^T$ with appropriate dimension is defined as
$$
g_i(u)=\begin{cases}1, & u\ge 0\\ -1, & u<0,\end{cases}
$$
and $h(s)=(h_1(s),h_2(s),\ldots,h_p(s))^T$ with appropriate dimension is defined as
$$
h_i(u)=\begin{cases}1, & u\ge 0\\ 0, & u<0.\end{cases}
$$

Based on (8), the proposed neural network for solving problem (7) is modeled by the following nonautonomous differential inclusion:
$$
\frac{dz(t)}{dt}\in-\partial E(z,N,\varepsilon(t)),\qquad z(0)=z^0.
\tag{9}
$$
In this paper, we choose $\varepsilon(t)=\dfrac{\varepsilon_0\alpha}{t+\alpha}$ with $\varepsilon_0>0$ and $\alpha>0$, which is the solution of the following dynamical system:
$$
\dot\varepsilon(t)=-\frac{\varepsilon(t)^2}{\varepsilon_0\alpha},\qquad\varepsilon(0)=\varepsilon_0.
$$
Thus, for the implementation of the proposed neural network, we can take the state variables of (9) as $(z,\varepsilon)$. The proposed neural network is different from the Lagrange neural network, and it has a simpler structure than the neural networks of Lv et al. (2010, 2008), with only $2m+n+q+1$ state variables.
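As an implementation sketch, the augmented state $(z,\varepsilon)$ can be integrated with a simple explicit Euler scheme, using one element of the generalized gradient at each step. The gradient selection below is a numerical finite-difference surrogate for $\partial E/\partial z$, chosen purely for illustration; the paper itself works with the exact generalized gradient (8).

```python
import numpy as np

def simulate(E_fn, z0, N, eps0=1.0, alpha=3.0, dt=1e-3, steps=20000):
    """Explicit Euler sketch of inclusion (9) with eps(t) = eps0*alpha/(t+alpha)."""
    z, eps = z0.astype(float).copy(), eps0
    for _ in range(steps):
        # finite-difference element standing in for a selection of dE/dz
        grad = np.zeros_like(z)
        h = 1e-6
        for i in range(z.size):
            e = np.zeros_like(z); e[i] = h
            grad[i] = (E_fn(z + e, N, eps) - E_fn(z - e, N, eps)) / (2 * h)
        z = z - dt * grad                          # dz/dt in -dE/dz
        eps = eps - dt * eps**2 / (eps0 * alpha)   # d(eps)/dt = -eps^2/(eps0*alpha)
    return z, eps
```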



4. Theoretical analysis

In this section, the convergence and optimality of the proposed neural network are proven. According to the theory of differential inclusions (Aubin & Cellina, 1984), we have the following.

Definition 2. The solution of system (9) on $[0,T]$, $T>0$, with initial condition $z(0)=z^0\in R^{2m+n+q}$, is an absolutely continuous function $z(t)$ on $[0,T]$ such that $z(t)\in R^{2m+n+q}$ for $t\in[0,T]$ and, for almost every $t\in[0,T]$, $\dot z(t)\in-\partial E(z,N,\varepsilon(t))$.

Assumption 3. $\lim_{t\to+\infty}\dfrac{\partial E(z,N,\varepsilon(t))}{\partial z}=S(z,N)$.

Definition 3. $z^*=\big((x^*)^T,(y^*)^T,(u^*)^T,(\upsilon^*)^T\big)^T$ is said to be a limit equilibrium point of neural network (9) if $0\in S(z^*,N)$.

In the following, we will obtain the relationship between the equilibrium points of system (9) and the solutions of problem (5).

Theorem 3. For some $N$, suppose $z(t)$ is the solution of system (9) and $\lim_{t\to+\infty}z(t)=z^*$. Then $F(z^*)$ is a local minimum of problem (2) if the following conditions are satisfied: (i) $z^*$ is a regular point of problem (5); (ii) for any $d\in D=\{d\ne 0,\ [\nabla_1G(z^*)]^Td=0,\ldots,[\nabla_{2m+n+q}G(z^*)]^Td=0\}$, $d^T\nabla^2E(z^*,N)\,d>0$.

Proof. From the above analysis, $z^*$ is a limit equilibrium point of neural network (9). Since the limit equilibrium points of the neural network are the KKT points of problem (2), and condition (ii) is the second-order sufficient optimality condition for problem (5), the proof is completed by combining condition (i) with Lemma 3. $\square$

Theorem 4. For any $z^0\in R^{2m+n+q}$, there exists a unique solution $z(t)$ with initial condition $z(0)=z^0$, which is defined for $t\in[0,+\infty)$.

Proof. Since $E(z,N)$ is a regular function, according to Lemma 4 in the Appendix, $\frac{\partial E(z,N)}{\partial z}: R^{2m+n+q}\to R^{2m+n+q}$ is an upper semicontinuous set-valued map with nonempty compact convex values. As a consequence, for any $z^0\in R^{2m+n+q}$ there exists at least one local solution $z(t)$ of system (9) with initial condition $z(0)=z^0$; that is, there exists $T>0$ such that system (9) is satisfied for a.e. $t\in[0,T]$. Furthermore, $z(t)$ can be prolonged to $[0,+\infty)$. Let $z_1(t)$ and $z_2(t)$ be two solutions of system (9) with $z_1(0)=z_2(0)=z^0$ and $t\in[0,+\infty)$. Then $z_1(t)$ and $z_2(t)$ are absolutely continuous on $[0,+\infty)$. By the definition of $z_1(t)$ and $z_2(t)$, we have
$$
\frac{d}{dt}\Big(\frac12\|z_1(t)-z_2(t)\|_2^2\Big)=\langle z_1(t)-z_2(t),\dot z_1(t)-\dot z_2(t)\rangle=\langle z_1(t)-z_2(t),-\xi_1+\xi_2\rangle,
$$
where $\xi_1\in\partial E(z_1,N,\varepsilon(t))$ and $\xi_2\in\partial E(z_2,N,\varepsilon(t))$. On the right-hand side of neural network (9), for the variable $z$, $\frac{\partial F(z)}{\partial z}$ and $\frac{\partial G_i(z)}{\partial z}$ are linear functions except for $i=3$. Similarly to the proof of Lemma 4.3 of Geiger and Kanzow (1996), $\frac{\partial G_3(z)}{\partial z}$ is Lipschitz continuous on $R^{2m+n+q}$. Then the gradient mapping $\partial E(z,N,\varepsilon(t))$ is Lipschitz continuous, and we obtain
$$
\|\partial E(z_1)-\partial E(z_2)\|_2< L\|z_1-z_2\|_2,
$$
where $L$ is a Lipschitz constant. Thus
$$
\frac{d}{dt}\Big(\frac12\|z_1(t)-z_2(t)\|_2^2\Big)\le L\|z_1-z_2\|_2^2.
$$
Integrating between $0$ and $t$, one obtains
$$
\frac12\|z_1(t)-z_2(t)\|_2^2\le\int_0^t L\|z_1(s)-z_2(s)\|_2^2\,ds.
$$
By the Gronwall inequality, we have $\frac12\|z_1(t)-z_2(t)\|_2^2\le 0$, which means $z_1(t)=z_2(t)$ for any $t\in[0,+\infty)$. The proof is completed. $\square$

From Eq. (6), $E(z,N)$ is non-convex because of the influence of the parameter $\varepsilon$ in $R^{2m+n+q}$; however, there exists a set $\Omega\subset R^{2m+n+q}$ such that $E(z,N)$ is a convex function for $z\in\Omega$. This is called local convexity of the non-convex function. The following theorem focuses on the behavior of the solution path of system (9) when $z\in\Omega\subset R^{2m+n+q}$.

Theorem 5. Suppose that neural network (9) has a limit equilibrium point $z^*$ with $z^*\in\Omega\subset R^{2m+n+q}$, and that $E(z,N)$ is a convex function for $z\in\Omega$. Then, for any initial point $z^0\in\Omega$, the solution of system (9) has the following properties: (i) $\|\dot z(t)\|_2$ is nonincreasing; (ii) $\lim_{t\to+\infty}\|\dot z(t)\|_2=0$.

Proof. Let $z_1(t)$ and $z_2(t)$ be two solutions of system (9) with $z_1(0)=z^1$, $z_2(0)=z^2\in\Omega$. Then $z_1(t)$ and $z_2(t)$ are absolutely continuous on $[0,+\infty)$. By the definition of $z_1(t)$ and $z_2(t)$, we have
$$
\frac{d}{dt}\Big(\frac12\|z_1(t)-z_2(t)\|_2^2\Big)=\langle z_1(t)-z_2(t),\dot z_1(t)-\dot z_2(t)\rangle=\Big\langle z_1(t)-z_2(t),-\frac{\partial E(z_1,N)}{\partial z_1}+\frac{\partial E(z_2,N)}{\partial z_2}\Big\rangle;
$$
due to the convexity of $E(z,N)$ on $\Omega$, $\partial E(z,N)$ is monotone. Then we obtain
$$
\frac{d}{dt}\Big(\frac12\|z_1(t)-z_2(t)\|_2^2\Big)<0.
$$
Thus
$$
\|z_1(t)-z_2(t)\|_2\le\|z_1(0)-z_2(0)\|_2.
$$
Let $z_2(t)=z_1(t+h)$ with $h>0$; then
$$
\Big\|\frac{z_1(t+h)-z_1(t)}{h}\Big\|\le\Big\|\frac{z_1(h)-z_1(0)}{h}\Big\|,
$$
and letting $h$ tend to $0$, we obtain
$$
\|\dot z_1(t)\|_2\le\|\dot z_1(0)\|_2,\qquad\forall t\in[0,+\infty),
\tag{10}
$$
which shows that $\|\dot z(t)\|_2$ is nonincreasing on $[0,+\infty)$.

Since $E(z,N)$ is regular on $R^{2m+n+q}$, $E(z(t))$ and $z(t)$ are differentiable for almost every $t\in[0,+\infty)$. Moreover, given an arbitrary initial point $z^0\in\Omega$, there exists a unique trajectory $z=z(t,z^0)$ of system (9). Differentiating $E(z,N,\varepsilon(t))$ along the solution of neural network (9), we have
$$
\frac{d}{dt}E(z,N,\varepsilon(t))=\langle\xi_1,\dot z(t)\rangle+\langle\xi_2,\dot\varepsilon(t)_{2m+n+q}\rangle,\qquad\forall\xi_1\in\frac{\partial E(z,N,\varepsilon(t))}{\partial z},\ \xi_2\in\frac{\partial E(z,N,\varepsilon(t))}{\partial\varepsilon}.
$$
From (9), we have
$$
\frac{d}{dt}E(z,N,\varepsilon(t))=-\|\dot z(t)\|_2^2+\langle\xi_2,\dot\varepsilon(t)_{2m+n+q}\rangle;
$$
due to the expression of $\frac{\partial E(z,N,\varepsilon(t))}{\partial\varepsilon}$, we obtain $\xi_{2i}\ge 0$, where $\xi_2=(\xi_{2i})_{i=1}^{2m+n+q}$, and $\varepsilon(t)$ is a decreasing function; therefore, one gets
$$
\langle\xi_2,\dot\varepsilon(t)_{2m+n+q}\rangle\le 0.
$$
Integrating the above inequality from $0$ to $t$,
$$
\int_0^t\|\dot z(s)\|_2^2\,ds\le E(z(0),N,\varepsilon(0))-E(z,N,\varepsilon(t)).
$$
Since $E(z,N,\varepsilon(t))>0$ for $t\ge 0$, we have
$$
\int_0^{+\infty}\|\dot z(t)\|_2^2\,dt\le E(z(0),N,\varepsilon(0))<+\infty.
$$
Combining the above result with the nonincreasing property of $\|\dot z(t)\|_2$ on $[0,+\infty)$, we obtain $\lim_{t\to+\infty}\|\dot z(t)\|_2=0$. $\square$

Theorem 6. Suppose that neural network (9) has a limit equilibrium point $z^*$ with $z^*\in\Omega\subset R^{2m+n+q}$, and that $E(z,N)$ is a convex function for $z\in\Omega$. Then, for any initial point $z^0=z(0)\in\Omega$, the solution of (9) converges to $z^*$.

Proof. Since $E(z,N)$ is convex for $z\in\Omega$, using the first-order optimality condition, for any $\xi_1\in\frac{\partial E(z,N,\varepsilon(t))}{\partial z}$ we have
$$
E(z,N,\varepsilon(t))-E(z^*,N,\varepsilon(t))\le\langle\xi_1,z-z^*\rangle.
\tag{11}
$$
Differentiating $\frac12\|z(t)-z^*\|_2^2$ along the solution of (9), there exists $\bar\xi_1\in\frac{\partial E(z,N,\varepsilon(t))}{\partial z}$ such that
$$
\frac{d}{dt}\Big(\frac12\|z(t)-z^*\|_2^2\Big)=\langle z-z^*,\dot z(t)\rangle=\langle z-z^*,-\bar\xi_1\rangle.
\tag{12}
$$
Combining (11) and (12), we obtain
$$
\frac{d}{dt}\Big(\frac12\|z(t)-z^*\|_2^2\Big)\le E(z^*,N,\varepsilon(t))-E(z,N,\varepsilon(t)).
\tag{13}
$$
Since $(F(z^*)-N)^+<(F(z)-N)^+$ and $\|G(z^*)\|_1=0$, we get
$$
\frac{d}{dt}\Big(\frac12\|z(t)-z^*\|_2^2\Big)\le(F(z^*)-N)^+-(F(z)-N)^+-\|G(z)\|_1<0.
\tag{14}
$$
Thus $\lim_{t\to+\infty}\|z(t)-z^*\|_2$ exists. Integrating (13) from $0$ to $t$, we get
$$
\int_0^t\big[E(z,N,\varepsilon(s))-E(z^*,N,\varepsilon(s))\big]\,ds\le\frac12\big[\|z(0)-z^*\|_2^2-\|z(t)-z^*\|_2^2\big].
\tag{15}
$$
Taking the limit of the above inequality as $t\to+\infty$, we obtain
$$
\lim_{t\to+\infty}\int_0^t\big[(F(z)-N)^+-(F(z^*)-N)^+\big]\,ds+\lim_{t\to+\infty}\int_0^t\|G(z)\|_1\,ds\le\frac12\lim_{t\to+\infty}\big[\|z(0)-z^*\|_2^2-\|z(t)-z^*\|_2^2\big].
\tag{16}
$$
Since $\int_0^t\big[(F(z)-N)^+-(F(z^*)-N)^+\big]\,ds\ge 0$, if $\lim_{t\to+\infty}\|G(z(t))\|_1\ne 0$ this leads to a contradiction with (16), so we obtain $\lim_{t\to+\infty}\|G(z(t))\|_1=\|G(z^*)\|_1=0$. Then there exists a sequence $\{t_n\}$ such that $\lim_{n\to+\infty}t_n=+\infty$ and $\lim_{n\to+\infty}\|G(z(t_n))\|_1=\|G(z^*)\|_1$, which implies that $\lim_{n\to+\infty}\|z(t_n)-z^*\|_2=0$. Hence $\lim_{t\to+\infty}\|z(t)-z^*\|_2=0$. The proof is completed. $\square$

Theorem 7. Suppose that the level set
$$
L(z^0)=\Big\{z:\ \lim_{\varepsilon(t)\to 0^+}\big[E(z^0)-E(z)\big]\ge 0\Big\}
$$
is bounded. Then there exist a limit equilibrium point $z^*$ and a strictly increasing sequence $\{t_n\}$ such that $\lim_{n\to+\infty}z(t_n,z^0)=z^*$ and $0\in S(z^*,N)$.

Proof. Along the trajectory $z=z(t,z^0)$, choosing $\xi_1\in\frac{\partial E(z,N,\varepsilon(t))}{\partial z}$ with $\dot z(t)=-\xi_1$ and $\xi_2\in\frac{\partial E(z,N,\varepsilon(t))}{\partial\varepsilon}$, there is
$$
\frac{dE(z(t,z^0))}{dt}=\xi_1^T\frac{dz}{dt}+\langle\xi_2,\dot\varepsilon(t)_{2m+n+q}\rangle,
$$
and, due to the expression of $\frac{\partial E}{\partial\varepsilon}$, $\langle\xi_2,\dot\varepsilon(t)_{2m+n+q}\rangle\le 0$. Then we obtain
$$
\frac{dE(z(t,z^0))}{dt}=-\|\dot z(t)\|_2^2+\langle\xi_2,\dot\varepsilon(t)_{2m+n+q}\rangle<0,
$$
so $E(z(t,z^0))$ is monotone nonincreasing along the trajectory $z=z(t,z^0)$. Let $\eta(z^0)=\{z(t,z^0),\ t\ge t_0\}$; due to the monotonicity of $E(z(t,z^0))$, one obtains $\eta(z^0)\subseteq L(z^0)$. Hence $\eta(z^0)$ is a bounded sequence composed of infinitely many points. Thus, taking $\{t_n\}\subseteq\{t,\ t\ge t_0\}$, there exists a strictly increasing subsequence $t_n\to+\infty$ such that $\lim_{n\to+\infty}z(t_n,z^0)=z^*$. By the LaSalle invariance principle, $z^*\in S$ ($S$ is the largest invariant set in the set of limit equilibrium points). From neural network (9), $E(z^*)=0\Leftrightarrow 0\in S(z^*,N)$. $\square$



5. Neural sub-networks for CQBPP

In this section, we design neural sub-networks for solving problem (5). From Section 4, neural network (9) is Lyapunov stable and converges to a solution of problem (7) under some conditions. However, this solution is not guaranteed to be a solution of (5). From Section 3, one should choose a larger value of $N$ (keeping $N\le F(z^*)$) to obtain an approximate solution of problem (5). In the following, neural sub-networks are constructed for solving problem (5), using neural network (9) as a sub-network. We consider the following sub-optimization problem with the parameter $N_k$:
$$
\min_{z\in R^{2m+n+q}}\ E(z,N_k)=(F(z)-N_k)^+ + \|G(z)\|_1,
\tag{17}
$$
where $N_k$ $(k=1,2,\ldots)$ is a lower bound of the optimal value of problem (5), i.e., $N_k\le F(z^*)$ with $z^*$ an optimal solution of problem (5). The corresponding recurrent neural sub-network for solving problem (17) is
$$
\frac{dz}{dt}\in-\partial E(z,N_k,\varepsilon(t)).
\tag{18}
$$
Let $N_{k+1}=N_k+E(z_k^*,N_k,\varepsilon(t))$, where $z_k^*$ is the limit equilibrium point of neural network (18). Then the following feedback neural network for solving problem (5) is constructed:
$$
\frac{dz}{dt}\in-\partial E(z,N_k,\varepsilon(t)),\qquad z(0)=z_{k-1}^*,\quad k=1,2,\ldots,
\tag{19}
$$
where $z(0)$ is the initial point of neural network (19).
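The feedback scheme (17)-(19) amounts to an outer loop over the lower bound $N_k$: run the sub-network (18) until it settles, then increase $N_k$ by the residual energy. The following Python sketch shows this outer loop; `run_subnetwork` stands for any numerical integration of (18) (for instance the Euler sketch from Section 3) and `E_fn` for the energy function, both assumed helpers rather than interfaces defined in the paper.

```python
def successive_approximation(E_fn, run_subnetwork, z0, N1, max_outer=20, tol=1e-6):
    """Outer loop of (19): update the lower bound N_k by the residual energy E(z_k*, N_k)."""
    z, N = z0, N1                      # N1 <= F(z*) is assumed (e.g. obtained from problem (20))
    for _ in range(max_outer):
        z = run_subnetwork(z, N)       # integrate (18) with parameter N_k until it settles
        residual = E_fn(z, N)          # E(z_k*, N_k) = (F - N_k)^+ + ||G||_1 at the equilibrium
        if residual < tol:             # Theorem 8(iv): the residual tends to 0 as N_k -> N*
            break
        N = N + residual               # N_{k+1} = N_k + E(z_k*, N_k)
    return z, N
```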





Theorem 8. Suppose that $L(z^0)$ is bounded and $z^*$ is an optimal solution of problem (5), and let $N_{k+1}=N_k+E(z_k^*,N_k)$ with $N_1\le F(z^*)$. Then we have: (i) $N_k\le F(z^*)$, $k=1,2,\ldots$; (ii) $\lim_{k\to\infty}N_k=N^*$; (iii) $F(z_k^*)\le F(z^*)$; (iv) if $N^*=F(z^*)$, the sequence $\{z_k^*\}$ converges to an optimal solution of problem (5) and $F(z_k^*)$ approaches the corresponding optimal value of problem (5).

Proof. (i) We prove the conclusion by induction. First, when $k=1$, $N_1\le F(z^*)$. Second, assume $N_k\le F(z^*)$; we need to prove $N_{k+1}\le F(z^*)$. By the definition of $z_k^*$, $E(z_k^*,N_k)\le E(z,N_k)$ for all $z$. In particular, letting $z=z^*$ gives $E(z_k^*,N_k)\le E(z^*,N_k)=(F(z^*)-N_k)^+ + \|G(z^*)\|_1=(F(z^*)-N_k)^+=F(z^*)-N_k$. So we have
$$
N_{k+1}=N_k+E(z_k^*,N_k)\le N_k+E(z^*,N_k)=F(z^*).
$$
(ii) Since $E(z,N_k)$ is non-negative, $\{N_k\}$ is a nondecreasing sequence. By conclusion (i), $N_k\le F(z^*)$; therefore $\{N_k\}$ is a monotone bounded sequence, and we get $\lim_{k\to\infty}N_k=N^*\le F(z^*)$.
(iii) By (i), $N_k+E(z_k^*,N_k)=N_k+(F(z_k^*)-N_k)^+=N_{k+1}\le F(z^*)$. If $F(z_k^*)>N_k$, then $F(z_k^*)\le F(z^*)$; if $F(z_k^*)<N_k$, then $F(z_k^*)\le N_k<F(z^*)$.
(iv) Let $k\to\infty$; by $N_{k+1}=N_k+E(z_k^*,N_k)$, we have $\lim_{k\to\infty}E(z_k^*,N_k)=0$. From the continuity of $E(z_k^*,N_k)$ and conclusion (ii), we have $\lim_{k\to+\infty}z_k^*=\bar z$, $F(\bar z)\le N^*$ and $E(\bar z,N^*)=0$. If $N^*=F(z^*)$, the proof is completed. $\square$

Remark 3. From the above results, it is shown that the value of $N_1$ has a significant impact on finding the solutions of problem (5). However, in applications the solution of problem (2) is unknown, so it is difficult to choose a suitable $N_1$ for neural network (19). In the following, we give a method to find a better $N_1$ for problem (5). Consider the following convex quadratic programming problem:
$$
\begin{aligned}
\min_{x\ge 0,\,y\ge 0}\ & F(x,y)=\frac12\,(x^T,y^T)\begin{pmatrix}C_1 & C_3\\ C_3^T & C_2\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}+c_1^Tx+d_1^Ty\\
\text{s.t.}\ & A_1x+B_1y\le b_1\\
& A_2x+B_2y\le b_2,
\end{aligned}
\tag{20}
$$
which is derived from problem (2) by truncating the objective function of the lower level problem. Assume that $(\check x,\check y)$ and $(x^*,y^*)$ are the solutions of problems (20) and (2), respectively. From Ben-Ayed (1988), one has $F(\check x,\check y)\le F(x^*,y^*)$. Therefore, we can choose $N_1=F(\check x,\check y)$ and use it as a better starting parameter $N_1$ for the neural sub-networks (19); the optimal solution can then be found by updating $N_k$. In the next section, some examples are used to show the effectiveness of this strategy.

6. Numerical simulations

In this section, we use two numerical examples and the portfolio selection problem to illustrate the performance of neural network (19) for solving problem (5).

Example 1. Consider the following optimization problem (Muu & Quy, 2003):
$$
\begin{aligned}
\min_{x}\ & F(x,y)=-7x_1+4x_2+y_1^2+y_3^2-y_1y_3-4y_2\\
\text{s.t.}\ & x_1+x_2\le 1,\quad x_1,x_2\ge 0\\
& \min_{y}\ f(x,y)=(1-3x_1)y_1+(1+x_2)y_2+\frac{y_1^2}{2}+\frac{y_2^2}{2}+\frac{y_3^2}{2}+y_1y_2\\
& \quad\ \text{s.t.}\ x_1-2x_2+2y_1+y_2-y_3+2\le 0,\quad y_1,y_2,y_3\ge 0.
\end{aligned}
\tag{21}
$$

For neural network (9), we choose $\varepsilon(t)=\frac{1}{t+3}$ with parameter $N_1=-12$ (obtained by solving (20)). Fig. 1 illustrates the solution of neural network (19) with 10 different initial points, which is an approximate solution of (21), and every trajectory converges to the theoretical optimal solution $(0.609, 0.391, 0, 0, 1.828)$.

Fig. 1. Transient behavior of the state variables.
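As an illustration only, the reported solution of Example 1 can be checked against the constraints and the upper-level objective with plain NumPy; the script below is not part of the original experiments.

```python
import numpy as np

# Reported (approximate) optimal solution of Example 1: (x1, x2, y1, y2, y3)
x1, x2, y1, y2, y3 = 0.609, 0.391, 0.0, 0.0, 1.828

F = -7*x1 + 4*x2 + y1**2 + y3**2 - y1*y3 - 4*y2          # upper-level objective
upper_feasible = (x1 + x2 <= 1 + 1e-9) and x1 >= 0 and x2 >= 0
lower_feasible = (x1 - 2*x2 + 2*y1 + y2 - y3 + 2 <= 1e-6) and min(y1, y2, y3) >= 0

print("F(x, y) =", F)
print("upper-level constraints satisfied:", upper_feasible)
print("lower-level constraints satisfied:", lower_feasible)
```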

Example 2. Consider the following optimization problem (Bard, 1988):
$$
\begin{aligned}
\min_{x\ge 0}\ & (x-5)^2+(2y+1)^2\\
\text{s.t.}\ & \min_{y\ge 0}\ (y-1)^2-1.5xy\\
& \quad\ \text{s.t.}\ -3x+y+3\le 0,\\
& \qquad\quad\ x-0.5y-4\le 0,\\
& \qquad\quad\ x+y-7\le 0.
\end{aligned}
$$

Fig. 2. Transient behavior of the state variables.

In this example, we let $\varepsilon(t)=\frac{0.01}{t+3}$ with parameter $N_1=2$ (obtained by solving (20)). In Fig. 2, we can see that the solution of the proposed neural network converges to one optimal solution $(5,2)$ from 10 different initial points, and Fig. 3 shows that the transient behavior of the state variables can converge to another optimal solution $(1,0)$ from 10 different initial points.

Fig. 3. Transient behavior of the state variables.
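For completeness, a tiny script (again purely illustrative) can confirm that both reported optima of Example 2 satisfy the lower-level constraints and compare their upper-level objective values.

```python
def upper(x, y):
    # upper-level objective of Example 2
    return (x - 5)**2 + (2*y + 1)**2

def feasible(x, y):
    # lower-level constraints of Example 2 plus nonnegativity
    return (x >= 0 and y >= 0 and
            -3*x + y + 3 <= 1e-9 and
            x - 0.5*y - 4 <= 1e-9 and
            x + y - 7 <= 1e-9)

for (x, y) in [(5.0, 2.0), (1.0, 0.0)]:
    print((x, y), "feasible:", feasible(x, y), "upper objective:", upper(x, y))
```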

Example 3. Consider the following portfolio selection problem (An, Quynh, & Tao, 2012):
$$
\begin{aligned}
\min_{x\ge 0}\ & \delta\,y^THy-(1-\delta)\big(\rho x+r^Ty\big)\\
\text{s.t.}\ & \max_{y\ge 0}\ \rho x+r^Ty\\
& \quad\ \text{s.t.}\ x+\sum_{i=1}^{k}y_i=1,\\
& \qquad\quad\ -\rho x-\sum_{i=1}^{k}r_{it}y_i\le M,\quad\forall t=1,2,\ldots,T,
\end{aligned}
$$
where $k$ is the number of risky assets and $r_{it}$, $i=1,2,\ldots,k$, $t=1,2,\ldots,T$, are the historical returns. The expected return is $r=(r_1,r_2,\ldots,r_k)$ with $r_i=\frac1T\sum_{t=1}^{T}r_{it}$, $\forall i=1,2,\ldots,k$. The covariance matrix is $H=(h_{ij})$ with $h_{ij}=\frac1T\sum_{t=1}^{T}(r_{it}-r_i)(r_{jt}-r_j)$, $\forall i,j=1,2,\ldots,k$. The return of the risk-free asset is $\rho$, $x$ and $y_i$ are the investment ratios of the risk-free and risky assets, $\delta$ is a constant in $[0,1]$, and the parameter $M$ is a constant. For the historical returns $r_{it}$ we assume $r_i>0$. In the simulation, the data $r_{it}$ are generated randomly for $k=10$ and $T=100$. Let $\varepsilon(t)=\frac{0.02}{t+10}$, $\rho=0.5$ and $M=0.04$, with parameter $N_1=-(1-\delta)\rho$ (obtained by solving (20)). We vary $\delta$ from 0.1 to 0.9. From Figs. 4 and 5, we can see that the selection rate $x$ is equal to 0 and the selection rates $y_i$ converge to one optimal solution. The optimal upper level value and the optimal lower level value for different values of $\delta$ are shown in Figs. 6 and 7, respectively. From Fig. 6, the optimal upper level value increases as $\delta$ varies from 0.1 to 0.9, while the optimal lower level value is approximately invariant in Fig. 7.

Fig. 4. Transient behavior of the state variables with δ = 0.1.
Fig. 5. Transient behavior of the state variables with δ = 0.8.
Fig. 6. The optimal upper level value with different δ.
Fig. 7. The optimal lower level value with different δ.
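The statistics $r$ and $H$ used in Example 3 are simple sample moments of the historical returns. The following sketch generates random returns (as in the simulation, where $r_{it}$ is generated randomly for $k=10$ and $T=100$) and computes them; the return distribution below is only a stand-in, since the paper does not specify one.

```python
import numpy as np

rng = np.random.default_rng(0)
k, T = 10, 100
R = 0.05 + 0.02 * rng.standard_normal((k, T))    # assumed return data, chosen for illustration

r = R.mean(axis=1)                               # expected returns r_i = (1/T) * sum_t r_it
H = ((R - r[:, None]) @ (R - r[:, None]).T) / T  # covariance h_ij = (1/T) * sum_t (r_it - r_i)(r_jt - r_j)

print(r.shape, H.shape)                          # (10,), (10, 10)
```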





7. Conclusions

Based on the method of penalty functions, this paper has presented a recurrent neural network for solving the CQBPP. Compared with existing neural networks, no Lagrange multipliers or slack variables are involved in the proposed recurrent neural network. Using the theory of nonsmooth analysis, differential inclusions and the Lyapunov-like method, an important result has been proven: the sequence of limit equilibrium points of the proposed neural networks can approximately converge to an optimal solution of the CQBPP under certain conditions. Simulation results on two numerical examples and the portfolio selection problem have shown that the proposed recurrent neural network is feasible and very efficient.

Acknowledgment

This publication was made possible by NPRP grant # NPRP 4-1162-1-181 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors. This work was also supported by the Natural Science Foundation of China (grant no. 61374078).

Appendix

Here, we present some definitions and properties concerning set-valued maps and nonsmooth analysis, which are needed for the theoretical analysis in this paper. For more details, the reader can refer to Clarke (1983) and Aubin and Cellina (1984).

Definition 4. For $x\in E\subset R^n$, the map $F: x\to F(x)$ is called a set-valued map if to each point $x$ of $E$ there corresponds a nonempty closed set $F(x)\subset R^n$.

Definition 5. A function $f: R^n\to R$ is said to be Lipschitz near $x\in R^n$ if there exist $\epsilon,\delta>0$ such that for any $y,z\in B(x,\delta)$ one has $|f(y)-f(z)|<\epsilon\|y-z\|_2$. If $f$ is Lipschitz near any point $x\in R^n$, then $f$ is said to be locally Lipschitz in $R^n$.

Definition 6. Assume $f$ is locally Lipschitz in $R^n$; then $f$ is differentiable for almost every (a.e.) $x\in R^n$ in the sense of Lebesgue measure. The generalized directional derivative of $f$ at $x$ in the direction $\nu\in R^n$ is defined as
$$
f^0(x;\nu)=\limsup_{y\to x,\ \xi\to 0^+}\frac{f(y+\xi\nu)-f(y)}{\xi}.
$$
Furthermore, Clarke's generalized gradient of $f$ at $x$ is defined as
$$
\partial f(x)=\big\{\xi\in R^n:\ f^0(x;\nu)\ge\langle\nu,\xi\rangle,\ \forall\nu\in R^n\big\},
$$
which is equivalent to
$$
\partial f(x)=K\Big\{\lim_{n\to\infty}\nabla f(x_n):\ x_n\to x,\ x_n\notin\chi,\ x_n\notin\Gamma\Big\},
$$
where $K(\bullet)$ denotes the closure of the convex hull of the corresponding set, $\chi\subset R^n$ is an arbitrary set of measure zero, and $\Gamma\subset R^n$ is the set of points at which $f$ is not differentiable.

Definition 7. A function $f: R^n\to R$ which is Lipschitz near $x$ is said to be regular at $x$ if the following conditions hold: (i) for all $\nu\in R^n$, the usual one-sided directional derivative $f'(x;\nu)=\lim_{\xi\to 0^+}[f(x+\xi\nu)-f(x)]/\xi$ exists; (ii) for all $\nu\in R^n$, $f'(x;\nu)=f^0(x;\nu)$.

Regular functions form a rather wide class, and several classes of them are presented in Clarke (1983); the following lemmas hold.

Lemma 4. Let $f: R^n\to R$ be Lipschitz of rank $k$ near $x$. Then (i) $\partial f(x)$ is a nonempty, convex and compact subset of $R^n$, and $\|\xi\|\le k$ for all $\xi\in\partial f(x)$; (ii) $\partial f(x)$ is upper semicontinuous at $x$.

Lemma 5. If $V: R^n\to R$ is regular at $x(t)$ and $x(t): R\to R^n$ is differentiable at $t$ and Lipschitz, then
$$
\frac{d}{dt}V(x(t))=\langle\xi,\dot x(t)\rangle,\qquad\forall\xi\in\partial V(x(t)).
$$
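As a concrete illustration of Definition 6 (not taken from the paper), the generalized gradient of $f(x)=|x|$ at $0$ is the interval $[-1,1]$: it is the convex hull of the limits of $\nabla f$ along differentiable points approaching $0$. A few lines of Python approximate those limiting gradients numerically.

```python
import numpy as np

f = np.abs
h = 1e-8

# gradients of f(x) = |x| at differentiable points approaching 0 from both sides
left  = (f(-1e-3 + h) - f(-1e-3 - h)) / (2 * h)   # -> -1
right = (f(+1e-3 + h) - f(+1e-3 - h)) / (2 * h)   # -> +1

# Clarke's generalized gradient at 0 is the convex hull of these limits: [-1, 1]
print(left, right)
```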


References Amouzegar, M. A. (1999). An evolutionary algorithm for solving nonlinear bilevel programming based on a new constraint-handling scheme. IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 29(6), 771–777. An, L., Quynh, T. D., & Tao, P. D (2012). A DC programming approach for a class of bilevel programming problems and its application in Portfolio Selection. Numerical Algebra, Control and Optimization, 1, 167–185. Aubin, J., & Cellina, A. (1984). Differential inclusions: set-valued maps and viability theory. New York: Springer-Verlag Publishers. Bard, J. F. (1988). Convex two-level optimization. Mathematical Programming, 40, 15–27. Ben-Ayed, O. (1988). Bilevel linear programming: analysis and application to the network design problem, Ph.D. Thesis, University of Illinois at Urbana Champaign IL. Bian, W., & Chen, X. (2012). Smoothing neural network for constrained nonLipschitz optimization with applications. IEEE Transactions on Neural Networks Learning Systems, 23(3), 399–411. Cheng, L., Hou, Z., & Tan, M. (2008). A neutral-type delayed projection neural network for solving nonlinear variational inequalities. IEEE Transactions on Circuits and Systems II: Experimental Briefs, 55(8), 806–810. Clarke, F. (1983). Optimization and nonsmooth analysis. New York: Wiley. Dempe, S. (2002). Foundations of bilevel programming. Kluwer Academic Publishers. Etoa, J. (2010). Solving convex quadratic bilevel programming problems using an enumeration sequential quadratic programming algorithm. Journal of Global Optimization, 47(4), 615–637. Facchinei, F., Jiang, H., & Qi, L. (1999). A smoothing method for mathematical programs with equilibrium constraints. Mathematical Programming, 85, 107–134. Fernandez-Blanco, R., Arroyo, J., & Alguacil, N. (2012). A unified bilevel programming framework for price-based market clearing under marginal pricing. IEEE Transactions on Power Systems, 27(1), 517–525. Forti, M., Nistri, P., & Quincampoix, M. (2004). Generalized neural network for nonsmooth nonliner programming problems. IEEE Transactions on Circuits and Systems. I. Regular Papers, 51(9), 1741–1754. Forti, M., Nistri, P., & Quincampoix, M. (2006). Convergence of neural networks for programming problems via nonsmooth Lojasiewica inequality. IEEE Transactions on Neural Networks, 17(6), 1487–1499. Gao, X., Liao, L., & Qi, L. (2005). A novel neural network for variational inequalities with linear and nonlinear constraints. IEEE Transactions on Neural Networks, 16(6), 1305–1317. Geiger, C., & Kanzow, C. (1996). On the resolution of monotone complementarity problem. Computational Optimization and Applications, 5, 155–173. Hopfield, J. J., & Tank, D. W. (1985). Neural computation of decisions in optimization problems. Biological Cybernetics, 52, 141–152. Hosseini, A., Wang, J., & Mohamma, S. (2013). A recurrent neural network for solving a class of generalized convex optimization problems. Neural Networks, 44, 78–86. Hu, T., Gao, X., Fu, X., & Lv, Y. (2010). A neural network approach for solving linear bilevel programming problem. Knowledege-Based Systems, 23, 239–242. Hu, X., & Wang, J. (2006). Solving pseudomonotone variational inequalities and pseudoconvex optimization problems using the projection neural network. IEEE Transactions on Neural Networks, 17(6), 1487–1499. Hu, X., & Wang, J. (2007). Design of general projection neural networks for solving monotone linear variational inequalities and linear and quadratic optimization problems. 
IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 37(5), 1414–1421. Lan, K., Wen, U., Shih, H., & Lee, E. (2007). A Hybrid neural network approach to bilevel programming problems. Applied Mathematics Letters, 20, 880–884.

25

Liu, Q., Cao, J., & Chen, G. (2010). A novel recurrent neural network with finite-time convergence for linear programming. Neural Computation, 22, 2962–2978. Liu, Q., Dang, C., & Huang, T. (2013). A one-layer recurrent neural network for real-time portfolio optimization with probability criterion. IEEE Transactions on Cybernetics, 43(1), 14–23. Liu, Q., Guo, Z., & Wang, J. (2012). A one-layer recurrent neural network for constrained pseudoconvex optimization and its application for portfolio optimization. Neural Networks, 26(1), 99–109. Liu, Q., & Wang, J. (2011). A one-layer recurrent neural network for constrained nonsmooth optimization. IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 41(5), 1323–1333. Liu, Q., & Wang, J. (2013). A one-layer projection neural network for nonsmooth optimization subject to linear equalities and bound constraints. IEEE Transactions on Neural Networks Learning Systems, 24(5), 812–824. Luo, Z. Q., Pang, J. S., & Ralph, D. (1996). Mathematical programs with equilibrium constraints. Cambridge, UK: Cambridge University Press. Lv, Y., Chen, Z., & Wan, Z. (2010). A neural network for solving a convex quadratic bilevel programming problem. Journal of Computational and Applied Mathematics, 234, 505–511. Lv, Y., Hu, T., Wang, G., & Wan, Z. (2008). A neural network approach for solving nonlinear bilevel programming problem. Computers & Mathematics with Applications, 55, 2823–2829. Muu, L. D., & Quy, N. V. (2003). A global optimization method for solving convex quadratic bilevel programming problems. Journal of Global Optimization, 26, 199–219. Sheng, Z., Lv, Z., & Xu, R. (1996). A new algorithm based on the Frank–Wolfe method and neural network for a class of bilevel decision making problem. Acta Automatica Sinica, 22(6), 657–665. Shih, H. S., Wen, U. P., Lee, E. S., Lan, K. M., & Hsiao, H. C. (2004). A neural network approach to multiobjective and multilevel programming problems. Computers & Mathematics with Applications, 48, 95–108. Tank, D. W., & Hopfield, J. J. (1986). Simple neural optimization networks: an A/D converter, signal decision circuit, and a linear programming circuit. IEEE Transactions on Circuits and Systems, CS-33(5), 533–541. Teng, C., & Li, Z. (2002). The theory and application of bilevel programming. The Science Press. Vicente, L., Savard, G., & Júdice, J. (1994). Descent approaches for quadratic bilevel programming. Journal of Optimization Theory and Applications, 2, 379–399. Wang, J. (1993). Analysis and design of a recurrent neural network for linear programming. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 40(9), 613–618. Wang, Y., Jiao, Y. C., & Li, H. (2005). Solution algorithm for the bi-level discrete network design problem. IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications Review), 35(2), 221–232. Xia, Y. (1996). A new neural network for solving linear programming problems and its application. IEEE Transactions on Neural Networks, 7(2), 525–529. Xia, Y., & Wang, J. (1995). Neural network for solving linear programming problems with bounded variables. IEEE Transactions on Neural Networks, 6(2), 515–519. Xia, Y., & Wang, J. (2004). A one-layer recurrent neural network for support vector machine learning. IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 34(2), 1–10. Yang, J., Zhang, M., He, B., & Yang, C. (2009). Bi-level programming model and hybrid genetic algorithm for flow interception problem with customer choice. 
Computers & Mathematics with Applications, 57, 1985–1994. Zhang, G. Q., Zhang, G. L., Gao, Y., & Lu, J. (2011). Competitive strategic bidding optimization in electricity markets using bilevel programming and swarm technique. IEEE Transactions on Industrial Electronics, 58(6), 2138–2146.
