Neural adaptive control of multibody systems




Proceedings of ACMD06

A00687

NEURAL ADAPTIVE CONTROL OF MULTIBODY SYSTEMS

Carlo L. Bottasso, Alessandro Croce, Roberto Nicastro, Barbara Savini, Luca Riviello
Dipartimento di Ingegneria Aerospaziale, Politecnico di Milano, Milano, 20156, ITALY
Email: [email protected]

ABSTRACT

We describe an adaptive predictive control framework which is applied to a few challenging control problems in multibody dynamics. The approach is based on the augmentation of a reference solution, which ensures quick and robust adaptation without the need for pre-training.

1  INTRODUCTION

In this paper we describe an ongoing investigation on the development of adaptive model-predictive controllers. This work is motivated by applications dealing with the real-time control of rotorcraft autonomous vehicles and of wind turbine generators. In both cases, one needs to control a highly non-linear aero-servo-elastic system in the presence of disturbances and of uncertainties in the model parameters and in the environment. An adaptive control strategy, i.e. a controller that can learn and automatically adjust to changes in the operating conditions and that can self-compensate approximations in the model, is therefore of particular interest for the present applications.

Model predictive control is based on the solution of an optimal control problem at each instant of time, which seeks to minimize some appropriate cost function (Findeisen et al. (2003)). The problem is formulated in terms of a reduced model of the plant, which allows one to predict the future response of the system on a given time horizon. The control actions computed this way are then input to the plant. At the next time step, the open-loop optimal control problem is solved again on a prediction window shifted forward in time (receding horizon control). Clearly, the performance of a predictive controller is highly dependent on the suitability of the reduced model, i.e. on its ability to perform faithful predictions of the plant response. Therefore, the applicability of a non-linear model-predictive (NMP) controller to the problems of interest for this work is crucially dependent on the ability to develop good non-linear reduced models of the plant, even in the presence of disturbances and when the operating conditions are changing. To address this concern, the approach that we are pursuing is as follows. The prediction provided by a reference approximate model of the plant is augmented with a neural network, which is trained on-line to capture the defect between the solution of the approximate model and the real response of the plant. This formulation is described in Bottasso et al. (2006a).

If the reduced model is linear, the cost is quadratic and the prediction horizon of infinite length, one has the familiar linear-quadratic regulator (LQR) problem, for which a closed-form solution can be computed by solving the associated Riccati equation. When the problem is non-linear (NMP control), the solution has to be computed by numerical means, as for example in Bottasso et al. (2006a). Although there are efficient methods to solve optimal control problems (Betts (2001)), these are still typically not fast enough for real-time applications. In this work, the solution of the optimal control problem on the prediction window is addressed in two different ways. For off-line applications, the problem is solved using the direct transcription approach, as described in Betts (2001); Bottasso et al. (2006a). When real-time computing constraints need to be met, we use the following alternative approach. At each time step, a reference control solution is provided by a suitable method, which is the LQR in this case. These approximate control actions are now augmented by a second neural network, which is trained on-line to approximate the solution of the non-linear model predictive problem. The process proceeds in three steps. First, the augmented reduced model equations are integrated forward in time. Next, the co-state equations are integrated backwards. Finally, the violation of the transversality conditions is used for a steepest descent correction of the network parameters. Such a procedure is adaptive both at the level of the model and at the level of the control actions. This control framework is described in the following pages, and demonstrated with the help of numerical examples.

2  NON-LINEAR MODEL PREDICTIVE CONTROL

A model predictive controller predicts the future behavior of the plant using a reduced model, and finds the control actions necessary for regulating the plant by solving an optimal control problem on a receding horizon (Findeisen et al. (2003)). The basic principle of non-linear model-predictive control is illustrated in Figure 1. A non-linear reduced model is used for predicting the future behavior of the plant under the action of the control inputs u. An open-loop optimal control problem is solved for the reduced model on a finite horizon (the prediction window [t, t + T_p]). The cost of this optimization problem is a function of the error with respect to the desired target value.

Figure 1: Model-predictive control of wind turbines. (The sketch shows, over past and future time, the target value, the predicted and actual outputs, and the projected controls u, together with the steering window [t, t + T_s], the control horizon [t, t + T_c], beyond which du/dt = 0, and the prediction horizon [t, t + T_p].)

The controls computed by the optimizer are now used for steering the plant, but only on a short time horizon up to time t + T_s, when a new measurement becomes available. In fact, due to the presence of disturbances and the inevitable mismatch between reduced model and plant, the actual outputs will drift away from the predicted ones. Once the plant has reached the end of the steering window [t, t + T_s] under the action of the computed control inputs, the model-predictive optimization problem is solved again, looking ahead in the future over the prediction horizon shifted forward in time. This procedure results in a feedback, receding horizon approach. The future control actions are computed by solving the following open-loop optimal control problem:

$$\min_{x,u} J, \quad \text{with:} \quad J = \int_t^{t+T_p} F(x(\tau), u(\tau), x^*, u^*)\,\mathrm{d}\tau, \qquad \text{(1a, 1b)}$$

$$\text{s.t.:} \quad f(\dot{x}, x, u) = 0, \qquad \text{(1c)}$$

$$g(x(\tau), u(\tau)) \in [g_{\min}, g_{\max}] \quad \forall \tau \in [t, t+T_c], \qquad \text{(1d)}$$

$$\dot{u}(\tau) = 0 \quad \forall \tau \in [t+T_c, t+T_p]. \qquad \text{(1e)}$$

The goal of the optimization problem is to bring the system to the target regulation set point x∗ in an optimal way, while satisfying all constraints. The solution of the optimization problem is obtained through the prediction of the response of the plant, as described by the non-linear reduced model given by Eq. (1c). The optimization satisfies possible constraints and bounds on the states and controls, expressed collectively by Eq. (1d). These constraints can include, for example, limited control authority, bounds on the input commands and their rates, or maximum values of important parameters of the system that must not be exceeded. The regulation cost J can be selected to provide specific performance characteristics to the controller. For example, the cost can be defined as the integral over the prediction window of the function

$$F(x, u, x^*, u^*) = (x - x^*)^T Q (x - x^*) + (u - u^*)^T R (u - u^*), \qquad \text{(2)}$$

where the first term accounts for the regulation error x − x∗, while the second term is typically used for limiting the activity of the controller, u∗ being a reference value for the control.
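As a concrete illustration of the cost of Eq. (2), the following minimal sketch in Python (not taken from the paper) evaluates the running cost F and its integral J over a sampled prediction window; the weighting matrices, the trajectories and the trapezoidal quadrature are illustrative assumptions.

```python
import numpy as np

def running_cost(x, u, x_star, u_star, Q, R):
    """Quadratic running cost F of Eq. (2):
    F = (x - x*)^T Q (x - x*) + (u - u*)^T R (u - u*)."""
    dx = x - x_star
    du = u - u_star
    return dx @ Q @ dx + du @ R @ du

def regulation_cost(ts, xs, us, x_star, u_star, Q, R):
    """Approximate J = int_t^{t+Tp} F dtau by trapezoidal quadrature,
    given trajectories sampled at the times ts (xs[k], us[k] at ts[k])."""
    F = np.array([running_cost(xs[k], us[k], x_star, u_star, Q, R)
                  for k in range(len(ts))])
    return np.trapz(F, ts)

# Illustrative use: a 2-state, 1-control trajectory over a 1 s prediction window.
ts = np.linspace(0.0, 1.0, 11)
xs = np.outer(1.0 - ts, np.array([1.0, 0.5]))   # states decaying toward the set point
us = 0.1 * np.ones((len(ts), 1))                # constant control activity
Q, R = np.diag([1.0, 1.0]), np.diag([0.01])
J = regulation_cost(ts, xs, us, np.zeros(2), np.zeros(1), Q, R)
print(f"J = {J:.4f}")
```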

2.1  Adaptive Reduced Model

In this work, the reduced model used for predicting the future behavior of the plant includes an adaptive element, following Bottasso et al. (2006a). Hence the reduced model is composed of a reference model augmented by a neural network that provides the adaptation capabilities, and is written

$$f_{\mathrm{ref}}(\dot{x}, x, u) = d(\dot{x}, x, u). \qquad \text{(3)}$$

In the previous equation, f_ref(ẋ, x, u) is a reference model of the plant, while d(ẋ, x, u) is the model defect. If one knew the function d, the prediction of the system states x would exactly match the corresponding values of the plant. In other words, the defect is that unknown function that ensures a perfect matching between the states x of the reduced model and the corresponding states of the plant. The idea is now to approximate the defect d using a parametric function:

$$d(\dot{x}, x, u) = d_p(\dot{x}, x, u, p_m) + \varepsilon. \qquad \text{(4)}$$

In this work the parametric function is chosen to be a single-hidden-layer neural network, which has the following functional form

$$d_p(\dot{x}, x, u, p_m) = W^T \sigma(V^T i + a) + b, \qquad \text{(5)}$$

where i = (ẋ^T, x^T, u^T)^T are the network inputs, W and V are matrices of synaptic weights, a and b are the network biases, and σ(φ) is the vector-valued function of sigmoid activation functions for the N_h processing elements in the hidden layer. The universal approximation property of neural networks (Hornik et al. (1989)) ensures that the functional reconstruction error ε in Eq. (4) can be bounded as ‖ε‖_2 ≤ C_ε for any C_ε > 0, for some appropriately large number of hidden neurons N_h, assuming that d is sufficiently smooth.

The neural network is trained with an error-correction learning algorithm, whereby the network parameters p_m = (..., W_ij, ..., V_ij, ..., a_i, ..., b_i, ...)^T are adjusted to minimize the error between the network output and the desired output (Fausett (1994)). This way, the reduced model, and hence the controller, can self-adjust to specific situations. A model predictive controller augmented with an adaptive neural network can learn how to compensate for these different effects, thereby optimizing the performance of the closed-loop system.

Training, i.e. ensuring a proper matching between predicted and actual outputs, can be obtained by subjecting reduced model and plant to the same control actions, and by tuning the reduced model parameters so that some measure of mismatch between the two is minimized. Indicating with u^o the optimal control inputs computed at time t on the basis of problem (1), and with x̃ the resulting response of the plant in the steering window [t, t + T_s], the neural network approximation error of the unknown defect function is computed based on Eqs. (3) and (4) as

$$E = \big\| f_{\mathrm{ref}}(\dot{\tilde{x}}, \tilde{x}, u^o) - d_p(\dot{\tilde{x}}, \tilde{x}, u^o, p_m) \big\|_2. \qquad \text{(6)}$$

This error is approximately minimized using the back-propagation algorithm (Fausett (1994)): the local information provided by the last steering of the plant is used for correcting the current estimate of the network parameters as

$$p_m = p_m + \Delta p_m, \qquad \text{(7)}$$

where Δp_m is the steepest-descent search direction

$$\Delta p_m = -\eta \, \frac{\partial}{\partial p_m} \int_t^{t+T_s} E \,\mathrm{d}t, \qquad \text{(8)}$$

and η is the learning rate.
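The following is a minimal sketch, not the authors' code, of the single-hidden-layer defect network of Eq. (5) and of a steepest-descent (back-propagation) parameter correction in the spirit of Eqs. (6)–(8). For simplicity it minimizes the squared mismatch at a single sample rather than the integral over the steering window, and the dimensions, learning rate and initialization are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_h, n_out = 6, 10, 3          # sizes of i = (xdot, x, u), hidden layer, defect d

# Network parameters p_m = (W, V, a, b), cf. Eq. (5): d_p = W^T sigma(V^T i + a) + b
V = 0.1 * rng.standard_normal((n_in, n_h))
a = np.zeros(n_h)
W = 0.1 * rng.standard_normal((n_h, n_out))
b = np.zeros(n_out)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def defect_network(i_vec):
    """Single-hidden-layer approximation d_p of the model defect, Eq. (5)."""
    h = sigmoid(V.T @ i_vec + a)
    return W.T @ h + b, h

def train_step(i_vec, d_measured, eta=0.05):
    """One steepest-descent correction of the parameters, cf. Eqs. (7)-(8),
    using the squared mismatch E = ||d_measured - d_p||^2 at one sample."""
    global W, V, a, b
    d_hat, h = defect_network(i_vec)
    e = d_hat - d_measured                  # prediction error
    # Gradients of E with respect to the parameters (standard back-propagation)
    gW = 2.0 * np.outer(h, e)
    gb = 2.0 * e
    delta = (2.0 * (W @ e)) * h * (1.0 - h)
    gV = np.outer(i_vec, delta)
    ga = delta
    # Parameter update p_m <- p_m - eta * dE/dp_m
    W -= eta * gW; b -= eta * gb; V -= eta * gV; a -= eta * ga
    return float(e @ e)

# Illustrative use: fit the network to a synthetic "measured" defect sample.
i_sample = rng.standard_normal(n_in)
d_sample = np.array([0.3, -0.1, 0.2])       # stand-in for f_ref evaluated on plant data
for k in range(200):
    E = train_step(i_sample, d_sample)
print(f"squared mismatch after training: {E:.2e}")
```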

2.2  Solution of the Model Predictive Control Problem

The solution of the model predictive control problem (1) can be approximated in two alternative ways: through a numerical method, or by training a second neural network (neural control). The two approaches have different characteristics. The latter is less computationally intensive and has the potential for achieving real-time performance even in the presence of limited on-board computational resources. On the other hand, the training of the control network can be problematic because of the very large dimension of the solution space. Furthermore, one cannot deal directly with possible input and output constraints (Eq. (1d)), which are on the contrary trivially accounted for with the numerical solution approach.


For off-line simulations, the solution to the model predictive regulation problem is here obtained by the direct transcription method (Betts (2001)). The governing equations of the reduced model are discretized on a computational grid of the prediction window using an appropriate numerical method. This defines a set of discrete unknown state and control parameters on the computational grid. Next, the constraint conditions and the problem cost function are expressed in terms of the discrete parameters. This in turn defines a non-linear discrete parameter optimization problem, i.e. a Non-Linear Programming (NLP) problem, that is solved using a sequential quadratic programming (SQP) approach. The numerical solution of the discrete problem approximates the solution of its infinite-dimensional counterpart, problem (1). In this work, the transcription process is based on the implicit mid-point rule.

For real-time problems, the model predictive optimal control problem is approximated as follows. First, notice that it is always possible to write the optimal control u(τ) which solves (1) as

$$u(\tau) = u_{\mathrm{ref}}(\tau) + v(x_0, \tau), \qquad \text{(9)}$$

where u_ref(τ) is a reference (known) control action as computed by using some available controller (e.g. the LQR in this work), while v(x_0, τ) is the unknown control defect. If one knew v(x_0, τ), then the optimal control would be available without having to solve the open-loop optimal control problem. The idea is then to approximate v(x_0, τ) using an adaptive parametric element

$$v(x_0, \tau) = v_p(x_0, \tau, p_c) + \varepsilon, \qquad \text{(10)}$$

where p_c are the unknown control parameters and ε is the functional reconstruction error.

In order to identify the values of the control parameters p_c, consider the optimality conditions obtained from problem (1) by imposing the stationarity of the augmented cost Ĵ:

$$f(\dot{x}, x, u) = 0, \qquad \text{(11a)}$$

$$-\frac{\mathrm{d}\big(f_{,\dot{x}}^T \lambda\big)}{\mathrm{d}t} + f_{,x}^T \lambda + F_{,x} = 0, \qquad \text{(11b)}$$

$$F_{,u} + f_{,u}^T \lambda = 0, \qquad \text{(11c)}$$

$$x(t) = x_0, \qquad \text{(11d)}$$

$$\lambda(t + T_p) = 0. \qquad \text{(11e)}$$

The first set of equations (11a) represent the reduced model dynamics, the second set (11b) the adjoint equations with Lagrange multipliers (co-states) λ, the third set (11c) the transversality conditions, while Eq. (11d) are the state initial conditions and (11e) are the co-state final conditions. An iterative procedure to solve the problem in real-time is then as follows:

1. Integrate the reduced model equations forward in time over the prediction window, using (9) and the latest available estimate of the control parameters p_c (state prediction):

$$f(\dot{x}, x, u) = 0, \quad \tau \in [t, t + T_p], \qquad \text{(12a)}$$

$$x(t) = \bar{x}. \qquad \text{(12b)}$$

2. Integrate the adjoint equations backward in time (co-state prediction):

$$\dot{\lambda} + f_{,x}^T \lambda - F_{,x} = 0, \quad \tau \in [t, t + T_p], \qquad \text{(13a)}$$

$$\lambda(t + T_p) = 0. \qquad \text{(13b)}$$

3. Correct the current estimate of the control parameters p_c using steepest descent:

$$p_c = p_c - \eta \, \hat{J}_{,p_c}. \qquad \text{(14)}$$

We remark that the parameter correction step seeks to enforce the transversality condition (11c), which reads in this case

$$\delta p_c^T \hat{J}_{,p_c} = \delta p_c^T \int_t^{t+T_p} v_{,p_c}^T \big(F_{,u} - f_{,u}^T \lambda\big) \,\mathrm{d}\tau = 0. \qquad \text{(15)}$$

Once this is satisfied, the control is optimal, since the state and co-state equations and the boundary conditions are satisfied.

To simplify the training of the control network, the control defect is discretized as

$$v(x_0, p_c, \tau) = \sum_{k=1}^{s} \phi_k(\tau) \, \hat{u}_{\mathrm{NN}\,k}(x_0, p_c), \qquad \text{(16)}$$

where φ_k(τ) are shape functions, and û_NN k(x_0, p_c) are the neural-network generated nodal values.
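As an illustration of the forward/backward/steepest-descent iteration of Eqs. (12)–(15), the following sketch applies the same idea to a deliberately simple problem: a scalar linear plant written in explicit form ẋ = a x + b u with a quadratic running cost, and a control correction v discretized with hat shape functions as in Eq. (16), here with directly adjustable nodal values in place of a neural network. It uses the standard sign conventions for the adjoint equation and cost gradient, which differ superficially from the implicit-form Eqs. (11)–(15); the system, weights and step sizes are illustrative assumptions, not data from the paper.

```python
import numpy as np

# Simple explicit-form test problem: xdot = a*x + b*u, running cost F = q*x^2 + r*u^2
a_sys, b_sys, q, r = -0.5, 1.0, 1.0, 0.1
Tp, N = 2.0, 200                       # prediction window and time grid
tau = np.linspace(0.0, Tp, N + 1)
dt = tau[1] - tau[0]
x0 = 1.0

s = 10                                 # number of control nodes, cf. Eq. (16)
t_nodes = np.linspace(0.0, Tp, s)
p_c = np.zeros(s)                      # nodal values of the control defect v

def u_of_tau(k):
    """u = u_ref + v, Eq. (9); u_ref = 0 here, v interpolated with hat functions."""
    return 0.0 + np.interp(tau[k], t_nodes, p_c)

def phi(k):
    """Values of the s hat shape functions at grid time tau[k]."""
    return np.array([np.interp(tau[k], t_nodes, np.eye(s)[j]) for j in range(s)])

eta = 0.05
for it in range(300):
    # 1. Forward state prediction (explicit Euler), cf. Eq. (12)
    x = np.empty(N + 1); x[0] = x0
    for k in range(N):
        x[k + 1] = x[k] + dt * (a_sys * x[k] + b_sys * u_of_tau(k))
    # 2. Backward co-state prediction: lam_dot = -(F_x + f_x*lam), lam(Tp) = 0, cf. Eq. (13)
    lam = np.empty(N + 1); lam[-1] = 0.0
    for k in range(N, 0, -1):
        lam[k - 1] = lam[k] + dt * (2.0 * q * x[k] + a_sys * lam[k])
    # 3. Cost gradient w.r.t. the control parameters and steepest descent, cf. Eqs. (14)-(15):
    #    dJ/dp = int phi(tau) * (F_u + f_u*lam) dtau
    grad = np.zeros(s)
    for k in range(N):
        grad += dt * phi(k) * (2.0 * r * u_of_tau(k) + b_sys * lam[k])
    p_c -= eta * grad

print("final gradient norm (transversality violation):", np.linalg.norm(grad))
```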

The present approach has the following highlights:

• It is adaptive at the level of the model and at the level of the control laws. The two are coupled, but they can be trained independently on-line, which eases this process.

• If the reference solutions are in the neighborhood of the true ones, the defect that the networks are in charge of approximating is small, which substantially eases the identification process. This is in contrast with other approaches which delegate to the networks the approximation of the whole solution.

• Since it is based on reference solutions, a minimum level of performance is guaranteed even at the beginning of the process and before any learning has taken place. Since reasonable predictions and reasonable control actions are obtained by the reference elements, there is no need for pre-training of the networks, a major hurdle in other network-based control approaches.

3  APPLICATION TO THE REGULATION OF A WIND TURBINE

The effects of complex terrains, wind turbulence, and ice accretion, coupled with the large-span blades and tall flexible towers of modern wind turbines, render the control of structural vibrations one of the main hurdles in the design and effective installation of such machines.

The modeling of wind turbine generators is here based on a comprehensive finite-element-based multibody dynamics approach (Bottasso et al. (2006b)). Structural elements, joints, actuators and electro-mechanical elements defined in the code library can be assembled to describe arbitrary turbine configurations, which also include mathematical models of the on-board sensors. Specifically, we consider a wind turbine in the 1.2 MW class, with a three-bladed rotor with blade pitch and generator torque control. The multibody model of the turbine includes flexible blades and tower, flexible tower foundations, hub, drive shaft, flexible drive train accounting also for mechanical losses on the shaft bearings, nacelle, generator, and pitch actuators. Aerodynamic effects are modeled by means of appropriate lifting lines on the various bodies of the structure, including an inflow element for the rotor disk, and turbulent wind models. The aerodynamic models include IEC deterministic and stochastic winds, and the modeling of effects such as rotor-nacelle-tower interactions, upwind turbine shadowing and ice accretion.

The non-linear reduced reference model of the turbine includes drive-train shaft dynamics, elastic tower fore-aft motion and blade pitch actuator dynamics. The aerodynamic coefficients were obtained off-line from simulations performed using the multibody model, considering steady wind conditions. The coefficients are computed on-line in terms of effective blade pitch angle and tip-speed ratio by means of look-up tables.

We consider the dynamic response of the plant subjected to the extreme coherent gust (ECG), with wind speed changing from a value of 10.6 m/sec to 25 m/sec in 10 sec. Figure 2 shows the NMPC transcription-based solution (solid line) and the solution obtained with a PI controller (dashed line) in terms of the non-dimensional rotor speed Ω/Ω∗. The gust time history, which is superimposed on the response, highlights the fact that the model predictive controller is able to regulate the system back to its nominal operating condition very soon after the termination of the gust. Notice also from the same plot that this controller shows a substantially reduced over-shoot with respect to the PI solution.

Figure 2: Time history of non-dimensional rotor angular speed Ω/Ω∗ for the ECG gust. NMPC solution: solid line; PI solution: dashed line; thin solid line: deterministic non-dimensional gust speed Vm/Vm∗.

Next, we evaluate the performance in turbulent constant-mean wind. Dynamic simulations were conducted for a duration of 300 sec under constant values of the mean hub-height wind speed (ranging between 13 m/sec and 22 m/sec) and Category A turbulence. Figure 3 reports the standard deviation of the non-dimensional rotor speed error Ω/Ω∗ − 1 as a function of the mean hub-height wind speed for Category A turbulence. It appears that the predictive controller is capable of more than halving the angular velocity fluctuations with respect to the PI controller. At 13 m/sec this improvement is less marked, probably because the turbulent fluctuations can cause wind speeds so low that the target rotor speed cannot be achieved no matter which controller one is using.

Figure 3: Standard deviation of non-dimensional rotor speed error vs. mean hub-height wind speed for Category A turbulence. NMPC solution: solid line; PI solution: dashed line.

4  APPLICATION TO MANEUVERING ROTORCRAFT VEHICLES

In this section we demonstrate the proposed procedures in the context of maneuvering rotorcraft applications, and in particular in the case of emergency take-off procedures. The multibody helicopter model is based on the comprehensive finite-element-based formulation of Bauchau et al. (2001), and represents a generic medium-size multi-engine utility helicopter in the 9 ton class, which includes a four-bladed articulated rotor, modelled using geometrically exact non-linear beam elements and revolute joints, connected to a rigid fuselage. The reference model is based on two-dimensional longitudinal rotorcraft dynamics equations (Bottasso et al. (2006a)).

We consider the case of a continued take-off after engine failure. To meet the certification requirements, the rotorcraft must be able to continue the acceleration while clearing the ground with a given safety margin. Eventually, the vehicle must also reach a given safe speed, a given minimum height above the take-off point and a positive rate of climb. The problem is formulated as described in Bottasso et al. (2006b). The rotorcraft trajectory is obtained by solving a planning optimal control problem at the level of the reduced model. The planning cost for this maneuver is the altitude loss. The maneuver optimal control problem also includes bounds on rotor controls and their rates, on the available power, and on the final values of the horizontal and vertical velocity components, pitch rate and rotor angular speed. The planning initial conditions are obtained by starting from hover, and letting the vehicle fall for one second without any control actions, to account for the effect of the pilot reaction time.

Once the trajectory has been computed, the multibody rotorcraft model tracks it by using the control strategy described in the previous pages. The numerical solution is based on the direct transcription method. Once the trajectory has been tracked to completion, a new estimate of the reduced model is available, and hence a new planning can be performed. Iterations are carried out among the various phases, planning and tracking-steering-adaptation, until convergence.

Figure 4 reports the vehicle trajectory and the airspeed vs. time, respectively, for successive planning/tracking-steering-adaptation sequences. Solutions obtained during the planning problems are marked with the symbol ◦ for niter = 1, ▽ for niter = 2, ⊳ for niter = 3, ⊲ for niter = 4, and finally with △ for niter = 5. The corresponding outputs obtained by the multibody model during the final tracking are shown with a solid line. Both figures clearly indicate the convergence of the planning problems. Notice that the final solution is quite different, for example in terms of total altitude loss, from the first one, indicating that a substantial error in planning is possible with the initial (reference) reduced model. Excellent tracking performance is obtained at convergence, illustrating the success of the overall procedure.

Figure 4: Optimal helicopter continued take-off maneuver. Planned trajectories (niter = 1: ◦; niter = 2: ▽; niter = 3: ⊳; niter = 4: ⊲; niter = 5: △) and multibody trajectory at the last iteration (solid line).
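The iteration among planning, tracking/steering and adaptation described above can be summarized as a fixed-point loop over the reduced-model estimate. The sketch below is not the authors' code: it mimics the outer loop on a deliberately trivial stand-in problem, where a single scalar "model parameter" is nudged toward the value identified from tracking. The plan, track and adapt callables are hypothetical placeholders for the direct-transcription planner, the multibody tracking simulation and the network update of Section 2.1.

```python
import numpy as np

def plan(model_param, n_pts=50):
    """Stand-in planner: returns a 'trajectory' that depends on the current reduced model."""
    t = np.linspace(0.0, 1.0, n_pts)
    return model_param * np.sin(np.pi * t)

def track(trajectory):
    """Stand-in for tracking the plan with the full (multibody) model; returns the
    model parameter identified from the tracking data."""
    true_param = 2.0                       # hypothetical 'plant' value, unknown to the planner
    return true_param

def adapt(model_param, identified_param, gain=0.5):
    """Stand-in adaptation step: move the reduced-model estimate toward the identified value."""
    return model_param + gain * (identified_param - model_param)

model_param = 1.0                          # initial (reference) reduced model
previous = plan(model_param)
for it in range(1, 25):
    identified = track(previous)
    model_param = adapt(model_param, identified)
    current = plan(model_param)
    change = np.max(np.abs(current - previous))
    print(f"iteration {it}: model_param = {model_param:.4f}, plan change = {change:.2e}")
    if change < 1e-6:                      # converged: replanning no longer changes the trajectory
        break
    previous = current
```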

5  APPLICATION OF NEURAL CONTROL TO THE INVERTED DOUBLE PENDULUM MODEL PROBLEM

For demonstrating the potential of the neural control strategy described in Section 2.2, we consider the stabilization of a double inverted pendulum on a cart, as depicted in Figure 5. The reference controller is an LQR, with a linear reduced model linearized about the vertical position.

Figure 5: Double inverted pendulum on a cart.
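The reference LQR gain for such a problem can be obtained from the solution of the associated Riccati equation, as recalled in the Introduction. The following is a generic sketch of that computation, not tied to the actual pendulum data of the paper: the matrices A and B below are a simple double-integrator placeholder standing in for the cart/double-pendulum model linearized about the vertical position, and the weights Q and R are illustrative.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Placeholder linearized model x_dot = A x + B u (double integrator, NOT the pendulum data)
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])

# Quadratic weights of the infinite-horizon LQR cost, cf. Eq. (2)
Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])

# Solve the continuous-time algebraic Riccati equation and form the gain K = R^-1 B^T P
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# Reference control action u_ref = -K (x - x*), i.e. the baseline that the
# neural correction v of Eq. (9) then augments.
x = np.array([0.2, -0.1])          # example state
x_star = np.zeros(2)               # regulation set point
u_ref = -K @ (x - x_star)
print("LQR gain K =", K, " u_ref =", u_ref)
```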

Figure 6 shows with a dashed line the norm of the violation of the transversality conditions, Ĵ,p_c, which also represents the gradient of the regulation cost J with respect to the network parameters p_c. This quantity is zero if the control actions are optimal, i.e. if they solve the regulation optimal control problem. The same figure reports, using a solid line, the control action, i.e. the force applied to the cart. The plot shows that the network learns very quickly how to satisfy the transversality conditions, and hence produces optimal control actions. In fact, after about 0.5 sec and well before the system has been regulated, the controls are already optimal.

Figure 6: Transversality violation and control action vs. time.

Next we consider the case of a mismatch between plant and model. To this effect, we introduced springs in the revolute joints of the double pendulum, but did not include them in the reduced model. Figure 7 compares the regulation cost of the neural NMPC solution and of the LQR solution (◦ symbol) as a function of the length of the prediction window. Notice how, for a sufficiently long prediction window, the adaptive nature of both the reduced model and the control action leads to a reduction of the control cost with respect to the underlying reference controller.

Figure 7: Regulation cost vs. length of the prediction window, for the LQR solution (◦) and the neural NMPC solution.

6  CONCLUDING REMARKS

We have described our current research on the applicability of adaptive and non-linear model predictive control to various challenging problems. It appears that the approach which is overall most satisfying is based on a reference augmented reduced model, with a solution based on a direct transcription approach. It appears, from the experiments conducted so far, that the direct transcription method is quite robust and converges quickly, especially since in a receding horizon framework the previous solution typically amounts to an excellent initial guess for the next step.

However, in some cases such direct numerical approximation of the optimal prediction problem is too computationally demanding. To address this problem, we have formulated a neural controller which augments and improves a reference control action. The presence of the reference solution even in this case eases substantially the identification of the controller and ensures reasonable performance even without pre-training. This faster computing response and enhanced adaptive capabilities come however at the price of dropping possible state and control constraints from the model predictive problem.

References

Bauchau, O.A., Bottasso, C.L., and Nikishkov, Y.G., Modeling Rotorcraft Dynamics with Finite Element Multibody Procedures, Mathematical and Computer Modelling, Vol. 33, pp. 1113–1137, 2001.

Betts, J.T., Practical Methods for Optimal Control Using Non-Linear Programming, SIAM, Philadelphia, 2001.

Bottasso, C.L., Chang, C.-S., Croce, A., Leonello, D., and Riviello, L., Adaptive Planning and Tracking of Trajectories for the Simulation of Maneuvers with Multibody Models, Computer Methods in Applied Mechanics and Engineering, to appear in the special issue "Computational Multibody Dynamics", 2006.

Bottasso, C.L., Croce, A., Savini, B., Sirchi, W., and Trainelli, L., Aeroelastic Modeling and Control of Wind Turbine Generators Using Finite Element Multibody Procedures, Multibody System Dynamics, under review.

Fausett, L., Fundamentals of Neural Networks, Prentice Hall, New York, 1994.

Findeisen, R., Imsland, L., Allgöwer, F., and Foss, B.A., State and Output Feedback Nonlinear Model Predictive Control: An Overview, European Journal of Control, Vol. 9, pp. 190–206, 2003.

Hornik, K., Stinchcombe, M., and White, H., Multi-Layer Feed-Forward Networks are Universal Approximators, Neural Networks, Vol. 2, pp. 359–366, 1989.
