Optimal Rendezvous Guidance Using Linear Quadratic Control

This paper handles with an energy optimal guidance law for rendezvous mission, based on linear quadratic control (LQC) problem. Rendezvous of two satellites are expressed by a nonlinear relative orbit dynamic model. The LQC problem minimizes integral of control input quadratic term with given final time and terminal states. A linear relative orbit dynamic, also called as the Clohessy-Wiltshire equation, is utilized as governing equation for optimal rendezvous guidance law. It is proven that renewing costates like an initial time is identical to propagating it from initial time to current time. Thus optimal guidance law can be formulated in state feedback form. To enhance computation efficiency, this work uses Taylor series expansion for the exponential of system matrix. The proposed algorithm is verified through nonlinear relative orbit simulations.


Introduction
Recently, needs on satellite proximity operation are growing faster.For decades, global major countries in the space industry have built the International Space Station (ISS), and its size is now up to a soccer field.Its enormous size makes the ISS requires many maintenance crews to cooperate by turns.In addition, fuel and food need to be supplied almost from the ground until now.The space-X have made Falcon-9, a nine tandem busted space cargo ship to do commercial transport service missions on the ISS.Sooner or later, the satellite constellations or satellite online maintaining missions will open new markets in the space industries.
The proximity operation in space is considered as two of two particles problems (earth-sat1, earth-sat2).The key modeling on the relative orbit dynamics is done by Clohessy and Wiltshire [1].They derived a linear relative orbit dynamic model, so named after as the Clohessy-Wiltshire Equation (CW Equation).It is utilized for many proximity operation conceptual studies; In [2], [3], Kim have studied about relative orbit navigations based on mono image sensor and image processing technique.Kim utilized the CW equation for simulating the relative orbit to get a virtual images of satellite.In [4], Owis used an approximating sequence riccati equations for satellite formation flying guidance.It transforms the nonlinear relative orbit dynamic into a state affined form, then solved it sequentially as if it is a linear quadratic regulation (LQR) problem.Luo has used the CW equation as the governing equation of the rendezvous trajectory and formulated the guidance problem as an LQR problem [5].Luo transformed the problem into a parameter optimization problem finding full state feedback gain K, and solved it through simulated annealing technique to compare with the gain K from the algebraic Riccati equation solution.For more detail description about LQR, referred to read [6][7].
Spencer used a potential field guidance for satellite proximity operation guidance in his Ph.D. dissertation [8].Spencer also utilized the CW equation as governing equation, and he introduced advanced relative orbit elements to describe the chaser orbit.
For a real-time trajectory optimization of missile or launch vehicle, Padhi has developed Model Predictive Static Programming (MPSP) algorithm [9][10].It is inspired from the approximate dynamic programming and the model predictive control.
In this work, we derived the nonlinear relative orbit dynamic model and the chaser satellite elementary orbit model, then introduced details of the CW equation.The CW equation is employed as a dynamic constraint equation for Linear Quadratic Control (LQC) Problem.The LQC is distinguished from LQR in two aspects.1.The LQC specifies terminal time and terminal state, so those are constraints that must be satisfied.2. The LQC has no penalty on the integration of the state quadratic term.The LQC solution can be derived into a state feedback form, because the propagated costate variables in each time is same as the costate by renewal from current state.And the suggested algorithm is verified through cases of simulations.
The rest of this paper is organized as follows; In chapter 2, nonlinear relative orbit dynamic equation is introduced, and related target satellite orbit element is remarked.Then linear relative orbit dynamic equation, the Clohessy-Wiltshire equation, is derived from the nonlinear model.In chapter 3, an optimal rendezvous guidance method, which is based on the linear quadratic control problem, is proposed.In chapter 4, suggested method is utilized on the nonlinear relative orbit dynamic model for several cases of scenarios, and we discuss about results of the nonlinear simulation.After then some brief closing session is at the end of the paper 2 Relative Orbit Dynamic

Nonlinear Relative Orbit Dynamic
Consider two satellites traveling in a similar orbit, one is a chaser and the other is a target.Position vector to the target is r T , position vector to the chaser is r C , and relative position vector from the target to the chaser is ρ.Those vectors are related as follow, C T r r ρ .
(1) Fig. 1 shows two coordinates, one is inertia frame (Iframe) and the other is local vertical local horizontal frame (L-frame).I-frame is fixed on the center of earth, x-axis, and y-axis is respectively aligned to the North Pole and the vernal equinox, remaining z-axis is properly aligned according to the right hand sign convention.Lframe is attached to the target ship, its x is a radial direction from the origin of I-frame to the origin of Lframe, z axis is oriented to angular momentum direction of the target, y axis is determined by the right hand sign convention again.
The second order derivative of (1) with respect to Iframe leads to next, (2) where the superscription on the left of a vector indicates the frame which change of the vector is measured on.The external forces acting on the target are solely the gravity, and those on the chaser are the gravity and the control acceleration a T .Here we construct the earth gravitational parameter μ e of 3.98601E5 (km 3 /s 2 ) to express the gravity force.Now we change the reference frame of time derivatives to observe changes of the relative distance in the L-frame.The time derivatives of ρ in I-frame decomposed to 4 separated vectors; time derivative term of ρ in L-frame, tangential acceleration term, Coriolis term, and centripetal acceleration term.

(
) 2( ) ( ) where ω is angular velocity vector of L-frame respect to I-frame, and α is angular velocity acceleration vector of L-frame respect to I-frame.
Each vectors in (6) ought to be described in L-frame.(9) where e x , e y , and e z represent the orthonormal reference vector of each axis of L-frame.As z-axis of L-frame and angular momentum per unit mass of the target are in same direction, the angular velocity vector has only z-component as, Z ω e (10) T z Z ω e .
(11) Here, ω T is rate of transportation or rate of true anomaly, which is related to the angular velocity of L-frame.As the target motion is independent from relative motion of the chaser, it needs dynamic of the target orbit to describe r T , ω and ωṪ.

Target Satellite Elementary Orbit
The rate of the position vector r T is describable in Lframe as follows, Here, ṙ T indicates velocity of target in radial direction.Then, it is trivial to, Then, by integrating (12) and ( 13) with respect to Iframe, one can obtain the orbit of the target satellite.As y-component of the acceleration vector described on L-frame should equal to zero, it is rearranged about changes of rate of the transportation as follows, However this form contains an unknown, second order derivative of r T , thus it cannot gives ω T .Rather than from (12), the rate of transportation can be computed.
The velocity component of the target ship in I-frame can be described in L-frame by a direction cosine matrix C I L which translating a vector in I-frame to L-fame vector.Thus the radial component and the rate of transportation is calculated as, The C I L changes by the true anomaly, which is a basic orbit element.It indicates in which direction a satellite exists with respect to the origin of I-frame.Then true anomaly of the target satellite is calculated by atan2 function as, atan2 ( , ) Since the orbit plane is perpendicular to the z-axis of Lframe, the C I L is computed as follows, where C I L0 is the DCM from I-frame to L-frame at initial time, and R 3 indicates the DCM rotating in z-axis with a mount of ν.

The Clohessy-Wiltshire Equation
In order to design a rendezvous guidance law, a linear relative orbit dynamic equation is extracted from the nonlinear equation.At first, consider following approximation which is the reciprocal of the gravity acceleration term in (6).
( 1 ) 1 Substituting ( 19) to ( 6), gives next.When an object flies a circular orbit, it undergoes centripetal accelerations.In this case this acceleration should be equal to the gravity, then frequency of the satellite orbit becomes like, Here we, by construction, defined the mean motion frequency n, which is related to a concept, called mean motion frequency in space dynamic.
As you see from (15), when the radial component of acceleration is zero, that is, a satellites flies a perfect circular orbit, the rate of transportation is same as the mean motion frequency.Now, substituting from ( 7) to ( 11) and ( 21), to (20), arrange it component-wise, then it becomes next.
In general, the relative position components x, y and z are quite small relative to r T , thus the last terms on the left hand side of each equation in ( 22) are negligible.If the orbit is quite similar to a circular orbit, the rate of ω T is ignorable too.Then we have following linear relative orbit dynamic equation.

Linear Quadratic Control
Let's define a linear quadratic control problem (LQC), with fixed terminal time and terminal states.For a given system, it supposed to find the control input u that minimizes following cost function while satisfies the terminal states.
where x ℝ n is state, u ℝ m is control input, R uu ℝ n×n is cost weighting.This LQC problem can be solved by the calculus of variation.First, define the Hamiltonian of given object function as follows  28) and (29) gives the augment of costate-sate equation as, The blocked transition matrix is nominated as system matrix F. This is an unforced linear system state space model, whose well-known solution is, The exponential matrix of F is nominated as a state transition matrix Φ (STM), that has information of state propagation from initial time t 0 to a given time t, and it is useful to express STM with blocks of matrices like above.Thus the final state is simply described as, Here, we already know the initial and final states, x 0 and x f , as they are given boundary conditions.On the other hand, the initial and final costates, λ(t 0 ) and λ(t f ), are unknown variable.Therefore, we have 2n system of equations with 2n unknown variables.The remaining problem is just how to solve this system of equations.If we manipulated the STM like a blocked matrices, (33) is turn into, ) ) .
where the subscription, xλ, indicates the transition from initial λ to final x, and so on.By rearranging it, the initial and final costates are computed as follows, ) ) .
Then, with (32) and (35), Now by substituting costate value at a given time, λ(t), to (30), one can obtain an energy optimal control history on LQC problem.

Optimal Rendezvous Guidance
Now we adjust the LQC problem to rendezvous mission of two satellites.The satellites dynamic is expressed by the Clohessy-Wiltshire equation,

A B x x u
(37) > @ T x y z x y z x (38) where the mean motion angular rate, n, is updated by time.For this linear system, a modification of the LQC is utilized to feedback current states by time.
The time left until rendezvous is defined as time to go as follows.
The system matrix is computed by (31), then the STM is approximated as follows, This imitation form of STM is quite accurate when the eigen values of F exist on the left half plane of imaginary plane.By substituting t go , it indicates state transition from current t to the final t f and the STM is function of the t go .The STM from t 0 to t f can be separated with two STMs; STM from t to t f , and STM from t 0 to t.The STM from t to t f is named as the STM-t go .Then the block of matrices are expressible as follows, ) ) . ( 42) Now substitute (42) to (32), rearrange current costate λ(t) to be expressed with the current states x(t) and the STM-t go . ) ) ) ) ) ) ) ) ) Now, substitute current costate to (30), in order to get a state feedback form of suboptimal guidance law.As command u * is computed on L-frame, and the dynamic equation of the nonlinear relative orbit is expressed on L-frame too, we can directly substitute the u * to a T .In the rest of paper, this optimal rendezvous guidance is utilized on the nonlinear relative orbit dynamic model for several cases of initial and final condition sets.

Simulation
In this section, we implement the suggested optimal guidance law on some cases of virtual rendezvous scenarios, in order to verify the method on the nonlinear orbit dynamic models.Here comes operation introduction and simulation results.Then some discussions concerned with the results are attached below.

Satellite Proximity Operation Scenarios
Consider a maintenance mission on a communication satellite (COMSAT), which is skidding the earth in a circular orbital.To extend design life of the COMSAT, a service satellite takes rendezvous with the COMSAT and do some maintenance job.
The servicing satellite will have a mobile robot for mission and it can dock to the COMSAT somehow.It is launched from ground and released near the COMSAT within the visible distance.It approaches to the COMSAT from the below, and has to be attached very slowly, not to crash or damage the COMSAT.
KOREASAT 6, also called Olleh 1, is a commercial COMSAT launched by the KT Corporation in 2010.It services direct broadcasting and fixed satellite service to the users of Republic of Korea within 15 years.It weighs about 2.6 ton, and satellite period is 1437 min/rotate.It is flying at an altitude of 35791 km.For some virtual repairing missions on the KOREASAT 6, we launch a virtual satellite and release it in a near orbit.Table 1 lists each case of simulation initial conditions.The initial and final states of relative orbits are expressed in L-frame, on the other hand, the initial states of target states are described in I-frame.In order to check precisely whether the terminal velocity constraints are satisfied or not, we attach the normalized errors of the relative velocity on Fig. 6.The relative velocity error is divided by velocity normalizer V n of 10 m/s.For all cases, the normalized errors are converged to zero, therefore the terminal velocity constraint are satisfied.

Simulation Results
Acceleration command is displayed in Fig. 7 for each component of L-frame.It varies smoothly in most time, but it is about to be diverged at the end of simulation.However, the divergence is quite small and short, thus it does not affect the terminal state value that much.And if we consider with a thrust actuator, it may not respond to such a short and strong command due to system lag and bandwidth.

Discussion
Comparison with Other Method.
Glide Slope method is a common way of manned rendezvous.It is usually used for space shuttle rendezvous on the International Space Station.It fires sequences of jet on the local vertical or the local horizontal axis.It is quite insightful but takes long time, and impulse on the jet has to be controlled precisely in short time interval.
Linear Quadratic Regulation (LQR) problem has a cost function that has penalty not only on the input quadratic term, but also on the state quadratic term.As it doesn't have the terminal time and state constraints, the KKT condition leads it to the differential Riccati equation (DRE) of 2n+1 variables; n final states, n initial costates and 1 terminal time.However, we alternatively solve algebraic Riccati equation (ARE), because the DRE is unable to solve analytically.The solution of ARE is identical to the optimal solution, only when t f + , thus it is always suboptimal.
Model Predictive Sequential Programming (MPSP) is a discretized nonlinear real-time optimization algorithm.It is inspired from both the Approximate Dynamic Programming and the Model Predictive Control, so it calculates sensitivity on the terminal state of each input sequence and adjust the input sequence repeatedly.As it needs terminal states estimation and sensitivities of each states, it internally has to simulate for each input sequence updates.Additionally, the augmented sensitivity matrix getting larger as the time steps are increased, which often lead the matrix be singular.Meanwhile, the CW equation is reliable in most of rendezvous scenarios, thus the problem is suitable to be handled by the LQC.It is much faster and cheaper way to the optimal rendezvous.

Real-Time Implementation and Expectation
The guidance command was generated at every 0.01 sec interval during the simulation, which was conducted on Intel PC (i7-4770 3.4 GHz) with MATLAB 2013b.
Computing 1500 sec simulation takes 135 sec at the slowest.Thus we can assume onboard flight computers of satellites may have a chance to calculate the command just in time also.The major computation load of the algorithm occurs at matrix inversion and exponential matrix calculation.Here we approximated the exponential matrix to the third order of Taylor series expansion.However, if we can analytically solve the exponential of the system matrix and the inversion of Φ xλ , it will much fasten the algorithm.
In the view of system integration, measurement errors, system uncertainties, and control system lag may cause little performance degradation.To ensure the system safety, further studies on stability margin are required.

Conclusion
A linear quadratic control problem is used to design optimal rendezvous guidance law.The LQC minimizes the integration of input sequence energy from t 0 to t f , The LQC is solved by the KKT condition.In this work, we proved that the current costate obtained by the feedback of current state results in identical costate of the optimal trajectory, so that, we can compute the guidance command in state feedback form.Proposed algorithm utilized the Clohessy-Wiltshire equation as a governing equation, which is a linear system equation.It is quite similar compared to the nonlinear orbit dynamic model, as the nonlinear effect is negligible.By solving the LQC for CW equation, this paper made an optimal rendezvous guidance law.It is shown to be useful to satisfy the terminal time and states through the four cases of simulations.

Figure 1 .
Figure 1.Relative Orbit Geometry.(Target and Chaser) called as the Clohessy-Wiltshire equation.It is a linear state space model of the relative distance and speed between the chaser and the target on L-frame. 0

Fig. 2 Figure 2 .Figure 3 .
Fig. 2 depicts 3-dimensional trajectories of the chaser satellites of each case.It presents relative position of the chaser in L-frame.Red star marker indicates the target satellite, and green ring means initial position of the chaser.Even though the chaser moves in wrong direction at first due to initial velocity, it eventually settles down on the specified terminal position.Each component of relative positions is shown in Fig. 3 by time.The end time of simulation is different case by case, as specified in the simulation scenarios.For case 1 to 3, it terminates on 1000 sec, on the other hand, Case 4 terminates at 1500 sec.The suggested guidance law exactly satisfied the terminal position condition, as shown in Fig. 4. The figure displays normalized relative position errors of the 1.

Figure 4 .
Figure 4. Normalized errors of the relative position of each component in L-frame.Presented values are (ρ-ρf )/Ln, where Ln is 1000 m.Figure 5. Relative Velocity of the chaser satellite in L-frame.

Figure 5 .
Figure 4. Normalized errors of the relative position of each component in L-frame.Presented values are (ρ-ρf )/Ln, where Ln is 1000 m.Figure 5. Relative Velocity of the chaser satellite in L-frame.

Figure 7 .
Figure 7. Acceleration command of the chaser satellite in L-frame.It shows little diverge in terminal phase.

Figure 6 .
Figure 6.Normalized error of the relative velocity of each component in L-frame.Presented values are ΔV/Vn, where Vn is 10m/s.