Transitioning Between Underactuated Periodic Orbits: An Optimal Control Approach for Settling Time Reduction

In underactuated systems, a transition between two periodic orbits is generally characterized by slow convergence. This is due to the fact that the unactuated degree of freedom (DoF) hinders the state of the system to enter the domain of attraction of the target orbit close to the ¯xed point of the Poincar (cid:1)e Map. In this paper, we introduce an optimal control algorithm to reduce the settling time of transitions between periodic orbits of underactuated walking robots. This is achieved by utilizing the hybrid zero dynamics (HZD) framework to express the feasibility condition of the transition which can be imposed as an inequality constraint in the proposed optimal control problem. In addition, the cost function penalizes deviations from the ¯xed point of the target periodic orbit in the zero dynamics manifold while at the same time all dynamic and kinematic assumptions are treated as constraints. Furthermore, high magnitude torques are also penalized. The numerical results show that the proposed methodology can indeed improve the settling time compared to the transition methodology usually found in the bibliography and at the same time provide a feasible and smooth motion. mechatronics, multimodal humansystem interfaces, optimization, nonlinear and hybrid discrete-continuous systems.


Introduction
Underactuated robots are systems that possess less actuators than degrees of freedom (DoFs). In that case, the controller design becomes a challenging task, since the control signals to the actuated DoFs need to induce a motion that stabilizes the whole system. Despite this challenge, there are numerous successful applications for a wide range of systems using various techniques. In the case of academic example systems, there is the swing-up control of the Acrobot, using partial feedback linearization and LQR control. 1 For free-°ying mechanical systems like quadrotors, there is a successfully applied framework for position control and trajectory tracking based on backstepping and sliding mode control. 2 In addition, recent interesting applications emerged from the¯eld of ship-mounted cranes where stabilizing control is achieved through a nonlinear controller design based on the method of Lyapunov. 3,4 Last but not the least, as we will see in the sequel, there are also di®erent approaches in the¯eld of walking robots, which is the main focus of our work.
Periodic motions of underactuated walking robots are dictated by limit cycles. Stable limit cycles have a domain of attraction where any deviation from the nominal orbit can be compensated by the dynamics of the system and the feedback terms of the control law. As a consequence, for the feasibility of a transition between two periodic orbits, it is su±cient that the state of the system enters the domain of attraction of the target orbit. When that happens, convergence is guaranteed and its rate is dictated by the maximum eigenvalue of the Poincar e Map. The time until convergence to the target limit cycle is de¯ned as the settling time.
Even though convergence can be guaranteed, it might be desired to improve its rate. In the case of velocity control, for example, one might want the state of the robot to converge to the target limit cycle as soon as possible, such that the target velocity is acquired very fast. Another reason might be that the target limit cycle is optimized with respect to an energy criterion and fast convergence to it allows the robotic system to consume less power. In general, a target limit cycle is usually designed to satisfy a task objective and faster convergence to the limit cycle means that the task is ful¯lled faster. In such cases, a possible solution is to try to minimize the maximum eigenvalue of the Poincar e Map associated with the target limit cycle, as was done for a running robot model and a somersaulting simulated robot. 5,6 In these two related works, the cost function to be minimized is the maximum eigenvalue of the Poincar e Map itself. As suggested, however, in the mentioned works, this minimization constitutes a di±cult problem. One of the reasons is that the maximum eigenvalue function of the nonsymmetric Jacobian matrix of the Poincar e Map is nondi®erentiable and possibly even nonLipschitz at points where multiple eigenvalues coalesce. In addition, di®erent modi¯cations of existing optimization algorithms were required for the success of this optimization. Another issue is that this minimization objective might be used against the satisfaction of other objectives such as torque consumption and/or minimum foot clearance.
Another possible solution to improve the settling time could be given by the concept of Virtual Model Control (VMC), where the virtual force could be used to make the robot transition between di®erent walking cycles. 7 Despite the fact that the VMC is generally applicable and very promising, it is not accompanied by concrete stability properties. In addition, with this methodology, we do not converge to a desired limit cycle per se, but instead allow the robot to walk in a fashion that satis¯es the task objective encoded in the virtual force. Finally, this methodology might induce high torques, such that the motion is energy ine±cient or it violates the actuator limits. A recent methodology which can also employ virtual forces can be found in the work of Veer et al. but as it can be seen the settling time is slow. 8 An alternative to settling time reduction without the shortcomings of the VMC and the minimization of the maximum eigenvalue is to enter the domain of attraction of the target limit cycle very close to the¯xed point of the Poincar e map. Then, convergence to the limit cycle will require less crossings of the Poincar e section, i.e., less time (see Fig. 1). If such an alternative is to be undertaken, a concrete feasibility guarantee is required which in order to be provided, knowledge of the domain of attraction of the target periodic orbit is required. The reason for this requirement is due to the fact that this condition characterizes a transition as feasible if and only if the state of the robot is in the domain of attraction of the target periodic orbit after the impact with the ground. In general, computing the domain of attraction of a limit cycle is challenging. In the case of underactuated walking, however, we can make use of a more speci¯c framework which is the hybrid zero dynamics (HZD). 9,10 With this tool, we can design individual periodic orbits and at the same time perform stability analysis and compute the corresponding domain of attraction. This is due to the fact that the focus is shifted to the zero dynamics which are a 2D manifold and thus the Poincar e Map becomes one dimensional. Using the HZD, we can also impose the feasibility condition. Alternatives to the HZD for gait analysis can be found in the works of Zutven and Dehghani et al. 11,12 In addition, one can use the approach of Djoudi et al. for designing periodic orbits for underactuated walking. 13 In our work Fig. 1. Di®erent orbits around an arbitrary limit cycle Ã ðxÞ marked with the dashed line. The¯xed point of the limit cycle is denoted with x Ã and the Poincar e section with S which in case of walking robots is usually chosen to be the walking surface, i.e., the state of the robot when the foot of the swing leg impacts the ground. In addition, the domain of attraction is denoted with the curled brackets on the Poincar e section S. As is shown, if we are able to enter the domain of attraction closer to the¯xed point x Ã , less orbits around the limit cycle will be required until convergence. however, we utilize the HZD since we want to compare our approach with the transition methodology proposed within the HZD. 9 In our previous work, we solved the settling time reduction problem using Reinforcement Learning. 14 For that, we introduced a database of 81 walking periodic motions and formulated the settling time reduction problem as a Markov Decision Process. Then, a Q-Learning algorithm was applied on each individual possible transition in order to¯nd a transition strategy that enters the domain of attraction of the target periodic orbit close to the¯xed point and therefore converges faster to the target periodic orbit. The outcome of this work was multi-step transitions that performed better than the one-step transition as proposed by Yang et al. 9 for approximately 84% of all possible transitions.
In comparison to the previous work, the contribution and novelty of this paper is to introduce a deterministic approach to reduce the settling time reduction by optimizing the one-step transition instead of searching for a multi-step one. The advantages of a one-step deterministic transition compared to a multi-step probabilistic one is that we avoid the shortcomings of the curse of the dimensionality which arises when dealing with large state-action spaces like in our previous work. 14 In addition, focusing on a one-step transitions allows to gain more insight on the behavior and decisions of the robot during such a motion. In comparison to the proposed approaches, our methodology introduces a more feasible dynamic optimization problem which is accompanied by stability and feasibility guarantees, while at the same time achieves faster convergence after transitioning than the state of art approach.
As in the previous work, we assume that the periodic orbits are given and we focus on the problem of transitioning between them in a way that reduces the settling time to the target periodic orbit. In order to compute this aperiodic transition motion, we employ optimal control to formulate and solve the settling time reduction problem. The optimal control approach is adopted since it allows us to easily impose all the constraints that ensure a feasible, valid and natural motion and there are already examples of its successful utilization in walking robots. 15,16 The initial state of the robot, state and actuators limits as well as modeling assumptions are treated as constraints. The feasibility condition is introduced as an inequality constraint imposed at the end of the motion taking into account the impact that occurs when the swing leg touches the ground. The cost function penalizes deviations from the¯xed point of the Poincar e Map of the target periodic orbit in the zero dynamics manifold and avoids torques with large magnitude, while keeping the transition time low. This is another re¯nement with respect to our previous work, where the multi-step transitions were decided solely based on the distance to the¯xed point of the target periodic orbit.
The results are compared against the one-step approach proposed by Yang et al. 9 and show that our methodology of optimizing a transition has great impact on the improvement of the settling time. Even though we demonstrate the usefulness of our approach with the example of a 5-link underactuated robot, the methodology for settling time reduction can be applied to any underactuated walking robot with one unactuated DoF. Of course, our approach of settling time reduction is not only limited to the case study of walking, but can also be extended to the case of running. [17][18][19][20] This paper is structured as follows: Section 2 presents the dynamic model of a 5-link underactuated robot. Section 3 explains the HZD framework in an intuitive way. The optimal control formulation of the settling time reduction problem is introduced in Sec. 4. The results of the optimal control problem (OCP) are demonstrated in Sec. 5. Section 6 concludes the paper.

Dynamic Modeling of Underactuated Walking
In this work, walking is modeled as a hybrid process with two phases. The¯rst includes the swing phase where the foot of the stance leg is pinned on the ground and the swing leg moves forward until it touches the ground. When that happens, the second phase takes place which consists of an impact after which the role of the swing and stance leg is exchanged.

Swing phase
An underactuated walking robot is a mechanical system with n DoF which can be expressed with the help of Lagrangian dynamics as where x ¼ ½q > q : > > 2 R 2n is the state of the system, q the joint positions and q : the joint velocities. Regarding the other terms, D 2 R nÂn is the mass-inertia matrix, C 2 R nÂn the matrix of Centrifugal and Coriolis terms, G 2 R n the vector of gravitational terms, B 2 R nÂðnÀ1Þ the input matrix, u 2 R nÀ1 the input vector, y 2 R nÀ1 the outputs of the system and h d 2 R nÀ1 is the vector of desired trajectories for the actuated DoFs.
The system is expressed in such a way that q n is the unactuated DoF and is an absolute coordinate, while the remaining DoFs are expressed relative to q n and each other. That results in q n being a cyclic variable and it does not appear in the massinertia matrix D, i.e., @D @q n ¼ 0. This property will prove useful in the sequel. In addition, it results in the last line of the input matrix B being populated only by zeros.
Regarding the selection matrix H 0 it is de¯ned as where I is the identity matrix. Omitting the arguments of the matrix and vector functions for clarity and expressing the system (1) in a±ne form we get

Impact
The robot experiences an impact at the end of the motion, i.e., when the swing leg establishes contact with the ground. The impact is assumed to be inelastic and instantaneous, causes a discontinuity on the joint velocities q : and gives rise to an impact force ±F. The post-impact velocities q : þ and the impact force ±F can be calculated by applying the principle of momentum conservation at the full dynamical system of the robot, i.e., we include the x-y coordinates of one point of the robot À À À in our case the tip of the stance leg (see Fig. 2). In addition, the post-impact velocities q : þ and the impact force ±F can be expressed as a function of the pre-impact joint velocities q where D e 2 R ðnþ2ÞÂðnþ2Þ denotes the mass-inertia matrix of the full dynamic model, E 2 R 2Âðnþ2Þ denotes the Jacobian matrix of the tip of the swing leg and q e ¼ ½q > 0 0 > since the x-y coordinates of the tip of the stance leg ðx st and y st Þ are both zero. Finally, Y ¼ ½I n 0 nÂ2 > . If we solve (4) for q : þ and ±F, we get an expression of the form where the upper part of (5) can be used to reset the swing phase model (1) after the impact, given that a proper re-labeling of the coordinates takes place which allows us to use the same dynamic model for any leg in stance phase.
x y

Zero Dynamics of Underactuated Walking
The design of stable periodic orbits and À À À our focus here À À À transitions between them requires conditions which ensure that such orbits and transitions are dynamically feasible. Such conditions can be provided by the HZD framework. We show that the zero dynamics allow for a Lagrangian formulation which facilitates the derivation of these conditions. Furthermore, we de¯ne the Poincar e Map and its corresponding¯xed point in the zero dynamics manifold. In order to do so, however, we need to start with the computation of the output relative degree of our system.

Output relative degree
Di®erentiating each output twice will result in showing that each output has a relative degree r of 2 and therefore the vector relative degree of the system is ðn À 1Þr ¼ 2n À 2 6 ¼ 2n (Ch. 5). 21 As a consequence, we need two more equations to construct a valid coordinate transformation and proceed with input-output feedback linearization.

Input-Output feedback linearization
The purpose of the input-output feedback linearization is to introduce a coordinate transformation z ¼ TðxÞ and a control uðxÞ ¼ ®ðxÞ þ¯ðxÞv such that a linear relation between the new input v and the outputs y is established and therefore tools of linear control theory can be applied. In addition, the linear input-output relation is desired to be expressed in normal form, so that the control uðxÞ can be easily extracted.
The transformation matrix has to de¯ne a di®eomorphism, such that the transformation is invertible and both the transformation and its inverse are di®erentiable. The¯rst 2n À 2 coordinates can be chosen as de¯ned as @z 2nÀ1 @x gðxÞ ¼ @z 2n @x gðxÞ ¼ 0, since Equation (8) holds for _ z 2n as well. A valid choice for z 2nÀ1 and z 2n is where°0ðqÞ is the nth line of the mass-inertia matrix D and 2 is the conjugate angular momentum around the unactuated DoF q n . The vector c has to be chosen such that the nth element is nonzero.
With this transformation, we can write the time derivative of z as ; and the control law can be chosen as where vðxÞ ¼ ÀK D @h @q q and K D , K P are derivative and proportional gain matrices, respectively, of appropriate dimension. The control in (11) zeroes the outputs hðqÞ from (1). When the outputs are zeroed, zero dynamics arise which are de¯ned as the manifold Furthermore, the variable in (9) can be chosen to be monotonically increasing and replace time, which results in an autonomous closed-loop system. By using as a replacement of time, the original coordinates can be reconstructed as where Once we are in the zero dynamics manifold, the zero dynamics have to be checked for stability. This check can be facilitated by expressing the zero dynamics in a Lagrangian form.

Form of the zero dynamics
The zero dynamics can be brought in a Lagrangian form. This allows to identify a potential and a kinetic energy function and¯nally impose a feasibility condition for periodic and aperiodic walking. In detail, if we use (8) for the coordinate z 2nÀ1 ¼ 1 we get However, in order to be able to introduce Lagrangian dynamics in the zero dynamics manifold, we manipulate the expression for 2 from (9) to come up with an expression of the form For z 2n ¼ 2 , we can derive an expression of the form : 2 ¼ 2 ð 1 Þ as follows: : > @D @q n and since q n is cyclic we get C n ¼ q : > @°> 0 @q . Finally, where V is the potential energy function of (1).

Lagrangian dynamics in the zero dynamics manifold
If we di®erentiate the expression (16) for : 1 , we get :: In this expression of Lagrangian dynamics, we can identify a kinetic energy function and a potential energy function Now we can¯nd conditions such that a motion is dynamically feasible and stable.

Feasibility and stability
The feasibility condition states that the post-impact kinetic energy in the zero dynamics manifold K zero ð þ 2 Þ has to be greater than the maximum value V max zero of the potential energy function. Since K zero ð þ 2 Þ ¼ zero K zero ð À 2 Þ where zero is the kinetic energy exchange ratio at the impact in the zero dynamics manifold, the feasibility condition can be expressed as where the minus superscript in À 2 denotes the value of 2 at the end of the motion (before the impact). At the end of each step the pre-impact kinetic energy K zero ð À 2 Þ is given by: which can also be written as: Now a Poincar e map P can be de¯ned which maps the pre-impact kinetic energy of the current step K k zero ð À 2 Þ to the next one K kþ1 zero ð À 2 Þ ¼ PðK k zero ð À 2 ÞÞ, such that The¯xed point of the Poincar e map P is given as and the domain of attraction D of P is given by The ðÁÞ denotes that the sign is based on the angle notation used (in correspondence to Fig. 2 it should be \-"). Finally, if j zero j < 1, the orbit is exponentially stable.

Transition between periodic orbits
Assume we are given two stable periodic orbits Á i ðtÞ and Á f ðtÞ and the¯xed points of their Poincar e Maps correspond to x À , i.e., the state of the robot before the impact. These orbits are solutions of the dynamics (1) with an impact at the end of the motion ðt ¼ T Þ and ful¯ll di®erent task objectives, such as walking with a desired average velocity. A transition Á i!f ðtÞ between these two orbits is feasible when the post-impact state x þ of the system is inside the domain of attraction of the target orbit. In the zero dynamics manifold and in a similar fashion to (23), this condition is expressed as If (29) holds, then convergence to the target limit cycle is guaranteed, given that j zero; f j < 1.

Settling Time Reduction as an Optimal Control Problem
After having explained the HZD framework, we formulate the objective of the Settling Time Reduction between two stable periodic orbits and we describe how it can be solved with optimal control. The objective of the Settling Time Reduction is to¯nd a transition Á i!f ðtÞ s.t.
. (29) is ful¯lled, . its overall duration T is relatively small and . the distance jjq In order to solve the Settling Time Reduction problem with optimal control, wē rst have to de¯ne constraints that have to be ful¯lled by the transition motion and a cost function which when minimized attaches the desired characteristics to the motion.

Constraints
In an OCP di®erent equality and inequality constraints are imposed involving the system state and controls. These constraints can be either linear or nonlinear. The ones related to the Settling Time Reduction are listed and described below: . The OCP is always subject to the dynamics of the system that are described in (1). . The initial state of the transition is the initial state of Á i ð0Þ such that . The terminal state is constrained only in terms of the joint positions q T , i.e., which correspond to the terminal joint values of Á f . The joint velocities q : are not included because of the underactuation that makes it di±cult to reach them exactly, preventing us from forming a 2 point boundary value problem. Instead, we leave them free and penalize their deviation from the desired ones q : f ðT Þ in the cost function. . Since we are dealing with systems with impact e®ects, we have to impose a constraint that ensures that the impact is valid. In order to do so, the impact force ±F has to respect the friction cone constraint and the vertical impact force component F y has to be positive According to (5) the matrix ¢ F is a function of the terminal con¯guration of the robot. As a consequence, since we know the desired terminal joint positions q T ¼ q f and x st ¼ 0 and y st ¼ 0, we can compute the desired elements of the matrix ¢ F at the end of the transition motion. With that, the constraints on the impact force ±F can be expressed linearly in q : T . In order to express the constraint (33) linearly to q : T , we decompose the desired impact matrix to its lines as ¢ F ¼ ½¢ >;1 F ¢ >;2 F > . In such a way, the constraints on the impact force ±F can be written as . The friction cone constraint that holds for the impact force ±F has to hold also for the ground reaction forces F during the transition, i.e., and the vertical force component F y has to be positive: . The transition must bring the state of the robot inside the domain of attraction of the target orbit as dictated by (29). This condition can be expressed linearly in the terminal joint velocities q : T in the following way: At¯rst, the maximum value of the potential energy at the zero dynamics manifold V max zero;f is known. In addition, we need to multiply the last line of the mass-inertia matrix D n with the postimpact joint velocities q : þ . Using (5) and excluding the last two lines, the postimpact joint velocities are given as q T . The matrix Á q is a function of the terminal con¯guration of the robot and therefore the desired elements of Á q can be computed. This is due to the fact that the desired terminal con¯guration q f ðT Þ is known and for the tip of the stance leg x st ðT Þ ¼ y st ðT Þ ¼ 0. Finally, as already mentioned, we need the last line of the mass-inertia matrix D n which can be evaluated using the post-impact joint positions q þ whose desired values are q f ð0Þ. Bringing everything together yields . The swing foot has to be above the ground during the transition motion: The desired¯nal and initial posture of the robot guarantees that y s ¼ 0 for t ¼ 0 and t ¼ T . . The state x, input u and total duration of the motion T should satisfy the following box constraints: x min x x max ; ð41Þ The constraint on the state x is imposed to ensure the satisfaction of physical and style constraints. The input u is constrained such that we do not exceed any imposed actuator limits. Finally, the time duration T is constrained to prevent the solution search space from unreasonably increasing.

Cost function
As already described, the terminal joint velocities q : T are free and we penalize their deviation from the pre-impact joint velocities q : f ðT Þ of the orbit Á f . At the same time, we want to penalize the¯nal time T such that we enter the domain of attraction of the target periodic orbit not only close to the¯xed point, but also fast. Finally, we want the magnitude of the torques to be low, such that \energy" criteria are also taken into account and a smooth transition motion is produced. Therefore the employed cost function is given by where À ¼°> 0 ðq f ðT ÞÞ°0ðq f ðT ÞÞ. The¯rst term is equal to jj Ã 2 À 2 jj 2 and penalizes deviations from the¯xed point of the target periodic orbit, but through 2 instead of K zero . The use of 2 retains the information of the¯xed point and prevents the term from becoming unnecessarily complicated. The constant penalizes large values of the total time duration T and the matrix W has weighting and scaling purposes and favors or disfavors \energy" consumption against convergence error and total time duration. These weighting factors assist in determining how much di®erent control inputs are penalized or how much the deviation from the desired¯xed point is penalized.

Optimal control formulation
The problem of¯nding a transition motion between two periodic orbits such that the settling time to the target limit cycle is minimized can be formulated as an OCP as For the solution of this OCP, we used the ACADO software. 22 This software implements a direct multiple shooting algorithm with an equidistant grid for control discretization. In such a way, the dynamic optimization problem is split into a set of smaller static Nonlinear Programs. For more details on multiple shooting algorithms, refer to further readings. 23

Discussion
A transition between two stable periodic orbits can be designed with B ezier polynomials of M th order which are employed for each actuated DoF. 9 In this B ezier polynomial-based approach, the¯rst two and the last two coe±cients of each actuated DoF are determined such that the initial and terminal state of the transition are valid with respect to the pre-impact and post-impact states of Á i and Á f , respectively. The remaining M À 3 coe±cients are taken by averaging the corresponding coe±cients of the two periodic orbits ð4 Â ðM À 3Þ coe±cients in total). As stated in the literature, they can also be found through optimization. We decided against such an approach due to the following reason.
As stated in the book of Agoston (Chapter 11), 24 by employing B ezier polynomials, the further a control point is from a point of the curve, the smaller its e®ect on that point. By¯xing the¯rst and last two coe±cients of each polynomial, we have less freedom of shaping the desired trajectories such that they satisfy the aforementioned constraints (especially (32)-(39)). As a consequence, this approach provides limited user in°uence on the distance from the¯xed point Ã 2 of Á f and by that, on the settling time to the limit cycle of Á f .

Numerical Results
In this section, we provide numerical results by solving the presented problem for a transition from a limit cycle corresponding to a velocity of 1 m/s to one that enables the robot to walk with 1.5 m/s. The robot model used is a 5-link biped as depicted in Fig. 2 and its kinematic and dynamic parameters are chosen such that they match those of RABBIT. 25 In addition, the periodic orbits that dictate periodic walking with the aforementioned velocities have been designed using the optimization methodology proposed in the work of Westervelt et al. 26 and the vector c is chosen as c ¼ ½0 0 À 1 À 0:5 À 1 > (refer to the de¯nition of 1 in (8)).
In Table 1, we compare the convergence error between the transition methodology from Yang et al. 9 and our OCP approach for the transition under study. For this comparison, we use di®erent combinations of the weighting factors in the cost function (44), where we choose a diagonal W matrix with the same diagonal value w. The convergence error is de¯ned as and as expected, the less we penalize the magnitude of the torques, the less the convergence error of the OCP approach gets in comparison to the one of the approach of Yang et al. The penalization of the total time duration has not the same impact as w but we can observe an optimal behavior for ¼ 10. Finally, for all the combinations, the system using the OCP approach has entered the domain of attraction of the target periodic orbit faster than the one-step approach.
In the sequel, we will discuss the behavior of the transition motion for the values ¼ 50 and w ¼ 0:05. Regarding the torques, in Fig. 3, we see that the lowest torque requirements are for the knee of the swing leg u 1 , since it has to manipulate only the relatively smaller mass of the tibia of the swing leg and the greatest e®ort is taken by the hips, i.e., u 2 and u 3 where the peak values are observed at the beginning of the motion. This behavior shows that the robot is trying to generate enough momentum at the beginning of the motion to reach a higher velocity fast. The torque pro¯le of the knee of the stance leg u 4 exhibits also a similar behavior. A nice feature of the generated motion is that all torques settle close to zero at the end of the motion. Finally, the peak torques are always inside the assumed limits of AE100 N Á m and the friction constraint is always respected for the assumed static friction coe±cient of s ¼ 0:7 as shown in Fig. 4. We also provide a stick diagram animation of the motion (Fig. 5) where it can be seen that the robot in order to accelerate fast to the desired velocity of 1.5 m/s is utilizing the inertia of its torso and concludes the transition with the torso leaned forward. Finally, we show the values of the matrix ¡ (see (44) ; where it is shown that most of the nondiagonal elements have large values and, as a consequence, there is a correlation between all the variances of the terminal joint velocities. The stronger correlation however is the cross-and auto-correlations of the DoFs of the stance leg (q 3 and q 4 ) and the unactuated DoF q 5 . This¯nding is expected since q 3 and q 4 belong to the stance leg which manipulates the whole mass of the robot and together with q 5 , they have a major impact in accelerating or decelerating the robot during the transition motion.

Further discussion À À À Real robot application
Utilizing the proposed framework on a real robot introduces the additional challenges of dealing with modeling uncertainties and disturbances. In order to cope with them, the idea of Control Lyapunov Functions can be utilized in order to stabilize the OCRresulting transition motions against the uncertainties in the model and the external disturbances. This methodology has been utilized in order to stabilize periodic motions of underactuated walking robots. 27 In addition, it was used in our previous work in order to online correct balancing motions that reject disturbances of various magnitudes, while respecting the actuators, friction and balancing constraints. 28 An alternative to the approach of the Control Lyapunov Functions is to attach robustness characteristics to the optimal transition motion. For that purpose, the stabilization of the trajectories of the OCP-resulting motion can be achieved by means of LQR control and an estimate of the region of attraction of the LQR controller can be calculated with the use of Sums-of-Square optimization. 29 With this information, the robot will be able to know during a transition motion if a given disturbance can be stabilized by using LQR control to track the optimal trajectories or a di®erent motion or approach has to be utilized.

Conclusion
This paper proposes an optimal control algorithm for reducing the settling time of transitions between periodic orbits of underactuated walking robots. Due to the underactuation, the feasibility of such a transition has to be explained with the help of the HZD framework, which we present in an intuitive way. The cost function to be minimized in the optimal control problem takes into account the convergence error to the¯xed point of the target periodic orbit, the torque consumption and the total time duration of the motion. The feasibility condition is stated as an inequality constraint and further constraints regarding the transition and the style of the motion are imposed. For numerical evaluation, we utilize a 5-link biped robot and present the e®ect of di®erent weighting of the terms in the cost function for a sample transition motion. As shown by the results, optimizing such a transition can provide a huge advantage in terms of settling time reduction in comparison to the one-step transition that is usually utilized. 9 As future work, we are interested in pursuing the di®erent methodologies discussed in the Sec. 5.1 and hence be able to apply our optimal control approach for transitioning between periodic orbits on a underactuated walking robot.