
Trajectory Tracking Control for Under-Actuated Hovercraft Using Differential Flatness and Reinforcement Learning-Based Active Disturbance Rejection Control

KONG Xiangyu, XIA Yuanqing, HU Rui, LIN Min, SUN Zhongqi, DAI Li   

  1. School of Automation, Beijing Institute of Technology, Beijing 100190, China
  • Received: 2022-01-14 Published: 2022-04-13
  • Contact: XIA Yuanqing. Email: xia_yuanqing@bit.edu.cn
  • Supported by:
    This paper was supported by the National Natural Science Foundation of China under Grant No. 61720106010.

KONG Xiangyu, XIA Yuanqing, HU Rui, LIN Min, SUN Zhongqi, DAI Li. Trajectory Tracking Control for Under-Actuated Hovercraft Using Differential Flatness and Reinforcement Learning-Based Active Disturbance Rejection Control[J]. Journal of Systems Science and Complexity, 2022, 35(2): 502-521.

This paper proposes a trajectory tracking control scheme for a hovercraft. Because the hovercraft model is under-actuated, nonlinear, and strongly coupled, controller design is challenging. To address this, the control scheme is divided into two parts. First, the differential flatness method is used to find a set of flat outputs, and part of the nonlinear terms are treated as uncertainties. This converts the under-actuated system into a fully actuated one. Second, a reinforcement learning-based active disturbance rejection controller (RL-ADRC) is designed. In this method, an extended state observer (ESO) estimates the uncertainties of the system, and an actor-critic-based reinforcement learning (RL) algorithm approximates the optimal control strategy. Using the output of the ESO, the RL-ADRC compensates for the total uncertainties in real time and simultaneously generates the optimal control strategy with the RL algorithm. Simulation results show that, compared with the traditional ADRC method, RL-ADRC does not require manual tuning of the controller parameters and yields a more robust control strategy.
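
As a rough illustration of the ESO idea mentioned in the abstract, the following Python sketch runs a linear ESO with bandwidth-parameterized gains on a double integrator subject to an unknown constant disturbance. It is not the paper's hovercraft model or its RL-ADRC: the plant, the function name simulate_leso, and the parameters omega_o, omega_c, and b0 are assumptions chosen for illustration, and the feedback gains are fixed by hand, whereas the paper obtains the control strategy with actor-critic RL.

```python
def simulate_leso(omega_o=20.0, omega_c=4.0, b0=1.0, disturbance=1.5,
                  reference=1.0, dt=1e-3, t_end=5.0):
    """Simulate ADRC with a linear ESO on x1' = x2, x2' = d + b0*u.

    Hypothetical toy plant: d is a constant "total uncertainty" that the
    controller does not know. Returns the final output x1 and the final
    ESO estimate z3 of d.
    """
    # Observer gains place all three ESO poles at -omega_o (assumed choice).
    beta1, beta2, beta3 = 3.0 * omega_o, 3.0 * omega_o**2, omega_o**3
    # Hand-picked PD gains place both closed-loop poles at -omega_c.
    kp, kd = omega_c**2, 2.0 * omega_c

    x1 = x2 = 0.0          # true plant state
    z1 = z2 = z3 = 0.0     # ESO state: estimates of x1, x2, and d

    for _ in range(int(round(t_end / dt))):
        # Feedback on the estimated states, then cancel the estimated d.
        u0 = kp * (reference - z1) - kd * z2
        u = (u0 - z3) / b0

        # Derivatives are all evaluated at the current step (forward Euler).
        err = x1 - z1
        dx1, dx2 = x2, disturbance + b0 * u
        dz1 = z2 + beta1 * err
        dz2 = z3 + b0 * u + beta2 * err
        dz3 = beta3 * err

        x1 += dt * dx1
        x2 += dt * dx2
        z1 += dt * dz1
        z2 += dt * dz2
        z3 += dt * dz3

    return x1, z3


if __name__ == "__main__":
    y_final, d_hat = simulate_leso()
    print(f"final output: {y_final:.4f}, estimated disturbance: {d_hat:.4f}")
```

In this sketch the extended state z3 converges to the unknown disturbance, so subtracting it from the control input leaves an approximately undisturbed double integrator for the feedback law to regulate. A time-varying or state-dependent uncertainty would be tracked with some lag that shrinks as omega_o grows.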