Deep reinforcement learning (DRL) is based on rigorous mathematical foundations and adjusts network parameters through interactions with the environment. The stability problem of keeping a vehicle on a continuous path can be addressed with the soft actor-critic (SAC) algorithm. Model predictive control (MPC), with prediction and control horizons under multivariable constraints, can follow the path precisely, but its disadvantage is a high computational cost. In this paper, a DRL control scheme combined with MPC is proposed to implement path following and obstacle avoidance of a tracked vehicle precisely and efficiently. The DRL controller performs effective obstacle avoidance while remaining consistent with the MPC so as to follow planned paths precisely. To make the training more realistic, a data-driven state-space dynamic model of the tracked vehicle is first estimated via the N4SID system identification algorithm. During DRL training, the MPC output is used as an input to the DRL reward so that the agent learns the MPC characteristics, and an additional reward function is designed specifically for obstacle avoidance. The parameters of the DRL agent are adjusted based on the environment input and the MPC output. After training, the MPC can be omitted, since it is used as part of the reward function and the DRL agent has learned to imitate the MPC while also achieving obstacle avoidance. The simulation and experimental results show that the overall controller has high stability, accuracy, and efficiency.
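
The reward structure described above, an MPC-tracking term plus a separate obstacle-avoidance term, can be illustrated with a minimal sketch. This is not the paper's reward function: the function names, the squared-error form, the linear penalty, and all constants (`scale`, `safe_distance`, `penalty`) are hypothetical choices made only for illustration.

```python
# Illustrative sketch only: every name, functional form, and constant
# here is an assumption, not the reward used in the paper.

def imitation_reward(agent_action, mpc_action, scale=1.0):
    """Penalize the squared difference between the agent's command and
    the MPC command computed for the same state (assumed form)."""
    err = sum((a - m) ** 2 for a, m in zip(agent_action, mpc_action))
    return -scale * err


def obstacle_reward(distance, safe_distance=1.0, penalty=10.0):
    """Return zero outside the assumed safety margin and a penalty that
    grows linearly as the vehicle gets closer to the obstacle."""
    if distance >= safe_distance:
        return 0.0
    return -penalty * (safe_distance - distance) / safe_distance


def total_reward(agent_action, mpc_action, distance):
    """Sum the MPC-tracking term and the obstacle-avoidance term."""
    return imitation_reward(agent_action, mpc_action) + obstacle_reward(distance)
```

In a sketch like this, the MPC is queried only while the reward is being computed during training, which is consistent with the statement that it can be omitted once the agent has been trained.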