Model-free optimal synchronization learning for heterogeneous nonlinear multi-robot systems

王曲; 尹凯璇; 宋睿卓; 夏丽娜

doi:10.13374/j.issn2095-9389.2026.03.10.003

Model-free optimal synchronization learning for heterogeneous multi-robots

Abstract

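Summary: This paper studies optimal synchronization control, based on model-free reinforcement learning, for heterogeneous multi-robot systems with unknown dynamics. The robots in such a system usually differ in dynamic characteristics, structural parameters, and motion constraints, and the overall system exhibits strong nonlinearity, uncertainty, and coupling, so traditional synchronization control methods that depend on an accurate model often run into modeling difficulties and limited control performance in practice. To address this, a model-free optimal synchronization control method for heterogeneous multi-robot systems is proposed that does not require an accurate dynamic model. First, a multi-degree-of-freedom robot dynamic model is constructed and transformed into the standard control-affine nonlinear system form, providing a unified theoretical basis for the subsequent controller design and reinforcement learning algorithm. Second, to handle the unknown system dynamics and the difficulty of accurately obtaining state information during synchronization, a new identification network and its weight update law are designed; they learn and identify the unknown dynamics online while guiding the follower robots to gradually approach the leader's state. Further, a model-free optimal synchronization control algorithm is designed based on a critic network, which approximates the performance index function online and trades off synchronization performance against control cost without relying on accurate model information. A rigorous stability analysis proves that all signals in the closed-loop system are uniformly bounded and that the synchronization errors of the multi-robot system converge asymptotically to zero, which guarantees the stability and feasibility of the proposed algorithm. Finally, simulation results verify the effectiveness and superiority of the proposed algorithm, showing that the method achieves high-performance optimal synchronization tracking of the leader by heterogeneous multi-robot systems under unknown dynamics.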

Abstract: This paper investigates the optimal synchronization control problem for heterogeneous multi-robot systems with unknown dynamics under a model-free reinforcement learning framework. Heterogeneous multi-robot systems have attracted increasing attention because of their broad applications in intelligent manufacturing, cooperative transportation, environmental monitoring, and search-and-rescue missions. However, because different robots often have distinct physical structures, dynamic characteristics, and actuation capabilities, the resulting system is typically characterized by strong nonlinearity, parameter uncertainty, and complex coupling effects, which makes the design of high-performance synchronization controllers particularly challenging. In addition, most existing control approaches rely heavily on accurate mathematical models of the controlled plants, yet in practical engineering scenarios it is often difficult or even impossible to obtain precise dynamic models for all robots in the network. Consequently, traditional model-based control schemes may suffer from degraded performance or limited applicability. Motivated by these challenges, this paper develops a model-free optimal synchronization control approach for heterogeneous multi-robot systems that does not require exact knowledge of the system dynamics. First, the dynamic model of a multi-degree-of-freedom robotic system is established and then transformed into the standard control-affine nonlinear system form. This transformation provides a convenient theoretical basis for subsequent controller design and learning algorithm development, and the resulting representation is well suited to integrating adaptive approximation techniques and reinforcement learning methods into the synchronization control framework.
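As a rough illustration of the transformation into control-affine form, the following sketch uses the textbook rigid-body manipulator model. The symbols $M$, $C$, $G$, $\tau$, $q$, and the state $x$ are assumptions made for illustration and do not appear in the abstract; the paper's own model may include further terms such as friction or disturbances.

```latex
% Assumed n-DOF rigid-body dynamics (textbook form, not taken from the paper)
M(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + G(q) = \tau

% State x = [q^T, \dot{q}^T]^T and input u = \tau give the control-affine form
\dot{x} = f(x) + g(x)\,u, \qquad
f(x) = \begin{bmatrix} \dot{q} \\ -M(q)^{-1}\bigl(C(q,\dot{q})\,\dot{q} + G(q)\bigr) \end{bmatrix}, \qquad
g(x) = \begin{bmatrix} 0_{n\times n} \\ M(q)^{-1} \end{bmatrix}
```

Because the inertia matrix $M(q)$ of a rigid-body robot is symmetric positive definite, its inverse exists, so $f$ and $g$ are well defined; in the model-free setting both are treated as unknown.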
With the control-affine representation, the heterogeneous multi-robot system can be described in a unified manner, which facilitates both theoretical analysis and algorithm implementation. Next, to address the difficulties caused by unknown dynamics and the lack of accurate state information during the synchronization process, a novel identification network together with its weight update law is proposed. The identifier approximates the unknown nonlinear dynamics online and captures the essential behavior of each robot without relying on prior model knowledge. Meanwhile, by incorporating the leader-following synchronization objective into the network design, the identifier not only improves the estimation accuracy of the system dynamics but also drives each follower robot to gradually approach the leader’s motion trajectory. In this way, the scheme realizes online learning and identification of the unknown dynamics, laying the foundation for optimal control design when the dynamics are unknown. Furthermore, a critic-network-based model-free optimal synchronization control algorithm is developed for the heterogeneous multi-robot system. Unlike conventional optimal control methods, which require the Hamilton–Jacobi–Bellman equation to be solved using an exact system model, the proposed approach employs reinforcement learning to approximate the performance index function online and to derive the corresponding optimal control policy in a data-driven manner. As a result, the algorithm balances synchronization performance against control energy consumption while avoiding dependence on precise model information. This feature broadens the applicability of the method to practical robotic systems with uncertain or partially unknown dynamics.
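The critic idea described above can be sketched on a toy problem. The example below is a minimal illustration, not the paper's algorithm: it assumes a scalar error system whose constants `a` and `b` stand in for the dynamics that an identifier would supply, a one-weight critic `V_hat(e) = w*e**2`, and a normalized gradient-descent update on the Hamilton–Jacobi–Bellman residual, which is one common choice in adaptive dynamic programming. All names and numerical values are hypothetical.

```python
import math

# Hypothetical scalar error dynamics e_dot = a*e + b*u (stand-in for an identified model)
a, b = -1.0, 1.0
# Assumed quadratic cost weights: running cost = q*e**2 + r*u**2
q, r = 1.0, 1.0


def policy(w, e):
    """Control that minimizes the Hamiltonian for the critic V_hat(e) = w*e**2."""
    return -(b * w / r) * e


def hjb_residual(w, e):
    """Hamiltonian evaluated with the critic gradient dV_hat/de = 2*w*e."""
    u = policy(w, e)
    return q * e**2 + r * u**2 + 2.0 * w * e * (a * e + b * u)


def train_critic(w0=0.0, alpha=0.5, sweeps=300):
    """Normalized gradient descent on the squared HJB residual over a state grid."""
    w = w0
    grid = [i / 10.0 for i in range(-10, 11)]  # sampled errors in [-1, 1]
    for _ in range(sweeps):
        for e in grid:
            u = policy(w, e)
            sigma = 2.0 * e * (a * e + b * u)  # regressor: d(residual)/dw with u held fixed
            delta = hjb_residual(w, e)
            w -= alpha * sigma * delta / (1.0 + sigma**2) ** 2
    return w


def riccati_weight():
    """Positive root of q + 2*a*p - (b**2/r)*p**2 = 0, the exact optimal weight."""
    return (a + math.sqrt(a**2 + b**2 * q / r)) * r / b**2


if __name__ == "__main__":
    # Learned critic weight next to the closed-form value for comparison
    print(train_critic(), riccati_weight())
```

For this linear-quadratic toy case the learned weight can be compared with the closed-form Riccati solution returned by `riccati_weight()`, which gives a simple sanity check on the update rule. A multi-robot version would replace the scalar error with each follower's synchronization error and the single weight with a weight vector over a chosen basis.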
To guarantee the reliability of the proposed approach, a rigorous stability analysis of the closed-loop system is carried out. It is proven that all signals in the closed-loop system remain uniformly bounded throughout the learning and control process. More importantly, the synchronization errors between the follower robots and the leader converge asymptotically to zero, indicating that all follower robots ultimately achieve coordinated motion with the leader. These theoretical results establish the stability and feasibility of the proposed control strategy. Finally, simulation examples are provided to validate the effectiveness of the proposed method. The simulation results show that, even with unknown dynamics and heterogeneity among the robots, the developed algorithm realizes optimal synchronization tracking of the leader, with strong learning capability, satisfactory synchronization accuracy, and desirable control performance. These results indicate that the presented model-free reinforcement learning method offers a promising solution for the optimal synchronization control of heterogeneous multi-robot systems.
