Deep reinforcement learning applications and prospects in industrial scenarios

谭靖; 杨利刚; 李潇睿; 袁兆麟; 崔允端; 姚超; 王宗杰; 班晓娟

doi:10.13374/j.issn2095-9389.2024.10.29.006

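Summary: Industrial control systems (ICS) play a key role in modern industrial production: they monitor and control industrial processes to ensure efficient, safe, and stable production. With the development of Industry 4.0 and smart manufacturing, traditional industrial control methods struggle to cope with production environments that are increasingly complex and dynamic. Deep reinforcement learning (DRL) combines the strengths of deep learning and reinforcement learning and shows great potential for intelligent industrial control. This paper reviews the current state of applications and the research progress of DRL in intelligent industrial control. It first introduces the basic principles of DRL and related algorithms, briefly describes the background of industrial control, and analyzes the application requirements and existing challenges of intelligent control. It then reviews applications of DRL in industry in detail, summarizes current research, and finally outlines future research directions.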

Abstract: Industrial production is fundamental to human society. Industrial control systems (ICS) serve as the cornerstone of modern industrial processes and are responsible for monitoring and controlling operations to ensure efficiency, safety, and stability. Central to these systems are control algorithms, which enable the automation of operations, optimization of process parameters, and reduction of operational costs. However, with the rapid advancements in Industry 4.0 and smart manufacturing, traditional control methods are increasingly inadequate to address the growing complexity, high dynamics, and real-time demands of modern industrial environments. Deep reinforcement learning (DRL), which integrates the high-dimensional feature extraction of deep learning with the adaptive decision-making capabilities of reinforcement learning, has emerged as a transformative technology in intelligent industrial control. This paper provides a comprehensive review of DRL’s principles, methodologies, and applications in industrial scenarios. The review begins with an introduction to the fundamental concepts of DRL, including the Markov decision process (MDP) framework and the Bellman equation for optimizing decision-making strategies, followed by an exploration of the latest advancements in both online and offline reinforcement learning algorithms. The paper systematically examines the background and challenges of industrial control systems, highlighting the limitations of traditional methods such as proportional–integral–derivative (PID) control and rule-based systems when faced with multi-variable, nonlinear, and dynamic processes. By analyzing the evolving demands of intelligent control, the review underscores the necessity for advanced, self-learning approaches, such as DRL, that are capable of operating effectively in environments with incomplete information, real-time constraints, and multiple conflicting objectives. 
As a key contribution, a novel classification framework for DRL applications in industrial scenarios is proposed. Current research is categorized into three domains: (1) adaptive optimization in dynamic environments, enabling systems to respond to changes such as market fluctuations, equipment degradation, and operational disturbances; (2) decision-making under multi-objective and constrained conditions, with DRL balancing competing goals such as efficiency, cost, and sustainability while adhering to technical constraints; and (3) performance enhancement in complex systems, whereby DRL tackles high-dimensional, nonlinear, and coupled processes to improve stability, scalability, and operational excellence. This framework provides new perspectives for designing control algorithms tailored to specific industrial contexts. The review also synthesizes key findings from recent DRL studies, presenting a detailed evaluation of their achievements, limitations, and opportunities for improvement. Case studies across sectors such as energy management, manufacturing, and process optimization illustrate the versatility and effectiveness of DRL in solving diverse industrial problems. However, challenges remain, including the need for high-quality training data, computational efficiency in high-dimensional spaces, and robust algorithms capable of handling uncertainties and safety-critical conditions. To address these challenges, future research directions that are essential for advancing DRL in industrial applications are outlined. These include the development of high-fidelity industrial process simulators, techniques to improve sample efficiency and generalization across varying conditions, and methods to enhance interpretability and transparency in DRL decision-making processes.
In conclusion, this paper emphasizes DRL’s transformative potential in redefining industrial control paradigms. By overcoming current limitations and fostering interdisciplinary collaboration, DRL is well-positioned to drive innovation in industrial intelligence and automation. The insights and frameworks presented in this review offer a valuable foundation for future research, accelerating the adoption of DRL technologies in real-world industrial settings and paving the way for the next generation of smart manufacturing systems.
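
For readers unfamiliar with the Bellman equation named in the abstract, its optimality form can be sketched as follows. The notation is the usual textbook one and is an assumption of this sketch, not taken from the paper: an MDP with state space $\mathcal{S}$, action space $\mathcal{A}$, transition kernel $P$, reward function $r$, and discount factor $\gamma \in [0, 1)$.

```latex
% Bellman optimality equation for the action-value function Q^*
% (standard textbook notation; all symbols are assumptions of this sketch)
\begin{align}
  Q^{*}(s, a) &= \mathbb{E}_{s' \sim P(\cdot \mid s, a)}
      \Bigl[\, r(s, a, s') + \gamma \max_{a' \in \mathcal{A}} Q^{*}(s', a') \Bigr], \\
  V^{*}(s)    &= \max_{a \in \mathcal{A}} Q^{*}(s, a), \\
  \pi^{*}(s)  &\in \arg\max_{a \in \mathcal{A}} Q^{*}(s, a).
\end{align}
```

Value-based DRL methods typically approximate $Q^{*}$ with a neural network and train it to reduce the gap between the two sides of the first equation on sampled transitions.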

Deep reinforcement learning applications and prospects in industrial scenarios