Partial offloading of multi-DAG applications based on deep reinforcement learning in the Internet of Vehicles

徐建; 苏圣超; 汪义旺

doi:10.13374/j.issn2095-9389.2024.09.05.004

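Abstract

Summary: To address the partial offloading of multiple directed acyclic graph (DAG) applications in the Internet of Vehicles under an edge computing environment, this paper proposes a partial offloading algorithm based on deep reinforcement learning. First, a partial offloading model for DAG applications is constructed with the objective of maximizing a combined utility of latency and energy consumption. Next, an execution-priority algorithm converts each DAG application into a sequence, which fixes the execution priority of the application's submodules. On this basis, a sequence-to-sequence policy network based on recurrent neural networks (RNNs) is designed. Finally, the partial offloading problem for multiple DAG applications is transformed into a partial offloading problem for a single DAG application, and deep reinforcement learning is used to offload the multiple DAG applications. Experimental results show that, under the same offloading scenarios, the proposed algorithm achieves a higher combined utility than the baseline algorithms and improves the quality of service of the Internet of Vehicles.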

Abstract: Integrating mobile edge computing (MEC) with intelligent connected vehicles (ICVs) offers a framework for the resource-constrained environments prevalent in the Internet of Vehicles (IoV). As ICV technology evolves, emerging compute-intensive applications, such as autonomous driving navigation and augmented reality interfaces, are driving demand for stronger real-time processing and better resource allocation. However, the limited onboard computing resources of ICVs make it difficult to execute these latency-sensitive applications effectively. Although existing research on computation offloading has made progress, two major limitations persist: 1) insufficient consideration of intertask dependencies in complex applications and 2) overly simplistic assumptions that focus primarily on single-application scenarios and thereby overlook the heterogeneous multi-application environments in which ICVs typically operate. To address these gaps, this paper proposes a deep reinforcement learning (DRL)-based partial offloading algorithm designed for MEC-driven ICV scenarios in which tasks exhibit directed acyclic graph (DAG) dependencies across multiple applications. The proposed method employs a two-stage hierarchical modeling architecture. In the first stage, dependency-aware scheduling assigns dynamic execution priorities that convert the complex DAG topology into a linear task chain. In the second stage, a heterogeneous DAG workflow aggregation strategy transforms the multi-DAG offloading problem into a unified single-DAG optimization framework, enabling efficient resource coordination across concurrent applications.
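
The two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract names neither the priority rule nor the aggregation rule, so the sketch assumes a HEFT-style upward rank as the priority and assumes that the DAGs are merged by attaching virtual `entry` and `exit` nodes. The function names (`merge_dags`, `upward_rank`, `linearize`) and the per-task costs are hypothetical.

```python
import heapq


def merge_dags(dags):
    """Merge several DAGs (node -> list of successors, string node names)
    into one graph. Assumption: virtual 'entry' and 'exit' nodes join them."""
    merged = {"entry": [], "exit": []}
    for i, dag in enumerate(dags):
        has_pred = {s for succs in dag.values() for s in succs}
        nodes = set(dag) | has_pred
        for n in sorted(nodes):
            name = f"app{i}:{n}"  # prefix keeps names unique across applications
            succs = [f"app{i}:{s}" for s in dag.get(n, [])]
            merged[name] = succs if succs else ["exit"]  # sinks feed 'exit'
            if n not in has_pred:  # sources hang off 'entry'
                merged["entry"].append(name)
    return merged


def upward_rank(dag, cost):
    """rank(n) = cost(n) + max rank over successors (assumed priority rule)."""
    memo = {}

    def rank(n):
        if n not in memo:
            memo[n] = cost.get(n, 0.0) + max((rank(s) for s in dag[n]), default=0.0)
        return memo[n]

    for n in dag:
        rank(n)
    return memo


def linearize(dag, cost):
    """Kahn's topological sort; among ready tasks, the highest rank goes first."""
    ranks = upward_rank(dag, cost)
    indeg = {n: 0 for n in dag}
    for succs in dag.values():
        for s in succs:
            indeg[s] += 1
    ready = [(-ranks[n], n) for n in dag if indeg[n] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, n = heapq.heappop(ready)
        order.append(n)
        for s in dag[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                heapq.heappush(ready, (-ranks[s], s))
    if len(order) != len(dag):
        raise ValueError("graph contains a cycle")
    return order


# Hypothetical example: two small applications with made-up task costs.
dags = [{"a": ["b", "c"], "b": ["d"], "c": ["d"]}, {"x": ["y"]}]
cost = {"app0:a": 2, "app0:b": 3, "app0:c": 1, "app0:d": 2,
        "app1:x": 4, "app1:y": 1}
merged = merge_dags(dags)
order = linearize(merged, cost)
print(order)
```

Driving Kahn's algorithm with a priority heap, rather than sorting by rank alone, keeps the chain a valid topological order even when zero-cost virtual nodes tie with their successors.
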
To model the offloading decision process, the system is formalized as a Markov decision process (MDP) in which each state transition corresponds to a binary offloading decision (local execution versus edge server offloading) that balances latency against energy consumption. The MDP is solved with a sequence-to-sequence neural network built from hierarchical recurrent layers. This architecture captures the spatiotemporal dependencies between subtasks by encoding historical task states with a bidirectional gated recurrent unit and by using an attention mechanism in the decoder to predict the optimal offloading operation. The system is trained with the Asynchronous Advantage Actor-Critic (A3C) algorithm, whose parallel exploration enhances policy diversity and improves training efficiency. By deploying multiple agents with shared neural parameters, the A3C framework converges faster because it reduces the variance of gradient updates while still exploring the state-action space comprehensively. The experimental results validate the effectiveness of the proposed algorithm. Compared with baseline methods in single-DAG, multi-DAG, and real IoV edge environments, the proposed method achieves a significantly higher overall utility. Specifically, it improves the latency-energy tradeoff by 3.2–8.7% in real-world scenarios by leveraging the asynchronous update mechanism of parallel agents. In addition, as the density of edge servers increases, the algorithm dynamically adjusts to balance the computational load, outperforming complete offloading and random strategies by 25.1–34.7%. These results confirm that the proposed algorithm effectively coordinates asynchronous computing and dynamic communication constraints, providing a reliable way to balance latency and energy consumption in heterogeneous ICV applications.
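
To make the binary decision and the latency-energy utility concrete, the following Python sketch scores a decision vector over a linearized task chain. Everything in it is an assumption for illustration: the abstract does not give the cost model, so the sketch uses a common textbook form (local energy `kappa * f^2 * cycles`, upload time `data_bits / rate`), executes tasks strictly in sequence, and defines utility as the weighted relative saving over all-local execution. The names `Task`, `Params`, `cost`, `utility`, and `best_by_enumeration` are hypothetical, and the exhaustive search merely stands in for the learned policy on tiny inputs.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Task:
    cycles: float     # CPU cycles the task needs
    data_bits: float  # input data uploaded when the task is offloaded


@dataclass(frozen=True)
class Params:  # all values below are illustrative assumptions
    f_local: float = 1e9   # vehicle CPU frequency (Hz)
    f_edge: float = 10e9   # edge server CPU frequency (Hz)
    rate: float = 20e6     # uplink rate (bit/s)
    p_tx: float = 0.5      # transmit power (W)
    kappa: float = 1e-27   # effective switched capacitance
    weight: float = 0.5    # latency weight; energy gets 1 - weight


def cost(tasks, decisions, p):
    """Total latency (s) and vehicle-side energy (J); 1 = offload, 0 = local."""
    latency = energy = 0.0
    for task, offload in zip(tasks, decisions):
        if offload:
            t_up = task.data_bits / p.rate
            latency += t_up + task.cycles / p.f_edge
            energy += p.p_tx * t_up
        else:
            latency += task.cycles / p.f_local
            energy += p.kappa * p.f_local ** 2 * task.cycles
    return latency, energy


def utility(tasks, decisions, p):
    """Weighted relative saving versus running every task locally."""
    t0, e0 = cost(tasks, [0] * len(tasks), p)
    t, e = cost(tasks, decisions, p)
    return p.weight * (t0 - t) / t0 + (1 - p.weight) * (e0 - e) / e0


def best_by_enumeration(tasks, p):
    """Exhaustive baseline over all 2^n decision vectors (small n only)."""
    return max(product((0, 1), repeat=len(tasks)),
               key=lambda d: utility(tasks, d, p))


# Hypothetical chain: one compute-heavy task and one data-heavy task.
tasks = [Task(cycles=1e9, data_bits=1e6), Task(cycles=1e8, data_bits=40e6)]
params = Params()
best = best_by_enumeration(tasks, params)
print(best, utility(tasks, best, params))
```

Under these assumed parameters the compute-heavy task is worth offloading and the data-heavy task is not, which is the kind of per-task tradeoff a partial offloading policy has to learn.
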
In conclusion, the DRL-based partial offloading algorithm proposed in this study effectively addresses the challenges posed by task dependencies and multi-application environments in MEC-driven ICVs. It shows clear advantages in optimizing resource allocation and improving overall system performance, which makes it a promising solution for next-generation intelligent transportation systems.
