Abstract:
To solve the multi-objective optimization problem of path length, time compliance, energy consumption, and multiple constraints in logistics UAV cargo transportation, and to address the traditional NSGA2 algorithm's lack of a closed loop of “environment change perception, dynamic constraint integration, real-time policy adjustment” in dynamic scenarios, this study proposes MODDPG-NSGA2 (Multi Objective Deep Deterministic Policy Gradient-Non-dominated Sorting Genetic Algorithm), a multi-objective optimization algorithm with a two-layer architecture. The upper-layer NSGA2 algorithm combines an improved non-dominated sorting strategy with an elite retention mechanism to construct a multi-objective optimization model and generate a global initial Pareto-optimal solution set covering path length, time compliance rate, and energy consumption. The lower-layer MODDPG algorithm senses state information such as dynamic orders and load balancing in real time, approximates the state-action value function with a deep neural network, and adjusts the path strategy as the environment changes, thereby replanning for the local dynamic environment. The close interaction between the two layers addresses the problems of local imbalance and short-sightedness. Experiments show that, compared with the traditional NSGA2 algorithm, MODDPG-NSGA2 improves the task time compliance rate by 24.9% in bursty scenarios and reduces delivery path length and energy consumption by more than 13.4%, with a more pronounced gain in multi-objective synergy.
To further verify the robustness of these results, five classical multi-objective optimization algorithms are introduced for comparison. The results show that the proposed algorithm outperforms their average by more than 23.48% in the optimization of path, time, and energy consumption, and that its cross-modal optimization ability is better. This is of great significance for improving the efficiency of logistics UAV transportation in complex urban environments, reducing cost, and enhancing adaptability to multi-objective dynamic environments. It also provides a new idea for the application of multi-objective optimization algorithms in dynamic complex systems.
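
To illustrate the non-dominated sorting idea on which the upper-layer NSGA2 relies, the following minimal Python sketch filters a set of candidate routes down to its Pareto front. The candidate values, the function names `dominates` and `pareto_front`, and the choice to express time compliance as a lateness rate (so that all three objectives are minimized) are illustrative assumptions, not the implementation or data used in this study.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (all objectives are minimized).

    a dominates b when it is no worse on every objective and strictly
    better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [
        p
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical candidates: (path length in km, lateness rate, energy in Wh).
# Lateness rate = 1 - time compliance rate, an assumption made here so that
# every objective is "smaller is better".
candidates: List[Objectives] = [
    (12.0, 0.10, 300.0),
    (10.0, 0.20, 280.0),
    (13.0, 0.15, 320.0),  # worse than the first candidate on all objectives
    (11.0, 0.05, 350.0),
]

front = pareto_front(candidates)
for route in front:
    print(route)
```

In this sketch the third candidate is removed because the first one is better on path length, lateness, and energy at once; the remaining three each win on at least one objective, so none can be discarded without a trade-off.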