Abstract:
This article addresses the batching problem in multi-machine collaborative operations and proposes a method based on an improved self-organizing iterative clustering algorithm (ISODATA). The approach avoids the manual parameter setting of the traditional ISODATA algorithm, which is often inconvenient and non-intuitive. Given a small number of intuitive hyperparameters, the proposed method lets multiple machines autonomously adjust the parameters involved in the clustering process, with the goal of iterating toward reasonable batching results. First, the article selects feature vectors for the multi-machine collaborative confrontation situation. It applies standardization and principal component analysis to the high-dimensional multi-machine situation information to determine a new vector space, which mainly comprises three-dimensional position information and speed information. Next, the paper introduces the concept of neighborhood density discrimination from density-based clustering to improve the merging and splitting operations of the traditional ISODATA method. This reduces the number of manually set parameters involved in these operations and increases the autonomy of the algorithm in batch clustering tasks. Before optimization, the manually set parameters are primarily the expected number of clusters, the minimum number of points within a class, the number of iterations, the upper limit on the within-class standard deviation, and the minimum allowable distance between classes. After optimization, they are limited to the expected number of clusters, the minimum number of points, and the number of iterations within a single classification. These remaining parameters are relatively intuitive, and the algorithm output does not depend strongly on them.
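The standardization and principal component analysis step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `standardize_and_project`, the six-feature synthetic situation matrix, and the choice of four retained components are assumptions made only for illustration.

```python
import numpy as np

def standardize_and_project(X, n_components):
    """Z-score each feature, then project onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0.0] = 1.0  # guard constant features against division by zero
    Z = (X - mean) / std
    # Z is centered, so the rows of Vt from its SVD are the principal directions.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)  # explained-variance ratio per component
    return Z @ Vt[:n_components].T, explained[:n_components]

rng = np.random.default_rng(0)
# Hypothetical situation matrix: 50 machines x 6 features, where the last two
# features are near-copies of the first two (redundant information).
base = rng.normal(size=(50, 4))
X = np.hstack([base, 2.0 * base[:, :2] + 0.01 * rng.normal(size=(50, 2))])

scores, ratio = standardize_and_project(X, n_components=4)
```

Here `scores` holds the coordinates of each machine in the reduced space and `ratio` reports how much of the standardized variance each retained component carries, which is one common way to decide how many components to keep.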
Finally, the paper selects the Dunn index, the Davies–Bouldin index, the silhouette coefficient, and the Calinski–Harabasz index as evaluation indicators. It uses them to evaluate the proposed algorithms ISODATA+ and K-MEANS+, along with the original ISODATA algorithm, on several synthetic data sets (completely random data, Gaussian-generated data, and sine-type data) and on real-world scenarios. The experimental results suggest that although K-MEANS+ shows significant advantages owing to its multiple manually set hyperparameters, it requires repeated tuning whenever the parameters are adjusted, which increases the complexity of the task. Statistical results show that the improved algorithm performs comparably to the original ISODATA algorithm, which indicates that ISODATA+ maintains good clustering capability even after some manually set parameters are removed. The batching results from actual scenario tests further illustrate the effectiveness of the improved self-organizing iterative clustering algorithm in specific application scenarios and its practicality for real-world use.
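Two of the evaluation indicators named above can be sketched directly from their standard definitions. The code below is an illustrative assumption, not the evaluation code used in the paper: it takes the Dunn index as the minimum point-to-point distance between clusters divided by the maximum cluster diameter (one of several variants), uses the centroid-based Davies–Bouldin index, and assumes at least two clusters. The function names and the synthetic two-blob data are hypothetical.

```python
import numpy as np

def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

def dunn_index(X, labels):
    """Minimum inter-cluster separation over maximum cluster diameter (higher is better)."""
    clusters = [X[labels == k] for k in np.unique(labels)]
    max_diam = max(pairwise_dist(c, c).max() for c in clusters)
    min_sep = min(pairwise_dist(a, b).min()
                  for i, a in enumerate(clusters) for b in clusters[i + 1:])
    return min_sep / max_diam

def davies_bouldin(X, labels):
    """Mean over clusters of the worst scatter-to-separation ratio (lower is better)."""
    ids = np.unique(labels)
    cents = np.array([X[labels == k].mean(axis=0) for k in ids])
    scatter = np.array([np.linalg.norm(X[labels == k] - c, axis=1).mean()
                        for k, c in zip(ids, cents)])
    M = pairwise_dist(cents, cents)
    np.fill_diagonal(M, np.inf)  # exclude each cluster's comparison with itself
    ratios = (scatter[:, None] + scatter[None, :]) / M
    return ratios.max(axis=1).mean()

rng = np.random.default_rng(1)
# Hypothetical data: two well-separated Gaussian blobs of 30 points each.
X = np.vstack([rng.normal(0.0, 0.5, size=(30, 2)),
               rng.normal(10.0, 0.5, size=(30, 2))])
good = np.repeat([0, 1], 30)   # labels that match the blobs
bad = np.tile([0, 1], 30)      # labels that alternate and ignore the blobs

dunn_good, dunn_bad = dunn_index(X, good), dunn_index(X, bad)
db_good, db_bad = davies_bouldin(X, good), davies_bouldin(X, bad)
```

On data like this, a partition that follows the blobs should score a higher Dunn index and a lower Davies–Bouldin index than one that ignores them, which is the sense in which such indicators rank competing clustering results.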