Development of a novel rapid and high-precision active learning algorithm: A case study of the prediction of the mechanical properties of MAX phase crystals
Abstract
In recent years, MAX phase crystals have become a prominent area of global research due to their unique nanolayered crystal structure, which provides advantages such as self-lubrication, high toughness, and electrical conductivity. M2AX phase crystals combine properties of ceramics and metals, such as thermal shock resistance, high toughness, and electrical and thermal conductivity. However, research on these materials is challenging because single-phase samples are difficult to prepare. Active learning is a machine learning method that achieves high prediction performance with only a small number of labeled samples. After analyzing the sampling strategies of active learning and the efficient global optimization (EGO) algorithm, this paper proposes an improved active learning selection strategy, RS-EGO, that combines EGO with residual active learning regression according to their respective characteristics. The proposed strategy is applied to predict and optimize the bulk modulus, Young’s modulus, and shear modulus on a dataset of 169 M2AX phase crystals, using computational simulations to explore the material properties and thereby reduce the need for ineffective validation experiments. The results show that RS-EGO offers good predictive ability and rapidly finds the optimal value; its overall performance not only exceeds that of the two original selection strategies but is also better suited to material property prediction problems with limited sample data. The choice of parameter combination influences the optimization direction of the improved algorithm.
RS-EGO was further applied to two publicly available datasets (with sample sizes of 103 and 1836); both analyses achieved smaller root mean square errors, smaller opportunity costs, and larger coefficient of determination (R²) values, demonstrating the effectiveness of the algorithm on both small and large datasets. A broader range of parameter combinations than in the previous experiments was then explored to characterize how different parameters steer the model toward different optimization objectives. The results show that larger parameter values make the algorithm behave more like efficient global optimization, with a stronger ability to find the optimal value, whereas smaller values bring it closer to residual active learning regression, with better prediction accuracy. The balance between the two capabilities can therefore be adjusted by choosing the parameter combination appropriately.
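The abstract does not specify the exact acquisition function of RS-EGO, but the described trade-off can be illustrated with a hypothetical sketch: a weighted blend of EGO's expected improvement (optimum seeking) and a residual-based score (prediction accuracy), where the weight `w` plays the role of the tunable parameter. All names and the specific weighting below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np
from scipy.stats import norm

def rs_ego_score(mu, sigma, residual, y_best, w=0.5):
    """Hypothetical RS-EGO-style score: blend expected improvement
    with a residual-based active learning score (the paper's exact
    formulation is not given in the abstract; this is an assumption)."""
    # Expected improvement (maximization form) from a surrogate's
    # posterior mean mu and standard deviation sigma.
    z = np.where(sigma > 0, (mu - y_best) / sigma, 0.0)
    ei = np.where(sigma > 0,
                  (mu - y_best) * norm.cdf(z) + sigma * norm.pdf(z),
                  0.0)
    # Normalize both terms so the weight w trades them off directly.
    ei_n = ei / (ei.max() + 1e-12)
    res_n = np.abs(residual) / (np.abs(residual).max() + 1e-12)
    # w -> 1 behaves like EGO (optimum seeking);
    # w -> 0 behaves like residual active learning (accuracy).
    return w * ei_n + (1.0 - w) * res_n

# Toy candidates: posterior means, uncertainties, and model residuals.
mu = np.array([0.2, 0.8, 0.5])
sigma = np.array([0.1, 0.05, 0.3])
residual = np.array([0.02, 0.01, 0.2])
# Query the unlabeled candidate with the highest combined score.
best = int(np.argmax(rs_ego_score(mu, sigma, residual, y_best=0.6)))
```

With `w=1.0` the selection reduces to pure expected improvement, while `w=0.0` picks the candidate with the largest residual, mirroring the parameter-controlled trade-off the abstract describes.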