Machine learning in designing amorphous alloys

胡静怡; 徐翔; 季小妹; 徐明贤; 姜岱峰; 王卺

doi:10.13374/j.issn2095-9389.2022.11.11.002

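Summary: Over the past few decades of materials science, traditional approaches to amorphous alloy development, such as empirical trial and error and methods based on density functional theory, have helped researchers discover many amorphous alloy systems. However, their long development cycles and low efficiency mean that they can no longer meet researchers' needs. Machine learning methods, with their low experimental cost, strong performance, and short development cycles, are increasingly applied to the design, analysis, and property prediction of amorphous alloys. This paper first describes the basic operations and state of development of each step, following the main workflow of machine learning modeling. It then focuses on research in data preprocessing, model construction, and model validation. The data preprocessing section outlines data collection, feature engineering, and currently popular data pre-sampling methods; the model construction section discusses four machine learning algorithms commonly used in amorphous alloy development, namely artificial neural networks, support vector machines, random forests, and extreme gradient boosting; the model validation section mainly introduces K-fold cross-validation and leave-one-out cross-validation. Finally, the paper compares existing machine learning applications from several perspectives and suggests possible directions and ideas for future research.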

Abstract: Metallic glasses have attracted considerable interest because of their excellent mechanical, physical, and chemical properties. For example, they have a higher resistivity and a lower viscosity coefficient than crystalline metals composed of the same elements. However, designing alloy compositions remains difficult. Traditional approaches to designing amorphous alloy systems, such as empirical trial and error and methods based on density functional theory (DFT), have helped researchers explore numerous amorphous alloy systems over the last few decades. As materials science has continued to develop, however, these methods have struggled to meet researchers' needs because of their long development cycles and low efficiency. In addition, the complex, long-range disordered structure of metallic glasses makes it hard to understand their structure and nature comprehensively using conventional methods. Machine learning techniques are now widely used for amorphous alloy composition design and property analysis because of their low experimental cost, short development cycle, strong data processing capability, and high predictive performance. They offer new approaches and opportunities for addressing key bottlenecks in the field of metallic glasses. This paper first introduces the main steps of building a machine learning model and then presents related studies on data preprocessing, model construction, and model validation. For data preprocessing, data selection, feature engineering, and advanced data balancing methods are described. In the feature engineering part, model performance with various input features is examined, and it is shown that either using physical properties or directly using the alloy compositions as model input can yield high performance.
For model construction, four machine learning algorithms are discussed: artificial neural networks (ANN), support vector machines (SVM), random forest (RF), and extreme gradient boosting (XGBoost). A comparison indicates that SVM models work best with small datasets, whereas the performance of the other models tends to improve as the amount of training data increases. In general, XGBoost outperforms several other methods and is therefore often used in machine learning competitions. For model validation, K-fold cross-validation and leave-one-out cross-validation are presented. A good model for predicting metallic glass properties should perform well under both validation methods. Finally, this paper suggests several possible future research directions concerning feature engineering, dataset construction, validation, and machine learning models.
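
To make the two validation schemes named in the abstract concrete, the following is a minimal sketch of how K-fold and leave-one-out splits can be generated. It is not taken from the paper: the function names `k_fold_splits` and `leave_one_out_splits` are hypothetical, the folds are contiguous and unshuffled for simplicity (in practice the sample order is usually shuffled first), and only sample indices are produced, so any model could be trained on the resulting splits.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def k_fold_splits(n_samples: int, k: int) -> Iterator[Split]:
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    Hypothetical helper for illustration; no shuffling is performed.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop


def leave_one_out_splits(n_samples: int) -> Iterator[Split]:
    """Leave-one-out is the special case of K-fold with k == n_samples."""
    return k_fold_splits(n_samples, n_samples)


if __name__ == "__main__":
    # Ten samples in three folds: every sample is held out exactly once.
    for train, test in k_fold_splits(10, 3):
        print("train:", train, "test:", test)
```

Each sample appears in exactly one test fold, so averaging a model's score over the folds uses every data point for evaluation once. Leave-one-out trains one model per sample, which is affordable mainly for small datasets.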
