Special issue: "Public Safety + Artificial Intelligence"

A QGDAM-Based Prediction Method for Hydrogen Embrittlement Behavior of Low-Alloy Steel

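  • Summary: Hydrogen embrittlement is a major factor limiting the long-term service of low-alloy steels in critical engineering applications. The diffusion, trapping, and accumulation of hydrogen atoms inside the material cause microstructural embrittlement and crack initiation, markedly reducing ductility and fracture toughness. Hydrogen embrittlement behavior results from the coupling of multiple factors, including microstructure and environmental parameters, and is highly nonlinear and complex. The relevant experiments, however, are costly, slow, and of limited repeatability, so the available datasets are small; data scarcity and uneven feature distributions are widespread, and existing machine learning models struggle to predict accurately from small samples. To address the prediction of hydrogen embrittlement behavior in low-alloy steels, this paper develops a prediction method based on quantile-Gaussian data augmentation and multi-model learning (Quantile-Gaussian Data Augmentation with Multi-model learning, QGDAM) to improve the stability and accuracy of predictions from limited samples. In the data augmentation stage, the method introduces a quantile transformation to alleviate skewed feature distributions and builds several statistically consistent augmentation strategies on Gaussian mixture models. In the prediction stage, a multi-model regression framework combines random forest, gradient boosting, light gradient boosting, and K-nearest neighbors. A small-sample low-alloy steel hydrogen embrittlement dataset was built from 90 experimental samples and 17 key features. The results show that QGDAM generates higher-quality samples at every augmentation ratio and significantly improves prediction performance. In addition, partial dependence plots and Shapley additive explanations were used to visualize the model's decision mechanism, further supporting the reliability and interpretability of its predictions.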
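The augmentation stage described above (a quantile transformation followed by sampling from a Gaussian mixture model) might be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The function name `quantile_gmm_augment`, the number of mixture components, the covariance regularization, and the toy lognormal data are all assumptions made for the example, and only one plausible strategy is shown, not the several strategies the paper designs.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import QuantileTransformer


def quantile_gmm_augment(data, ratio=1.0, n_components=2, seed=0):
    """Draw synthetic rows from a GMM fitted in quantile-Gaussian space.

    `data` holds features (and, if desired, the target as a last column).
    `ratio` is the number of synthetic rows per original row.
    All defaults are illustrative assumptions, not values from the paper.
    """
    data = np.asarray(data, dtype=float)
    n_new = int(round(ratio * len(data)))
    if n_new < 1:
        return np.empty((0, data.shape[1]))

    # Map each column to an approximately standard-normal marginal,
    # which reduces skew before the mixture model is fitted.
    qt = QuantileTransformer(
        n_quantiles=min(len(data), 1000),
        output_distribution="normal",
        random_state=seed,
    )
    z = qt.fit_transform(data)

    # Fit the mixture in the transformed space and sample new points.
    gmm = GaussianMixture(
        n_components=n_components,
        covariance_type="full",
        reg_covar=1e-3,  # small ridge for stability on tiny datasets
        random_state=seed,
    )
    gmm.fit(z)
    z_new, _ = gmm.sample(n_new)

    # Map the samples back to the original scale of each column.
    return qt.inverse_transform(z_new)


# Toy data only (NOT the paper's dataset): 90 skewed rows, 5 columns.
rng = np.random.default_rng(0)
toy = rng.lognormal(mean=0.0, sigma=1.0, size=(90, 5))
synthetic = quantile_gmm_augment(toy, ratio=0.5)
print(synthetic.shape)
```

Because the inverse quantile transform interpolates between the observed quantiles, the synthetic values stay within the range of the original data in each column. That is convenient for physical quantities, but it also means this sketch cannot extrapolate beyond the measured conditions.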

     

    Abstract: Hydrogen embrittlement (HE) threatens the structural integrity and service reliability of low-alloy steels and is one of the key factors limiting their long-term service in critical engineering applications. The diffusion, trapping, and accumulation of hydrogen atoms within the material embrittle the microstructure and initiate cracks, significantly reducing ductility and fracture toughness. HE behavior is governed by the coupled effects of microstructure and environmental parameters such as temperature, pressure, and hydrogen concentration, and is highly nonlinear and complex. However, HE experiments are costly, time-consuming, and of limited reproducibility, so the available datasets are small. Data scarcity and uneven feature distributions are common, making it difficult for existing machine learning models to predict accurately from limited samples. In recent years, data augmentation has increasingly been introduced into materials science to mitigate data scarcity. These methods expand a dataset by statistically perturbing the original samples and synthesizing new ones while preserving the data distribution, and they have shown promising results in alloy design, fatigue life prediction, and corrosion modeling. To address HE in low-alloy steels, this study proposes Quantile-Gaussian Data Augmentation with Multi-model Learning (QGDAM), a method for predicting HE behavior from small samples that aims at robust learning and high-precision prediction under limited data. The method comprises two modules: data augmentation and regression prediction. In the data augmentation stage, a quantile transformation module mitigates skewed feature distributions, and three augmentation strategies based on Gaussian mixture models (GMMs) generate augmented samples that closely match the distribution of the original data. In the prediction stage, a multi-model ensemble regression framework combines Random Forest (RF), Gradient Boosting (GB), Light Gradient Boosting Machine (LightGBM), and K-Nearest Neighbors (KNN). The study extracted 90 valid samples from publicly available HE experimental data to construct a low-alloy steel HE behavior database with 17 key features covering material strength parameters, environmental conditions, and elemental composition. The results show that QGDAM significantly outperforms traditional machine learning approaches in both augmented-sample quality and prediction accuracy. Compared with existing augmentation methods, it yields higher sample quality at all four augmentation ratios tested. Compared with the baseline model, QGDAM significantly reduces the mean squared error (MSE) and raises the coefficient of determination (R²) by 0.18 on average. It also significantly improves prediction accuracy on two external validation sets, indicating strong generalization and robustness. Furthermore, the feature-response relationships of models trained before and after augmentation are compared using Partial Dependence Plots (PDP) and Shapley Additive Explanations (SHAP). The model trained on augmented data captures more accurately how changes in the feature variables affect HE sensitivity, whereas the dependence curves obtained from the original dataset are scattered and less regular, indicating that QGDAM significantly improves the model's ability to fit real physical mechanisms at the feature-learning level. Overall, the proposed QGDAM method effectively improves the accuracy and interpretability of HE behavior prediction under small-sample conditions, providing a generalizable data-driven approach for intelligent modeling of the service performance of complex materials.

     
