Application of expert-augmented machine learning modeling in high-strength and high-conductivity copper alloy development

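  • Summary: Materials data are typically small in sample size, noisy, high-dimensional, and governed by complex relationships, while the field is rich in expert knowledge. Using expert knowledge to enhance machine learning modeling is therefore both necessary and feasible. In this work, the rank correlation coefficient between independent and dependent variables is calculated to quantify the strength of the monotonic relationships between composition and condition factors and properties. During model training, the rank correlation coefficient is added to the neural network loss function so that the agreement between the model output and expert knowledge is evaluated in real time, yielding an expert-knowledge-augmented machine learning model. Analysis of the training process shows that the plausibility of the model output improves markedly, and the agreement between the model's input-output behavior and expert knowledge exceeds 0.98 (1.0 indicates complete agreement). Based on the resulting models, a genetic algorithm was used for multiobjective optimization of strength and electrical conductivity; compositions of high-strength, high-conductivity copper alloys satisfying Pareto optimality were identified and validated experimentally. The experiments show that at a strength as high as 637 MPa the conductivity remains at 77.5% IACS (International Annealed Copper Standard), and at a conductivity as high as 80.2% IACS the strength remains at about 600 MPa. The errors between the predicted and measured values of strength and conductivity are within 5%.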
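The rank-correlation term described above can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss, so everything here is an assumption made for illustration: the names `knowledge_score` and `augmented_loss`, the one-factor-at-a-time sweep over a grid, the `rules` mapping of input index to expected direction, and the additive penalty `weight * (1 - score)`.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no monotonic information
    return cov / (vx * vy) ** 0.5


def knowledge_score(model, base_input, rules, grid):
    """Mean agreement between model trends and expert rules, in [-1, 1].

    rules maps an input index to +1 (output should rise) or -1 (should fall).
    This sweep-based scoring is an assumed scheme, not the paper's.
    """
    scores = []
    for index, direction in rules.items():
        outputs = []
        for value in grid:
            probe = list(base_input)
            probe[index] = value  # vary one factor, hold the others fixed
            outputs.append(model(probe))
        scores.append(direction * spearman(grid, outputs))
    return mean(scores)


def augmented_loss(model, samples, targets, base_input, rules, grid, weight=1.0):
    """Mean squared error plus an assumed penalty for violating expert rules."""
    mse = mean((model(x) - t) ** 2 for x, t in zip(samples, targets))
    return mse + weight * (1.0 - knowledge_score(model, base_input, rules, grid))
```

A score of 1.0 means every swept factor moves the output in the direction the expert rule expects, which matches the scale used in the text (1.0 for complete agreement). Because rank correlation is not differentiable, a term like this suits a gradient-free optimizer.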

     

    Abstract: Materials data are frequently characterized by small sample sizes, high noise levels, high dimensionality, complex relationships, and abundant expert knowledge. Incorporating expert knowledge is therefore both necessary and feasible for improving machine learning models. In this study, we assembled a dataset of 410 data points covering composition, condition, and property data, in which the symbols denoting the condition of the copper alloy were recoded using one-hot encoding. Because neural networks have a strong capacity for nonlinear fitting, we used them for modeling. The network structures of the strength and conductivity models were optimized to 21–55–70–1 and 21–50–65–1, respectively. After the network structures were optimized, expert knowledge was integrated into the neural network loss function. The strength of the monotonic relationship between each composition or condition factor and a property was described quantitatively by the rank correlation coefficient between the independent and dependent variables. During model training, this coefficient was incorporated into the loss function to assess, in real time, the agreement between the model output and expert knowledge. For instance, the rule that strength increases with hardening level was expressed quantitatively as a Spearman score, and these Spearman scores were added to the loss function. The expert-knowledge-augmented model was trained by using a genetic algorithm to optimize the network weights. After each update of the network weights, orthogonal data were generated to evaluate the consistency between the model output and expert knowledge. The Spearman correlation coefficients between the model input–output data and expert knowledge exceeded 0.98, and the R² scores of the strength and conductivity models on the test set were >0.90.
Based on the strength and conductivity models, multiobjective optimization over composition and condition was conducted using a genetic algorithm, and Pareto-optimal solutions were obtained after 100 generations of iteration. Three compositions were selected from the Pareto-optimal solutions and validated experimentally. The results showed that the tensile strength reached 637 MPa while the conductivity remained at 77.5% IACS (International Annealed Copper Standard), and that at a conductivity of 80.2% IACS the tensile strength was 600 MPa. The relative errors between the experimental and predicted values were <5%. Microstructure images of the three experimental samples showed that coarse second phases were present in the as-cast structure; these phases were dissolved and redistributed after solution treatment, cold deformation, and aging. The precipitated particles distributed along the grain boundary had low strength and conductivity. Our analysis revealed that Mg and Ti were detrimental to strength, whereas Fe and Sn effectively increased it. Additionally, Fe had a smaller effect on conductivity than Sn. These results demonstrate that the three optimized compositions can satisfy the performance requirements of interconnected frameworks in ultralarge-scale integrated circuits.
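To make the notion of Pareto optimality used above concrete, the following minimal sketch filters a candidate list down to its non-dominated members when both strength and conductivity are maximized. It shows only the dominance test, not the genetic algorithm itself. The function names are assumptions, and the third candidate is a hypothetical point added purely to show a dominated case; the first two are the values reported in the abstract.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one.

    Both objectives (strength, conductivity) are treated as maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


candidates = [
    (637.0, 77.5),  # reported in the abstract: tensile strength (MPa), % IACS
    (600.0, 80.2),  # reported in the abstract
    (590.0, 76.0),  # hypothetical point, included only as a dominated example
]
front = pareto_front(candidates)
```

The two reported results survive the filter because each beats the other in one objective, which is the trade-off between strength and conductivity that the optimization explores.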

     
