Reinforcement learning-based detection method for malware behavior in industrial control systems

高洋; 王礼伟; 任望; 谢丰; 莫晓锋; 罗熊; 王卫苹; 杨玺

doi:10.13374/j.issn2095-9389.2019.09.16.005

Reinforcement learning-based detection method for malware behavior in industrial control systems

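Abstract (translated from the Chinese): Malware in networked environments seriously threatens the security of industrial control systems, and the growing number of malware variants poses major challenges for malware detection and security protection in these systems. Existing detection methods are limited, among other things, by the low level of intelligence of their adaptive detection and recognition. To address this problem for malware that threatens the network security of industrial control systems, this paper designs a detection framework that draws on reinforcement learning, an advanced machine learning approach. In the implementation, based on the practical requirements of malware behavior detection and making full use of the sequential decision-making and dynamic feedback learning characteristics of reinforcement learning, key modules including the feature extraction network, the policy network, and the classification network are discussed and designed in detail. Application experiments on a dataset of real malware test samples verify the effectiveness of the proposed method, which can serve as an intelligent decision aid for general malware behavior detection.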

Abstract: With the popularity of intelligent mobile devices, malware on the Internet seriously threatens the security of industrial control systems, and the growing number of malware attacks has become a major concern in the information security community. As malware variants multiply across a wide range of application fields, detecting malware and protecting industrial control systems pose significant technical challenges. Many traditional solutions detect malware effectively, but as malware grows more complex, current approaches remain limited in their ability to detect and recognize it intelligently. Given the success of machine learning methods in data analysis applications, advanced algorithms can also be applied to the detection and analysis of complex malware. Building on these advantages, we developed a detection framework, based on reinforcement learning, for malware that threatens the network security of industrial control systems. According to the actual needs of malware behavior detection, the key modules, namely the feature extraction, policy, and classification networks, were designed around the sequential decision-making and dynamic feedback learning characteristics of reinforcement learning. The training algorithms for these modules are presented along with a detailed functional analysis and the implementation framework. In the application experiments, the method was tested on a preprocessed dataset of real malware and achieved satisfactory classification performance, which verifies the efficiency and effectiveness of the reinforcement learning-based method. This method can provide an intelligent decision aid for general malware behavior detection.
