Detection of botnet anomalous communication based on GNN-enhanced communication features

王云浩; 胡堰; 皇甫伟; 霍佳皓

doi:10.13374/j.issn2095-9389.2024.12.20.001

Abstract

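Abstract (Chinese-language version): Legacy industrial devices in the Industrial Internet contain many security vulnerabilities and are easily compromised by botnets once they are networked. A botnet maliciously controls a large number of connected devices to launch large-scale coordinated attacks on a target network. Traditional rule- or threshold-based detection methods rely heavily on static signatures or manually set thresholds and adapt poorly to dynamically changing network environments. Traditional machine learning techniques have limited ability to process the high-dimensional communication features of complex networks, which restricts their detection capability. Deep learning-based detection techniques usually treat network traffic as time-series or spatial data and cannot model the topological dependencies between devices, which makes coordinated botnet attacks hard to identify. To address these limitations, this paper uses a graph structure to model the topology of complex communication networks accurately and proposes a botnet anomalous communication detection technique based on graph neural network (GNN)-enhanced communication features. First, fine-grained node features and communication features are mined from network traffic data. Then, the information propagation and aggregation mechanism of the GNN produces accurate aggregated node feature representations. Next, the aggregated node features are used to enhance the communication features, enabling accurate detection of anomalous communication. Finally, comprehensive experiments on the large public dataset CTU-13 verify the effectiveness of the proposed method. The experimental results show that, compared with existing anomaly detection algorithms such as convolutional neural networks, long short-term memory networks, and their fusion model, as well as the recently proposed Bot-DM botnet detection method, the proposed scheme detects botnet anomalous communication more accurately.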

Abstract: The Industrial Internet is an important part of national critical information infrastructure. By enabling comprehensive interconnectivity among humans, machines, and Internet of Things devices, it supports a new architecture for industrial production, manufacturing, and services. However, industrial devices, especially legacy ones, contain many security vulnerabilities that can be maliciously exploited once the devices are interconnected, causing severe security incidents or economic losses. Among the major security threats facing the Industrial Internet today, botnet attacks are particularly concerning. By exploiting zero-day vulnerabilities (e.g., buffer overflows in programmable logic controller firmware) and propagating polymorphic malware, attackers can covertly hijack large numbers of networked devices and recruit them into botnets that launch coordinated large-scale attacks on target networks. Traditional botnet detection methods (e.g., rule-, threshold-, and machine learning-based methods) have significant limitations. Rule- and threshold-based techniques depend heavily on static signatures (e.g., lists of known malicious Internet Protocol addresses) or predefined detection thresholds, so they adapt poorly to dynamic, complex network environments, which constrains their detection capability. Traditional machine learning-based techniques struggle to process complex, high-dimensional network communication features effectively, resulting in poor detection performance. Deep learning-based techniques generally treat network traffic as isolated time-series or spatial data and therefore fail to model the topological dependencies between devices in complex communication networks, a key limitation when identifying coordinated botnet behaviors (e.g., synchronized command-and-control communications).
To address these challenges, we leverage the pervasive device-to-device connectivity in the Industrial Internet by modeling the communication network as a graph in which nodes represent devices and edges represent the communication relationships between them, which yields an accurate representation of the network topology. Based on this graph model, we propose a novel approach for detecting botnet anomalous communication based on graph neural network (GNN)-enhanced communication features. First, our method extracts fine-grained node and communication features from network traffic data and employs a GNN to propagate and aggregate node information across the entire network. By capturing topological dependencies, the method generates more accurate aggregated node feature representations. In this step, a multihead attention mechanism is integrated with the GNN so that node features are aggregated with several independent sets of attention weights, which makes the node feature representation more flexible. The aggregated node features are then used to enhance the communication features. Finally, a multilayer perceptron classifies the enhanced communication features as normal or abnormal, thus achieving automatic detection of botnet anomalous communication. To validate the effectiveness of the proposed approach, we conducted a series of experiments on CTU-13, a large public dataset that includes 13 distinct botnet attack scenarios. We compared the proposed approach with a group of baseline methods, including a convolutional neural network (CNN), long short-term memory (LSTM), CNN-LSTM, and the recently proposed Bot-DM method, using accuracy, recall, precision, and F1-score as metrics. The experimental results demonstrate that our approach outperforms these baseline methods in detection performance.
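
The pipeline summarized above (graph construction, attention-weighted node aggregation, enhancement of communication features, and classification) can be illustrated with a minimal sketch. It is not the authors' implementation: the host names, the degree-based node features, the three-value flow features, the GAT-style attention scoring, and all function names are assumptions made for illustration, and the weights are random rather than trained, so the scores carry no meaning beyond showing the data flow.

```python
import math
import random

def build_graph(flows):
    """flows: list of (src, dst, edge_features). Returns nodes and neighbor sets."""
    nodes = sorted({h for s, d, _ in flows for h in (s, d)})
    neighbors = {n: {n} for n in nodes}  # self-loop: each node also attends to itself
    for s, d, _ in flows:
        neighbors[s].add(d)
        neighbors[d].add(s)
    return nodes, neighbors

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_aggregate(node_feats, neighbors, heads):
    """One multi-head attention layer; each head is (W, a), outputs are concatenated."""
    out = {n: [] for n in node_feats}
    for W, a in heads:
        out_dim = len(W)
        h = {n: [dot(row, x) for row in W] for n, x in node_feats.items()}
        for n in node_feats:
            nbrs = sorted(neighbors[n])
            # Assumed GAT-style score: LeakyReLU(a . [h_n || h_m])
            scores = []
            for m in nbrs:
                s = dot(a, h[n] + h[m])
                scores.append(s if s > 0 else 0.2 * s)
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]  # stable softmax
            total = sum(weights)
            out[n].extend(
                sum(w / total * h[m][k] for w, m in zip(weights, nbrs))
                for k in range(out_dim)
            )
    return out

def enhance_edges(flows, node_emb):
    """Enhanced feature = [source embedding || flow features || destination embedding]."""
    return [node_emb[s] + list(e) + node_emb[d] for s, d, e in flows]

def mlp_score(x, W1, b1, w2, b2):
    """Two-layer perceptron with ReLU hidden units and a sigmoid output."""
    hidden = [max(0.0, dot(row, x) + b) for row, b in zip(W1, b1)]
    return 1.0 / (1.0 + math.exp(-(dot(w2, hidden) + b2)))

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical flows: (source, destination, [three illustrative flow features])
flows = [
    ("plc-1", "hmi-1", [0.3, 0.1, 1.0]),
    ("plc-2", "hmi-1", [0.2, 0.1, 1.0]),
    ("bot-1", "c2", [0.9, 0.8, 0.0]),
    ("bot-2", "c2", [0.9, 0.7, 0.0]),
    ("bot-1", "plc-1", [0.8, 0.9, 0.0]),
]
nodes, neighbors = build_graph(flows)

# Assumed node features: [out-degree, in-degree]
node_feats = {
    n: [float(sum(s == n for s, _, _ in flows)),
        float(sum(d == n for _, d, _ in flows))]
    for n in nodes
}

# Two heads, each projecting 2 -> 4 dimensions, so embeddings have 8 dimensions
heads = [(rand_matrix(4, 2), [rng.uniform(-0.5, 0.5) for _ in range(8)])
         for _ in range(2)]
node_emb = attention_aggregate(node_feats, neighbors, heads)

enhanced = enhance_edges(flows, node_emb)  # 8 + 3 + 8 = 19 values per flow
W1, b1 = rand_matrix(6, 19), [0.0] * 6
w2, b2 = [rng.uniform(-0.5, 0.5) for _ in range(6)], 0.0
scores = [mlp_score(x, W1, b1, w2, b2) for x in enhanced]
```

In this sketch the classification is done per edge rather than per node, which mirrors the abstract's focus on detecting anomalous communications: each flow is scored from its own features plus the aggregated context of both endpoints. In practice the projection, attention, and perceptron weights would be learned jointly from labeled traffic.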
