Two-stage novel method for imbalanced data distribution in network intrusion detection

doi:10.12305/j.issn.1001-506X.2025.06.34

Abstract

Although many current network traffic intrusion detection models achieve relatively high overall detection rates, they still suffer from low detection rates and poor generalization on imbalanced abnormal network traffic. A two-stage network intrusion detection method for imbalanced data is therefore proposed. In the first stage, a random forest ensemble model is trained to perform an initial binary classification of network traffic into normal and abnormal, which alleviates the impact of the imbalance between normal and abnormal traffic on model training. In the second stage, the traffic initially classified as abnormal is used to train a one-dimensional convolutional neural network-bidirectional long short-term memory (1D-CNN-BiLSTM) model that learns the key features of abnormal traffic, and the focal loss function is introduced during training. Focal loss lets the model focus on both hard-to-classify samples and minority-class samples in the abnormal traffic, further alleviating the impact of imbalance among abnormal traffic classes on detection accuracy. To verify the effectiveness of the proposed method, experiments are conducted on the UNSW-NB15 and CIC-IDS2017 datasets. The experimental results show that the proposed method extracts data features better and alleviates data imbalance to a certain extent. Compared with similar methods proposed in recent years, the proposed model has better overall performance, with the weighted F1 score increased by 0.9% and the macro F1 score increased by 2.7%.

Key words: intrusion detection, imbalanced samples, neural network, focal loss

CLC Number: TP393

Bo WEI, Caifu HU, Ruibin REN. Two-stage novel method for imbalanced data distribution in network intrusion detection[J]. Systems Engineering and Electronics, 2025, 47(6): 2065-2075.
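
As a rough illustration of the focal loss named in the abstract, the sketch below follows the general form given by Lin et al. (reference 19), FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The function names, the value gamma = 2.0, the optional alpha weights, and the example probabilities are assumptions made for illustration; the abstract does not state the settings the authors used.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for a single sample (illustrative sketch).

    probs  : predicted class probabilities, assumed to sum to 1
    target : index of the true class
    gamma  : focusing parameter (assumed value, not taken from the paper)
    alpha  : optional per-class weights (hypothetical; None means weight 1)
    """
    # Clamp p_t away from zero so log() stays finite.
    p_t = min(max(probs[target], 1e-12), 1.0)
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - p_t)^gamma shrinks the loss of samples the model already gets right.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(probs, target):
    """Plain cross-entropy, i.e. focal loss with gamma = 0."""
    return focal_loss(probs, target, gamma=0.0)


# Two hypothetical predictions for a sample whose true class is index 1.
easy = [0.05, 0.90, 0.05]  # confident and correct
hard = [0.60, 0.30, 0.10]  # true class receives low probability

for name, probs in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(probs, 1)
    fl = focal_loss(probs, 1)
    print(f"{name}: cross-entropy={ce:.4f}, focal={fl:.4f}, ratio={fl / ce:.2f}")
```

With gamma = 2, the easy sample keeps 1% of its cross-entropy loss and the hard sample keeps 49%, so hard samples dominate the gradient. Per-class alpha weights are one common way to add extra emphasis on minority classes.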

References

1	LAN J H, LIU X D, LI B, et al. A novel hierarchical attention-based triplet network with unsupervised domain adaptation for network intrusion detection[J]. Applied Intelligence, 2023, 53(10): 11705-11726. doi: 10.1007/s10489-022-04076-0
2	THAKKAR A, LOHIYA R. A survey on intrusion detection system: feature selection, model, performance measures, application perspective, challenges, and future research directions[J]. Artificial Intelligence Review, 2022, 55(1): 453-463. doi: 10.1007/s10462-021-10037-9
3	CUI J Y, ZONG L S, XIE J H, et al. A novel multi-module integrated intrusion detection system for high-dimensional imbalanced data[J]. Applied Intelligence, 2023, 53(1): 272-288. doi: 10.1007/s10489-022-03361-2
4	KHAN S H, HAYAT M, BENNAMOUN M, et al. Cost-sensitive learning of deep feature representations from imbalanced data[J]. IEEE Trans. on Neural Networks and Learning Systems, 2018, 29(8): 3573-3587. doi: 10.1109/TNNLS.2017.2732482
5	LI Y X, CHAI Y, HU Y Q, et al. Review of imbalanced data classification methods[J]. Control and Decision, 2019, 34(4): 673-688. (in Chinese)
6	BEDI P, GUPTA N, JINDAL V. I-SiamIDS: an improved SiamIDS for handling class imbalance in network-based intrusion detection systems[J]. Applied Intelligence, 2021, 51: 1133-1151. doi: 10.1007/s10489-020-01886-y
7	PAN C S, LI Z X, YANG W S, et al. Anomaly detection method of network traffic based on secondary feature extraction and BiLSTM-attention[J]. Journal of Electronics & Information Technology, 2023, 45(12): 4539-4547. doi: 10.11999/JEIT221296 (in Chinese)
8	LAN Y, TRUONG-HUU T, WU J, et al. Cascaded multi-class network intrusion detection with decision tree and self-attentive model[C]//Proc. of the IEEE International Conference on Data Mining Workshops, 2022.
9	DENNING D E. An intrusion-detection model[J]. IEEE Trans. on Software Engineering, 1987, 13(2): 222-232.
10	PORRAS P A, KEMMERER R A. Penetration state transition analysis: a rule-based intrusion detection approach[C]//Proc. of the 8th Annual Computer Security Application Conference, 1992: 220-229.
11	SHEU T F, HUANG N F, LEE H P. NIS04-6: a time-and memory-efficient string matching algorithm for intrusion detection systems[C]//Proc. of the IEEE Global Communications Conference, 2006.
12	PAN Z S, LIAN H, HU G Y, et al. An integrated model of intrusion detection based on neural network and expert system[C]//Proc. of the 17th IEEE International Conference on Tools with Artificial Intelligence, 2005.
13	LUNT T F, JAGANNATHAN R. A prototype real-time intrusion-detection expert system[C]//Proc. of the IEEE Symposium on Security & Privacy, 1988.
14	GU J, LU S. An effective intrusion detection approach using SVM with naive Bayes feature embedding[J]. Computers & Security, 2021, 103: 102158.
15	GUEZZAZ A, BENKIRANE S, AZROUR M, et al. A reliable network intrusion detection approach using decision tree with enhanced data quality[J]. Security and Communication Networks, 2021, 2021: 123059.
16	AZIZJON M, JUMABEK A, KIM W. 1D CNN based network intrusion detection with normalization on imbalanced data[C]//Proc. of the International Conference on Artificial Intelligence in Information and Communication, 2020: 218-224.
17	TIAN Q T, HAN D Z, LI K C, et al. An intrusion detection approach based on improved deep belief network[J]. Applied Intelligence, 2020, 50: 3162-3178. doi: 10.1007/s10489-020-01694-4
18	FOTIADOU K, VELIVASSAKI T H, VOULKIDIS A, et al. Network traffic anomaly detection via deep learning[J]. Information, 2021, 12(5): 215. doi: 10.3390/info12050215
19	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proc. of the IEEE International Conference on Computer Vision, 2017: 2980-2988.
20	DING D F, ZHU L, XIE J Y, et al. In-vehicle network intrusion detection system based on Bi-LSTM[C]//Proc. of the 7th International Conference on Intelligent Computing and Signal Processing, 2022: 580-583.
21	MOUSTAFA N, SLAY J. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set)[C]//Proc. of the Military Communications and Information Systems Conference, 2015.
22	SHARAFALDIN I, LASHKARI A H, GHORBANI A A. Toward generating a new intrusion detection dataset and intrusion traffic characterization[C]//Proc. of the International Conference on Information Systems Security & Privacy, 2018: 108-116.
23	AL-TURAIKI I, ALTWAIJRY N. A convolutional neural network for improved anomaly-based network intrusion detection[J]. Big Data, 2021, 9(3): 233-252. doi: 10.1089/big.2020.0263
24	HALBOUNI A, GUNAWAN T S, HABAEBI M H, et al. CNN-LSTM: hybrid deep neural network for network intrusion detection system[J]. IEEE Access, 2022, 10: 99837-99849. doi: 10.1109/ACCESS.2022.3206425
25	UDAS P B, KARIM M E, ROY K S. SPIDER: a shallow PCA based network intrusion detection system with enhanced recurrent neural networks[J]. Journal of King Saud University-Computer and Information Sciences, 2022, 34(10): 10246-10272. doi: 10.1016/j.jksuci.2022.10.019
26	REN H J, TANG Y H, DONG W Y, et al. DUEN: dynamic ensemble handling class imbalance in network intrusion detection[J]. Expert Systems with Applications, 2023, 229: 120420. doi: 10.1016/j.eswa.2023.120420

Traffic type	Traffic subtype	Count
Benign	Normal	93 000
Intrusion	Generic	58 871
	Exploits	44 525
	Fuzzers	24 246
	DoS	16 353
	Reconnaissance	13 987
	Analysis	2 677
	Backdoor	2 329
	Shellcode	1 511
	Worms	174

Traffic type	Traffic subtype	Count
Benign	Benign	2 273 097
Intrusion	DoS Hulk	231 073
	PortScan	158 930
	DDoS	128 027
	DoS GoldenEye	10 293
	FTP-Patator	7 938
	SSH-Patator	5 897
	DoS slowloris	5 796
	DoS Slowhttptest	5 499
	Bot	1 966
	Infiltration	36
	Heartbleed	11
	Web Attack Brute Force	1 507
	Web Attack Sql Injection	652
	Web Attack XSS	21

Actual class	Predicted class
Actual class	Positive	Negative
Positive	TP	FN
Negative	FP	TN
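
The confusion-matrix cells TP, FN, FP, and TN feed the metrics reported in the result tables (accuracy, precision, recall, macro F1, weighted F1). The sketch below uses the standard textbook definitions, computed one-vs-rest per class; the function names and the toy labels are assumptions for illustration, and the paper's own evaluation code is not shown here.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, and F1 from the four confusion-matrix cells."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


def macro_and_weighted_f1(y_true, y_pred):
    """Macro F1 (plain mean over classes) and weighted F1 (mean weighted by support).

    Classes are taken from y_true only, which is a simplifying assumption.
    """
    labels = sorted(set(y_true))
    n = len(y_true)
    f1_by_class, support = {}, {}
    for c in labels:
        # Treat class c as "positive" and every other class as "negative".
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        tn = n - tp - fn - fp
        f1_by_class[c] = binary_metrics(tp, fn, fp, tn)[3]
        support[c] = tp + fn
    macro = sum(f1_by_class.values()) / len(labels)
    weighted = sum(f1_by_class[c] * support[c] for c in labels) / n
    return macro, weighted


# Toy imbalanced example: 8 "DoS" samples and 2 "Worms" samples (made-up labels).
y_true = ["DoS"] * 8 + ["Worms"] * 2
y_pred = ["DoS"] * 8 + ["DoS", "Worms"]  # one Worms sample is misclassified as DoS

macro_f1, weighted_f1 = macro_and_weighted_f1(y_true, y_pred)
print(round(macro_f1, 3), round(weighted_f1, 3))  # prints 0.804 0.886
```

Macro F1 weights every class equally, so a weak minority class pulls it down more than it pulls down weighted F1. This is why macro F1 is the more sensitive indicator on imbalanced intrusion data.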

Method	Year	Accuracy	Weighted F1	Macro F1
Ref. [8]	2022	-	-	0.559
Ref. [23]	2021	0.805	0.810	-
Ref. [24]	2022	0.818	0.809	-
Ref. [25]	2022	0.729	0.737	0.525
Ref. [26]	2023	-	-	0.501
Proposed method	-	0.818	0.822	0.586

Model (method)	Accuracy	Precision	Recall	Macro F1
1D-CNN-BiLSTM	0.802	0.608	0.510	0.524
SMOTE-NC+1D-CNN-BiLSTM	0.788	0.517	0.603	0.521
RUS+1D-CNN-BiLSTM	0.785	0.597	0.532	0.528
Proposed method	0.818	0.645	0.575	0.586

Model	Method			UNSW-NB15 dataset
Model	Quantile normalization	1D-CNN-BiLSTM	Two-stage imbalance handling	Accuracy	Precision	Recall	Macro F1	Weighted F1
Proposed model	✔	✔	✔	0.818	0.646	0.575	0.586	0.822
w/o two-stage imbalance handling	✔	✔	-	0.808	0.666	0.512	0.524	0.802
w/o 1D-CNN-BiLSTM model	✔	-	✔	0.818	0.646	0.530	0.537	0.819
w/o quantile normalization	-	✔	✔	0.816	0.595	0.576	0.567	0.817

Model	UNSW-NB15 dataset
Model	Accuracy	Precision	Recall	Macro F1	Weighted F1
1D-CNN-BiLSTM	0.788	0.675	0.568	0.593	0.797
1D-CNN	0.785	0.663	0.562	0.567	0.792
BiLSTM	0.768	0.527	0.486	0.481	0.768