Decision tree-based adaptive compression algorithm for multi-visual tasks

Systems Engineering and Electronics, 2025, Vol. 47, Issue (8): 2737-2743. doi: 10.12305/j.issn.1001-506X.2025.08.31

Xiaohui LI¹,², Wen YANG¹,*, Siting LYU², Liang MAO³

1. Guangzhou Institute of Technology, Xidian University, Guangzhou 510555, China
2. School of Telecommunications Engineering, Xidian University, Xi'an 710071, China
3. Guangzhou Tozed Kangwei Intelligent Technology Co., Ltd., Guangzhou 511458, China

Received: 2024-06-17 Online: 2025-08-31 Published: 2025-09-04
Contact: Wen YANG E-mail: 22011210699@stu.xidian.edu.cn
About the authors: Xiaohui LI (1972-), female, professor, Ph.D.; main research interests: broadband wireless communication, semantic communication
Siting LYU (1998-), female, Ph.D. candidate; main research interests: semantic communication, channel coding
Liang MAO (1983-), male, associate research fellow, Ph.D.; main research interests: artificial intelligence, large models
Funding:
Supported by the National Natural Science Foundation of China (NSFC 62376204)

Abstract:

To address the high transmission cost and the heavy computational load at the decoder in multi-visual tasks, an adaptive scalable video coding (ASVC) transmission framework is proposed. The framework divides video into a semantic layer and a background layer, which carry semantic and background information respectively and are transmitted separately. In addition, an adaptive compression algorithm is proposed: a C4.5 decision tree model analyzes the network environment and decides how the video is compressed, and optical flow analysis of the frame sequence retains the frames with significant changes, while an interpolation mechanism keeps the image sequence smooth. Simulation results show that, across different bitrate environments, the ASVC method achieves higher recognition accuracy, better video quality, and higher transmission efficiency.

Key words: adaptive compression algorithm, C4.5 decision tree, optical flow detection, multi-visual task
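
As a rough illustration of the C4.5 step mentioned in the abstract, the sketch below computes the gain ratio, the splitting criterion that C4.5 uses, on a toy set of network observations. The feature names (`bandwidth`, `loss`), their values, and the compression labels (`strong`, `light`) are hypothetical and are not taken from the paper, whose actual features and tree are not given in this abstract.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """C4.5 gain ratio of a categorical feature: information gain / split information."""
    total = len(rows)
    # Group the class labels by the value the feature takes in each row.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalises features with many distinct values.
    split_info = entropy([row[feature] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical training samples: observed network state -> compression decision.
rows = [
    {"bandwidth": "low", "loss": "high"},
    {"bandwidth": "low", "loss": "low"},
    {"bandwidth": "high", "loss": "high"},
    {"bandwidth": "high", "loss": "low"},
]
labels = ["strong", "strong", "light", "light"]

ratios = {f: gain_ratio(rows, labels, f) for f in ("bandwidth", "loss")}
best = max(ratios, key=ratios.get)  # feature C4.5 would split on first
print(ratios, best)
```

On this toy data `bandwidth` separates the two decisions perfectly, so it would be chosen as the root split; a real model would be trained on measured network statistics.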

CLC number: TN 929.5

Xiaohui LI, Wen YANG, Siting LYU, Liang MAO. Decision tree-based adaptive compression algorithm for multi-visual tasks[J]. Systems Engineering and Electronics, 2025, 47(8): 2737-2743.

Figures/Tables: 10 (Figures 1-8, Tables 1-2)


Algorithm	Bitrate/Mbps	Packet loss rate	mAP
H.264	1.2	0.58	0.22
H.264	0.96	0.48	0.24
H.264	0.72	0.31	0.35
ASVC	1.2	0	0.66
ASVC	0.96	0	0.61
ASVC	0.72	0	0.56
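
The recognition results in the table above can be compared directly. The short sketch below, whose variable and function names are illustrative only, computes the absolute mAP difference between ASVC and H.264 at each bitrate from the tabulated values.

```python
# mAP values copied from the table above, keyed by bitrate in Mbps.
h264_map = {1.2: 0.22, 0.96: 0.24, 0.72: 0.35}
asvc_map = {1.2: 0.66, 0.96: 0.61, 0.72: 0.56}


def map_gain(baseline, proposed):
    """Absolute mAP difference (proposed - baseline) at each bitrate."""
    return {rate: round(proposed[rate] - baseline[rate], 2) for rate in baseline}


gains = map_gain(h264_map, asvc_map)
for rate, gain in gains.items():
    print(f"{rate} Mbps: +{gain:.2f} mAP")
```

The gap is largest at 1.2 Mbps, where the H.264 stream has its highest packet loss rate in the table (0.58).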

Algorithm	Bandwidth/Mbps	Packet loss rate	VMAF	SSIM	PSNR
H.264	1	0.15	0.42	0.81	0.45
H.264	0.75	0.36	0.34	0.78	0.42
H.264	0.5	0.57	0.22	0.75	0.38
ASVC	1	0	0.7	0.93	0.56
ASVC	0.75	0	0.71	0.93	0.57
ASVC	0.5	0	0.68	0.91	0.55
