Millimeter-wave radar human pose estimation based on CNN-BiLSTM-MHA spatio-temporal fusion framework

Systems Engineering and Electronics, 2026, Vol. 48, Issue 5: 1502-1514. doi: 10.12305/j.issn.1001-506X.2026.05.06

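Yuquan LUO¹, Yuqiang HE¹, Yaxin LI²,*, Song LIANG², Jun WANG¹

1. School of Electronic Information Engineering, Beihang University, Beijing 100191, China
2. Hangzhou Innovation Institute of Beihang University, Hangzhou, Zhejiang 310051, China

Received: 2025-02-24 Accepted: 2025-06-23 Online: 2026-05-27 Published: 2026-05-27
Corresponding author: Yaxin LI E-mail: luoyuquanhz@163.com; buaahyq@buaa.edu.cn; lyx_hnu@126.com
About the authors:
Yuquan LUO (b. 1994), male, Ph.D. candidate. Research interests: radar signal processing, deep learning, and human sensing.
Yuqiang HE (b. 1995), male, Ph.D. candidate. Research interests: radar signal processing and target tracking.
Song LIANG (b. 1997), male, assistant engineer, M.S. Research interests: radar signal processing and human sensing.
Jun WANG (b. 1972), male, professor, Ph.D. Research interests: signal processing, and target recognition and tracking.
Funding:
Supported by the "Pioneer" and "Leading Goose" R&D Program of Zhejiang (2023C01148) and the Hangzhou Leading Innovation and Entrepreneurship Team (TD2022006).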

Abstract:

Human pose estimation has broad application prospects in human-computer interaction, activity recognition, and health monitoring. Traditional methods based on optical sensors are limited by lighting conditions and carry a risk of privacy leakage, while wearable-device-based techniques are cumbersome to use and uncomfortable during long-term wear. To address these problems, a millimeter-wave radar human pose estimation method is proposed, built on a spatio-temporal fusion framework that combines a convolutional neural network (CNN), a bidirectional long short-term memory (BiLSTM) network, and a multi-head attention (MHA) mechanism. High-quality point cloud data are generated with self-developed millimeter-wave radar equipment, and a sliding window mechanism is introduced to expand single-frame point clouds into multi-frame time-series data. The CNN extracts spatial features, the BiLSTM performs temporal modeling, and the MHA mechanism further refines the global feature representation. By fusing the spatio-temporal information in multi-frame point cloud data, the framework mitigates the sparsity of radar point clouds and significantly improves the accuracy and robustness of pose estimation. Experimental results show that the proposed method localizes 25 skeletal joints with average errors of 2.69 cm, 2.49 cm, and 2.98 cm along the x, y, and z axes, respectively. The method provides a solution for millimeter-wave radar human pose estimation and shows strong potential for practical application.

Key words: millimeter-wave radar, human pose estimation, convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM) network, multi-head attention (MHA)
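
The sliding window step mentioned in the abstract can be illustrated with a short sketch. The window length, the stride, and the per-point layout `(x, y, z, v)` are illustrative assumptions; the abstract does not state the values or the data layout used in the paper.

```python
from typing import List, Sequence, Tuple

# One radar point as (x, y, z, v). This field layout is an assumption
# made for illustration, not the paper's stated format.
Point = Tuple[float, float, float, float]
Frame = List[Point]


def sliding_windows(frames: Sequence[Frame], window: int = 4,
                    stride: int = 1) -> List[List[Frame]]:
    """Group consecutive point-cloud frames into overlapping sequences.

    Each returned item holds `window` consecutive frames, so a model that
    consumes one item sees several radar frames instead of a single one.
    """
    if window < 1 or stride < 1:
        raise ValueError("window and stride must be positive")
    return [
        list(frames[start:start + window])
        for start in range(0, len(frames) - window + 1, stride)
    ]


if __name__ == "__main__":
    # Six toy frames, each with one point whose x coordinate is the frame index.
    toy = [[(float(i), 0.0, 0.0, 0.0)] for i in range(6)]
    windows = sliding_windows(toy, window=4, stride=1)
    print(len(windows))                          # 3
    print([frame[0][0] for frame in windows[0]])  # [0.0, 1.0, 2.0, 3.0]
```

With a stride of 1, adjacent windows share all but one frame, which is one simple way to give every output pose several frames of context despite sparse single-frame point clouds.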

CLC number: TN 957

Yuquan LUO, Yuqiang HE, Yaxin LI, Song LIANG, Jun WANG. Millimeter-wave radar human pose estimation based on CNN-BiLSTM-MHA spatio-temporal fusion framework[J]. Systems Engineering and Electronics, 2026, 48(5): 1502-1514.

Figures and tables (23): Figures 1-16 and Tables 1-7

References (32)

1	SONG Y X, ZHAO Y, YU H, et al. mmWave radar-based WPT/VMD noncontact repetitive motion counter[J]. IEEE Sensors Journal, 2023, 23(19): 23145-23157. doi: 10.1109/JSEN.2023.3307390
2	FUKUSHIMA T, BLAUBERGER P, RUSSOMANNO T G, et al. The potential of human pose estimation for motion capture in sports: a validation study[J]. Sports Engineering, 2024, 27(1): 19. doi: 10.1007/s12283-024-00460-w
3	ZHOU Y J, HUANG H, YUAN S, et al. MetaFi++: WiFi-enabled transformer-based human pose estimation for metaverse avatar simulation[J]. IEEE Internet of Things Journal, 2023, 10(16): 14128-14136. doi: 10.1109/JIOT.2023.3262940
4	ZHANG C J, WANG G B, CHEN Q, et al. Gesture recognition based on millimeter-wave radar with pure self-attention mechanism[J]. Systems Engineering and Electronics, 2024, 46(3): 859-867. (in Chinese)
5	YANG D G, XU D F. Human behavior recognition method of IR-UWB through wall radar based on time-frequency domain feature fusion[J]. Systems Engineering and Electronics, 2024, 46(3): 849-858. (in Chinese)
6	CHEN J, CHEN D Y, JIANG H X, et al. Skeleton-based 3D human pose estimation with low-resolution infrared array sensor using attention-based CNN-BiGRU[J]. International Journal of Machine Learning and Cybernetics, 2024, 15(5): 2049-2062. doi: 10.1007/s13042-023-02015-0
7	NANDAGOPAL S, KARTHY G, OLIVER A S, et al. Optimal deep convolutional neural network with pose estimation for human activity recognition[J]. Computer Systems Science and Engineering, 2023, 44(2): 1719-1733. doi: 10.32604/csse.2023.028003
8	BOUDLAL H, SERRHINI M, TAHIRI A, et al. A novel approach for simultaneous human activity recognition and pose estimation via skeleton-based leveraging WiFi CSI with YOLOv8 and mediapipe frameworks[J]. Signal, Image and Video Processing, 2024, 18(4): 3673-3689. doi: 10.1007/s11760-024-03031-5
9	CHOI J Y, HA E, SON M, et al. Human joint angle estimation using deep learning-based three-dimensional human pose estimation for application in a real environment[J]. Sensors, 2024, 24(12): 3823. doi: 10.3390/s24123823
10	WU X, ZHANG H Y, KONG C X, et al. LiDAR-based 3-D human pose estimation and action recognition for medical scenes[J]. IEEE Sensors Journal, 2024, 24(9): 15531-15539. doi: 10.1109/JSEN.2024.3373192
11	LAN S C, YE L, ZHANG K. Applying mmWave radar sensors to vocabulary-level dynamic Chinese sign language recognition for the community with deafness and hearing loss[J]. IEEE Sensors Journal, 2023, 23(22): 27273-27283. doi: 10.1109/JSEN.2023.3324369
12	ZHU Y A, XIAO M R, XIE Y J, et al. In-bed human pose estimation using multi-source information fusion for health monitoring in real-world scenarios[J]. Information Fusion, 2024, 105: 102209. doi: 10.1016/j.inffus.2023.102209
13	SHIAO Y, CHEN G Y, HOANG T. Three-dimensional human posture recognition by extremity angle estimation with minimal IMU sensor[J]. Sensors, 2024, 24(13): 4306. doi: 10.3390/s24134306
14	YU S, ZHAI D H, XIA Y, et al. Synthetic depth image-based category-level object pose estimation with effective pose decoupling and shape optimization[J]. IEEE Trans. on Instrumentation and Measurement, 2024, 73: 5026718.
15	LI D, MU Q, YUAN Y L, et al. 6D pose estimation based on 3D edge binocular reprojection optimization for robotic assembly[J]. IEEE Robotics and Automation Letters, 2023, 8(12): 8319-8326. doi: 10.1109/LRA.2023.3327933
16	LI X, ZHANG D, LI M, et al. Accurate head pose estimation using image rectification and a lightweight convolutional neural network[J]. IEEE Trans. on Multimedia, 2023, 25: 2239-2251. doi: 10.1109/TMM.2022.3144893
17	LIU S N, ZHAO L, YANG X, et al. Remote drowsiness detection based on the mmWave FMCW radar[J]. IEEE Sensors Journal, 2022, 22(15): 15222-15234. doi: 10.1109/JSEN.2022.3186486
18	SKARIA S, HENDY N, AL-HOURANI A, et al. Machine-learning methods for material identification using mmWave radar sensor[J]. IEEE Sensors Journal, 2023, 23(2): 1471-1478. doi: 10.1109/JSEN.2022.3227207
19	CAI J Y, CHU P, ZHUANG L T, et al. Millimeter-wave radar body interference recognition based on spatial attribute features[J]. Systems Engineering and Electronics, 2024, 46(10): 3365-3374. (in Chinese)
20	YU C X, XU Z, YAN K, et al. Noninvasive human activity recognition using millimeter-wave radar[J]. IEEE Systems Journal, 2022, 16(2): 3036-3047. doi: 10.1109/JSYST.2022.3140546
21	HUANG Y C, LI W, DOU Z, et al. Activity recognition based on millimeter-wave radar by fusing point cloud and range-Doppler information[J]. Signals, 2022, 3(2): 266-283. doi: 10.3390/signals3020017
22	MOON J Y, KIM B K, KANG J, et al. Fixed point cloud normalization and none-sequential modeling for hand gesture recognition based on short-range mmWave radar sensor’s sparse time-series point cloud[J]. IEEE Sensors Journal, 2024, 24(7): 10656-10668. doi: 10.1109/JSEN.2024.3362473
23	WANG Z M, JIANG D, SUN B, et al. A data augmentation method for human activity recognition based on mmWave radar point cloud[J]. IEEE Sensors Letters, 2023, 7(5): 6002404.
24	SENGUPTA A, JIN F, ZHANG R Y, et al. mm-pose: real-time human skeletal posture estimation using mmWave radars and CNNs[J]. IEEE Sensors Journal, 2020, 20(17): 10032-10044. doi: 10.1109/JSEN.2020.2991741
25	LI G Z, ZHANG Z, YANG H M, et al. Capturing human pose using mmWave radar[C]//Proc. of the IEEE International Conference on Pervasive Computing and Communications Workshops, 2020.
26	AN S Z, OGRAS U Y. MARS: mmWave-based assistive rehabilitation system for smart healthcare[J]. ACM Transactions on Embedded Computing Systems, 2021, 20(5s): 72.
27	AN S Z, OGRAS U Y. Fast and scalable human pose estimation using mmWave point cloud[C]//Proc. of the 59th ACM/IEEE Design Automation Conference, 2022: 889-894.
28	SENGUPTA A, CAO S Y. MmPose-NLP: a natural language processing approach to precise skeletal pose estimation using mmWave radars[J]. IEEE Trans. on Neural Networks and Learning Systems, 2023, 34(11): 8418-8429. doi: 10.1109/TNNLS.2022.3151101
29	HU S T, CAO S Y, TOOSIZADEH N, et al. MmPose-FK: a forward kinematics approach to dynamic skeletal pose estimation using mmWave radars[J]. IEEE Sensors Journal, 2024, 24(5): 6469-6481. doi: 10.1109/JSEN.2023.3348199
30	SHU Y, FU D N, CHEN Z Y, et al. Super-resolution DOA estimation method for a moving target equipped with a millimeter-wave radar based on RD-ANM[J]. Journal of Radars, 2023, 12(5): 986-999. (in Chinese)
31	LEE K M, LEE I S, SHIN H S, et al. Reconstruction of range-Doppler map corrupted by FMCW radar asynchronization[J]. Sensors, 2023, 23(12): 5605. doi: 10.3390/s23125605
32	ZHAO L F, LIN Z X, SUN R Y, et al. A review of state-of-the-art methodologies and applications in action recognition[J]. Electronics, 2024, 13(23): 4733. doi: 10.3390/electronics13234733

Dataset	Model	x		y		z		Mean
		MAE (cm)	RMSE (cm)	MAE (cm)	RMSE (cm)	MAE (cm)	RMSE (cm)	MAE (cm)	RMSE (cm)
Ref. [26]	MARS	9.25	12.26	5.15	6.92	9.63	12.91	8.01	10.70
Ref. [26]	mmPose	8.55	12.24	5.20	6.94	8.69	12.55	7.48	10.58
Ref. [26]	FUSE	6.70	8.87	3.74	5.02	6.95	9.33	5.79	7.74
Ref. [26]	mmPose-NLP	4.00	6.86	2.94	4.38	4.11	7.82	3.68	6.25
Ref. [26]	CNN-BiLSTM-MHA	3.20	5.85	2.49	3.81	3.40	6.42	3.13	5.45

Dataset	Model	x		y		z		Mean
		MAE (cm)	RMSE (cm)	MAE (cm)	RMSE (cm)	MAE (cm)	RMSE (cm)	MAE (cm)	RMSE (cm)
Self-built dataset	MARS	5.90	7.99	3.69	4.89	5.98	8.47	5.19	7.12
Self-built dataset	mmPose	5.30	7.92	3.52	4.83	5.21	8.40	4.68	7.05
Self-built dataset	FUSE	4.00	5.43	2.99	4.14	4.06	5.88	3.69	5.15
Self-built dataset	mmPose-NLP	4.13	6.17	2.94	4.08	4.06	6.54	3.71	5.59
Self-built dataset	CNN-BiLSTM-MHA	2.61	4.34	2.37	3.59	2.90	4.72	2.63	4.22
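
The MAE and RMSE columns in the two comparison tables above are per-axis error metrics. A minimal sketch of how such values can be computed is given below, assuming predictions and ground truth are lists of `(x, y, z)` joint positions in cm; the function name and the exact averaging (over joints only, here) are assumptions, since the page does not spell out the paper's evaluation protocol.

```python
import math
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) position of one joint, in cm


def per_axis_errors(pred: Sequence[Joint],
                    truth: Sequence[Joint]) -> Tuple[List[float], List[float]]:
    """Return ([MAE_x, MAE_y, MAE_z], [RMSE_x, RMSE_y, RMSE_z])."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and of equal length")
    n = len(pred)
    mae, rmse = [], []
    for axis in range(3):
        # Signed error of every joint along the current axis.
        diffs = [p[axis] - t[axis] for p, t in zip(pred, truth)]
        mae.append(sum(abs(d) for d in diffs) / n)
        rmse.append(math.sqrt(sum(d * d for d in diffs) / n))
    return mae, rmse


if __name__ == "__main__":
    predicted = [(1.0, 2.0, 3.0), (3.0, 2.0, 1.0)]
    measured = [(0.0, 2.0, 5.0), (0.0, 2.0, 1.0)]
    mae, rmse = per_axis_errors(predicted, measured)
    print(mae)           # [2.0, 0.0, 1.0]
    print(sum(mae) / 3)  # 1.0
```

The mean column is consistent with a plain average over the three axes: for the MARS row on the self-built dataset, (5.90 + 3.69 + 5.98) / 3 ≈ 5.19.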

Point cloud features	x	y	z	Mean
xyz	3.12	2.94	3.33	3.13
xyz+v	2.61	2.37	2.90	2.63
xyz+I	4.08	3.23	4.34	3.88
xyz+v+I	4.20	3.24	4.20	3.88

CNN structure	x	y	z	Mean
BN× MP×	2.94	2.86	3.10	2.97
BN√ MP×	2.61	2.37	2.90	2.63
BN× MP√	3.28	2.88	3.21	3.12
BN√ MP√	4.64	4.82	4.44	4.63

Module	x	y	z	Mean
BiLSTM×	5.83	4.78	6.20	5.60
LSTM	3.09	2.74	3.30	3.04
MLP	3.90	3.28	4.06	3.75
Transformer	3.28	2.77	3.46	3.17
BiLSTM	2.61	2.37	2.90	2.63
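
For readers comparing the LSTM and BiLSTM rows, the usual bidirectional formulation is sketched below. This is the textbook definition, not necessarily the paper's exact configuration; the symbols ($x_t$ for the feature vector of frame $t$, $h_t$ for the hidden state, $T$ for the number of frames in a window) are notation assumed here.

```latex
\begin{aligned}
\overrightarrow{h}_t &= \mathrm{LSTM}_{\mathrm{fwd}}\bigl(x_t,\ \overrightarrow{h}_{t-1}\bigr), \\
\overleftarrow{h}_t &= \mathrm{LSTM}_{\mathrm{bwd}}\bigl(x_t,\ \overleftarrow{h}_{t+1}\bigr), \\
h_t &= \bigl[\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\bigr], \qquad t = 1, \dots, T .
\end{aligned}
```

Each $h_t$ therefore depends on both earlier and later frames in the window, whereas a unidirectional LSTM state depends only on earlier frames.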

Model configuration	x	y	z	Mean
Without MHA	2.69	2.49	2.98	2.72
With MHA	2.61	2.37	2.90	2.63
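
For reference, the standard multi-head attention formulation is reproduced below. It is the generic definition from the attention literature; the number of heads $h$, the key dimension $d_k$, and the projection matrices $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$, $W^{O}$ are generic symbols, and the values used in the paper are not given on this page.

```latex
\begin{aligned}
\mathrm{Attention}(Q, K, V) &= \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V, \\
\mathrm{head}_i &= \mathrm{Attention}\bigl(Q W_i^{Q},\ K W_i^{K},\ V W_i^{V}\bigr), \qquad i = 1, \dots, h, \\
\mathrm{MHA}(Q, K, V) &= \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\, W^{O} .
\end{aligned}
```

In a self-attention setting, $Q$, $K$, and $V$ are all derived from the same sequence of per-frame features, so every frame can weight every other frame in the window.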

Experimental scenario	x	y	z	Mean
With occlusion	3.14	2.89	3.66	3.23
Without occlusion	2.88	2.58	3.28	2.91
