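Abstract: |
An intelligent frequency hopping (FH) communication mode that combines broadband FH with deep reinforcement learning (DRL) can effectively improve anti-jamming capability. Jointly adjusting the signal frequency and power yields a dual action space in which frequency is discrete but power is not, which makes the underlying DRL algorithm difficult to design. To address this problem, an intelligent anti-jamming decision method is proposed, based on the Wolpertinger Deep Deterministic Policy Gradient (W-DDPG) algorithm, a variant of DDPG for discrete actions. The method suits broadband FH communication and operates over a dual action space of transmit frequency and power. It uses the Wolpertinger architecture to handle frequency actions in the frequency space, combines them with power actions to form joint actions, and then trains with the DDPG algorithm, so that the algorithm applies to broadband FH anti-jamming scenarios with a dual action space and can make decisions quickly in complex electromagnetic environments. Simulation results show that, under broadband FH dual-action-space jamming modes, the method improves convergence speed and anti-jamming performance by about 25% over traditional anti-jamming algorithms. |
Keywords: anti-jamming communication; deep reinforcement learning; dual action space; intelligent decision-making |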
DOI:10.20079/j.issn.1001-893x.230619002 |
|
Funding: National Natural Science Foundation of China (61771256) |
|
An Intelligent Decision Algorithm for Broadband Frequency Hopping Anti-jamming with Dual Action Networks |
XIA Chongyang, WU Xiaofu, JIN Yue |
(School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210009, China) |
Abstract: |
Intelligent frequency hopping (FH) communication, which combines broadband FH with deep reinforcement learning (DRL), can effectively improve anti-jamming capability. However, the dual action space of broadband intelligent FH, spanning frequency and power, slows DRL decision-making and degrades anti-jamming performance. To address this problem, a fast intelligent anti-jamming decision method based on the Wolpertinger Deep Deterministic Policy Gradient (W-DDPG) algorithm is proposed for broadband FH communication, with a dual action space composed of transmission frequency and power. The method uses the Wolpertinger architecture to process frequency actions and combines them with power actions to form joint actions, and then trains with the DDPG algorithm. As a result, the algorithm applies to broadband FH anti-jamming scenarios with a dual action space and can make decisions quickly in complex electromagnetic environments. Simulation results show that, under broadband FH jamming modes with a dual action space, the method improves convergence speed and anti-jamming performance by about 25% compared with traditional anti-jamming algorithms. |
Key words: anti-jamming communication; deep reinforcement learning; dual action space; intelligent decision-making |
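
The joint frequency/power action selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual Wolpertinger pattern: the actor emits a continuous proto-action, the k nearest discrete frequency channels are taken as candidates, each candidate is paired with the continuous power value, and the critic picks the best joint action. The names `wolpertinger_select` and `toy_critic`, the normalized channel grid, and the jammer position are all hypothetical.

```python
from typing import Callable, List, Tuple


def wolpertinger_select(
    proto_freq: float,
    proto_power: float,
    channels: List[float],
    critic: Callable[[float, float], float],
    k: int = 3,
) -> Tuple[int, float]:
    """Map a continuous proto-action to a joint (channel index, power) action."""
    # Step 1: k nearest discrete channels to the actor's continuous frequency output.
    nearest = sorted(
        range(len(channels)), key=lambda i: abs(channels[i] - proto_freq)
    )[:k]
    # Step 2: pair each candidate channel with the continuous power value and
    # keep the joint action that the critic scores highest.
    best = max(nearest, key=lambda i: critic(channels[i], proto_power))
    return best, proto_power


def toy_critic(freq: float, power: float) -> float:
    """Hypothetical Q-value: penalize a jammed band near 0.5 and high power."""
    jammed = abs(freq - 0.5) < 0.05
    return (-1.0 if jammed else 1.0) - 0.1 * power


# Hypothetical normalized channel grid: 0.0, 0.1, ..., 1.0
channels = [i / 10 for i in range(11)]

# With k=1 the proto-action is simply rounded to the nearest (jammed) channel.
naive = wolpertinger_select(0.52, 0.3, channels, toy_critic, k=1)
# With k=3 the critic can steer the choice to a neighbouring clean channel.
refined = wolpertinger_select(0.52, 0.3, channels, toy_critic, k=3)
print(naive, refined)
```

In a full DDPG training loop the critic would be a learned network rather than a fixed function; the sketch only shows how a discrete frequency choice and a continuous power value can be combined into one joint action.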