(Peer-Reviewed) Learning-based joint UAV trajectory and power allocation optimization for secure IoT networks
Dan Deng 邓单 ¹, Xingwang Li 李兴旺 ², Varun Menon ³, Md Jalil Piran ⁴, Hui Chen 陈慧 ², Mian Ahmad Jan ⁵
¹ School of Information Engineering, Guangzhou Panyu Polytechnic, Guangzhou, 410630, China
中国 广州 广州番禺职业技术学院信息工程学院
² School of Physics and Electronic Information Engineering, Henan Polytechnic University, Jiaozuo, 454150, China
中国 焦作 河南理工大学 信息科学与工程学院
³ Department of Computer Science and Engineering, SCMS School of Engineering and Technology, India
⁴ Department of Computer Science and Engineering, Sejong University, South Korea
⁵ Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan
Non-Orthogonal Multiple Access (NOMA) can be deployed in Unmanned Aerial Vehicle (UAV) networks to improve spectrum efficiency. Because NOMA-UAV transmissions are broadcast by nature, they are exposed to eavesdropping, so securing the wireless system is essential. This paper maximizes the secrecy sum rate subject to a constraint on the achievable rate of the legitimate channels. To tackle this non-convex optimization problem, a reinforcement learning-based alternating optimization algorithm is proposed.
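As a reading aid, the objective described above can be written in a generic form. This is only a sketch under assumed notation, not the paper's exact formulation: K legitimate users, power vector p, UAV trajectory q, legitimate rate R_k, eavesdropper rate R_{e,k}, rate threshold R_min and power budget P_max are all symbols introduced here for illustration, and the power budget constraint itself is an assumption.

```latex
\max_{\mathbf{p},\,\mathbf{q}} \;
  \sum_{k=1}^{K} \Big[ R_k(\mathbf{p},\mathbf{q}) - R_{e,k}(\mathbf{p},\mathbf{q}) \Big]^{+}
\quad \text{s.t.} \quad
  R_k(\mathbf{p},\mathbf{q}) \ge R_{\min}, \;\; k = 1,\dots,K,
\qquad
  \sum_{k=1}^{K} p_k \le P_{\max}, \;\; p_k \ge 0,
```

where [x]^+ = max(x, 0). The difference of two logarithmic rate terms is what typically makes a problem of this kind non-convex in both p and q.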
First, for a given UAV trajectory, the optimal power allocation scheme is obtained with convex optimization tools by applying successive convex approximation. Then, through extensive exploration of the wireless environment, the Q-learning network approaches the optimal location transition strategy for the UAV, even without wireless channel state information.
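The trajectory-learning step can be illustrated with a minimal tabular Q-learning sketch. Everything specific in it is an assumption made for illustration rather than the paper's setup: the 5 x 5 grid, the single legitimate user and single eavesdropper at fixed positions, the distance-based rate model, the fixed transmit power (the power allocation step is omitted), and all hyperparameters. The agent never reads the channel model directly; it only observes the secrecy rate returned as a reward, which mirrors learning without channel state information.

```python
import math
import random

GRID = 5                                   # assumed GRID x GRID lattice of UAV cells
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # north, south, east, west, hover
USER = (4, 4)                              # hypothetical legitimate user position
EAVE = (0, 0)                              # hypothetical eavesdropper position
HEIGHT = 1.0                               # assumed UAV altitude, in grid units
SNR0 = 100.0                               # assumed reference SNR at unit distance


def rate(uav, node):
    """Achievable rate (bit/s/Hz) under an assumed distance-squared path loss."""
    d2 = (uav[0] - node[0]) ** 2 + (uav[1] - node[1]) ** 2 + HEIGHT ** 2
    return math.log2(1.0 + SNR0 / d2)


def secrecy_rate(uav):
    """Legitimate rate minus eavesdropper rate, floored at zero."""
    return max(rate(uav, USER) - rate(uav, EAVE), 0.0)


def step(state, action):
    """Move the UAV (clipped to the grid) and return (next_state, reward)."""
    x = min(max(state[0] + action[0], 0), GRID - 1)
    y = min(max(state[1] + action[1], 0), GRID - 1)
    nxt = (x, y)
    return nxt, secrecy_rate(nxt)


def train(episodes=500, horizon=20, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over UAV location transitions."""
    rng = random.Random(seed)
    q = {(x, y): [0.0] * len(ACTIONS) for x in range(GRID) for y in range(GRID)}
    for _ in range(episodes):
        state = (rng.randrange(GRID), rng.randrange(GRID))
        for _ in range(horizon):
            if rng.random() < eps:         # explore
                a = rng.randrange(len(ACTIONS))
            else:                          # exploit current estimate
                a = max(range(len(ACTIONS)), key=q[state].__getitem__)
            nxt, reward = step(state, ACTIONS[a])
            # Standard Q-learning temporal-difference update.
            q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q


def greedy_path(q, start, steps=10):
    """Follow the learned greedy policy from a start cell."""
    path, state = [start], start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=q[state].__getitem__)
        state, _ = step(state, ACTIONS[a])
        path.append(state)
    return path


if __name__ == "__main__":
    table = train()
    print(greedy_path(table, (0, 0)))
```

In an alternating scheme of the kind the abstract describes, the reward in `step` would be computed after re-solving the power allocation for the new UAV position; here it is a fixed function of position to keep the sketch short.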