On Designing Multi-UAV aided Wireless Powered Dynamic Communication via Hierarchical Deep Reinforcement Learning
Abstract: This paper proposes a novel design of a wireless powered communication network (WPCN) in dynamic environments with the assistance of multiple unmanned aerial vehicles (UAVs). Unlike existing studies, where the low-power wireless nodes (WNs) often conform to the coherent harvest-then-transmit protocol, under our newly proposed double-threshold based WN type updating rule, each WN can dynamically and repeatedly update its type over time, acting either as an E-node for non-linear energy harvesting over time slots or as an I-node for transmitting data over sub-slots. To maximize the total transmitted data size of all the WNs over T slots, each UAV individually determines its trajectory and binary wireless energy transmission (WET) decisions over time slots and its binary wireless data collection (WDC) decisions over sub-slots, subject to each UAV's limited on-board energy and each WN's node type updating rule. However, due to the tight coupling between the UAVs' trajectories and their WET and WDC decisions, as well as each WN's time-varying battery energy, this problem is difficult to solve optimally. We therefore propose a new multi-agent based hierarchical deep reinforcement learning (MAHDRL) framework with two tiers to solve the problem efficiently, where a soft actor-critic (SAC) policy is designed in tier-1 to determine each UAV's continuous trajectory and binary WET decision over time slots, and a deep Q-network (DQN) policy is designed in tier-2 to determine each UAV's binary WDC decisions over sub-slots under the UAV trajectory given by tier-1. Both the SAC and DQN policies are executed in a distributed manner at each UAV. Finally, extensive simulation results are provided to validate the superior performance of the proposed MAHDRL approach over various state-of-the-art benchmarks.
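The double-threshold based WN type updating rule can be illustrated with a minimal sketch. Note that the threshold values, the battery normalization, and the hysteresis-style interpretation (switch to harvesting below a lower energy threshold, switch to transmitting above an upper one, otherwise keep the current type) are assumptions made for illustration, not the paper's exact rule.

```python
# Hypothetical sketch of a double-threshold WN type update.
# e_low / e_high and the normalized battery level are illustrative
# assumptions; the paper's precise rule may differ.
def update_node_type(node_type: str, battery: float,
                     e_low: float = 0.2, e_high: float = 0.8) -> str:
    """Return 'E' (energy-harvesting node) or 'I' (information-transmitting node).

    battery is the WN's current energy level, normalized to [0, 1].
    """
    if battery <= e_low:
        return "E"        # energy too low: harvest via WET
    if battery >= e_high:
        return "I"        # enough energy: transmit data via WDC
    return node_type      # between thresholds: keep the current type
```

Under this hysteresis-style reading, a WN's type can flip repeatedly over slots as its battery is drained by transmissions and replenished by UAV energy transfer, which matches the dynamic, repeated updating described in the abstract.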