GU Yuwan, YANG Qiuyuan, ZHU Zhihua, et al. Research of robot obstacle avoidance in continuous state space[J]. Journal of Changzhou University (Natural Science Edition), 2023, 35(1): 68-77. [doi: 10.3969/j.issn.2095-0411.2023.01.009]

Research of robot obstacle avoidance in continuous state space

Journal of Changzhou University (Natural Science Edition) [ISSN: 2095-0411 / CN: 32-1822/N]

Volume:
Vol. 35
Issue:
No. 1, 2023
Pages:
68-77
Column:
Computer and Information Engineering
Publication date:
2023-01-28

Article Info

Title:
Research of robot obstacle avoidance in continuous state space
Article number:
2095-0411(2023)01-0068-10
Author(s):
GU Yuwan, YANG Qiuyuan, ZHU Zhihua, XU Shoukun
(School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213164, China)
Keywords:
deep reinforcement learning; pixel collision detection; robot obstacle avoidance; environment generalization
CLC number:
TP 3
DOI:
10.3969/j.issn.2095-0411.2023.01.009
Document code:
A
Abstract:
To address the environment generalization problem of robot obstacle avoidance in continuous state space, a robot obstacle avoidance method based on deep reinforcement learning is proposed. The method introduces a pixel-level collision detection module into the simulation environment and uses pixel collisions to emulate distance sensors, so as to obtain the distance between the robot and obstacles of arbitrary shape and whether a collision occurs. During deep reinforcement learning, the mobile robot faces an unknown environment: it acquires experience data by moving, trains the neural network on these data, updates the network parameters, and optimizes its behavior decisions to accomplish the obstacle avoidance task. Experimental results show that introducing pixel collision detection effectively solves the environment generalization problem in robot obstacle avoidance, and the network models trained in static and dynamic environments generalize well.
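The pixel-based sensing described in the abstract can be illustrated with a short sketch. The following Python snippet is a minimal illustration, not the authors' implementation: it casts rays over a 2-D occupancy image in which nonzero pixels are obstacles, so that pixel collisions emulate distance sensors and also report whether the robot itself has collided. The class name PixelRangeSensor, the beam count, and the sensing range are assumptions made for this example.

```python
import numpy as np

class PixelRangeSensor:
    """Emulates a ring of distance sensors by ray casting over an occupancy image."""

    def __init__(self, occupancy, n_beams=8, max_range=100.0, step=1.0):
        self.occupancy = occupancy   # 2-D array; nonzero pixels are obstacles
        self.n_beams = n_beams       # number of simulated beams around the robot
        self.max_range = max_range   # sensing range, in pixels
        self.step = step             # sampling step along each ray, in pixels

    def _hit(self, x, y):
        """True if (x, y) lies outside the map or on an obstacle pixel."""
        h, w = self.occupancy.shape
        xi, yi = int(round(x)), int(round(y))
        if xi < 0 or yi < 0 or xi >= w or yi >= h:
            return True
        return self.occupancy[yi, xi] != 0

    def read(self, px, py, heading=0.0):
        """Return per-beam distances (in pixels) and whether the robot collides."""
        collided = self._hit(px, py)
        angles = heading + np.linspace(0.0, 2.0 * np.pi, self.n_beams, endpoint=False)
        distances = np.full(self.n_beams, self.max_range)
        for i, a in enumerate(angles):
            d = self.step
            while d <= self.max_range:                     # march along the beam
                if self._hit(px + d * np.cos(a), py + d * np.sin(a)):
                    distances[i] = d                       # first obstacle pixel hit
                    break
                d += self.step
        return distances, collided

# Example: a 200 x 200 free-space map with one rectangular obstacle.
occupancy = np.zeros((200, 200), dtype=np.uint8)
occupancy[80:120, 140:160] = 1
sensor = PixelRangeSensor(occupancy)
ranges, hit = sensor.read(px=100, py=100)   # readings feed the robot's continuous state
```

Because the beams only sample pixels, the same sensing code works for obstacles of any shape, which is what lets the learned policy be reused across environments. The distance readings (together with goal-relative information) would then form the continuous state fed to the Q-network. The second sketch below shows the generic form of the temporal-difference update that the abstract's training loop refers to; the network size, optimizer, and discount factor are illustrative assumptions, not the paper's reported hyperparameters.

```python
import torch
import torch.nn as nn

n_state, n_action, gamma = 10, 4, 0.9    # e.g., 8 beam distances + goal info; 4 moves
q_net = nn.Sequential(nn.Linear(n_state, 64), nn.ReLU(), nn.Linear(64, n_action))
target_net = nn.Sequential(nn.Linear(n_state, 64), nn.ReLU(), nn.Linear(64, n_action))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def td_update(state, action, reward, next_state, done):
    """One temporal-difference update on a single experience tuple (s, a, r, s')."""
    q = q_net(state)[action]                          # Q(s, a) from the online network
    with torch.no_grad():                             # bootstrap target from target net
        target = reward + (1.0 - done) * gamma * target_net(next_state).max()
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Example transition: a 10-dimensional state vector and a small negative step reward.
s, s_next = torch.rand(n_state), torch.rand(n_state)
td_update(s, action=2, reward=-0.1, next_state=s_next, done=0.0)
```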

References:

[1] CHU Kaibin, ZHU Lei, ZHANG Ji. Improved TLD target tracking algorithm combining KCF and HOG[J]. Journal of Changzhou University (Natural Science Edition), 2022, 34(1): 60-67.
[2] WU Peng, ZHOU Qianru, YU Shuang, et al. Path planning method for unmanned surface vehicles based on a combined strategy[J]. Journal of Changzhou University (Natural Science Edition), 2020, 32(3): 47-52.
[3] FU B, CHEN L, ZHOU Y T, et al. An improved A* algorithm for the industrial robot path planning with high success rate and short length[J]. Robotics and Autonomous Systems, 2018, 106: 26-37.
[4] XU Y, GUAN G F, SONG Q W, et al. Heuristic and random search algorithm in optimization of route planning for robot's geomagnetic navigation[J]. Computer Communications, 2020, 154: 12-17.
[5] OROZCO-ROSAS U, MONTIEL O, SEPÚLVEDA R. Mobile robot path planning using membrane evolutionary artificial potential field[J]. Applied Soft Computing, 2019, 77: 236-251.
[6] LEE J, KIM D W. An effective initialization method for genetic algorithm-based robot path planning using a directed acyclic graph[J]. Information Sciences, 2016, 332: 1-18.
[7] BHARADWAJ H, E V K. Comparative study of neural networks in path planning for catering robots[J]. Procedia Computer Science, 2018, 133: 417-423.
[8] MAOUDJ A, HENTOUT A. Optimal path planning approach based on Q-learning algorithm for mobile robots[J]. Applied Soft Computing, 2020, 97: 106796.
[9] MALARVEL M, SETHUMADHAVAN G, BHAGI P C R, et al. An improved version of Otsu's method for segmentation of weld defects on X-radiography images[J]. Optik, 2017, 142: 109-118.
[10] RAAJAN J, SRIHARI P V, SATYAJ P, et al. Real time path planning of robot using deep reinforcement learning[J]. IFAC-PapersOnLine, 2020, 53(2): 15602-15607.
[11] HAN X F, HE H W, WU J D, et al. Energy management based on reinforcement learning with double deep Q-learning for a hybrid electric tracked vehicle[J]. Applied Energy, 2019, 254: 113708.
[12] OU J J, GUO X, ZHU M, et al. Autonomous quadrotor obstacle avoidance based on dueling double deep recurrent Q-learning with monocular vision[J]. Neurocomputing, 2021, 441: 300-310.
[13] ZHAO X Y, ZONG Q, TIAN B L, et al. Fast task allocation for heterogeneous unmanned aerial vehicles through reinforcement learning[J]. Aerospace Science and Technology, 2019, 92: 588-594.
[14] ZHANG W Y, GAI J Y, ZHANG Z G, et al. Double-DQN based path smoothing and tracking control method for robotic vehicle navigation[J]. Computers and Electronics in Agriculture, 2019, 166: 104985.
[15] ZHANG Q C, LIN M, YANG L T, et al. Energy-efficient scheduling for real-time systems based on deep Q-learning model[J]. IEEE Transactions on Sustainable Computing, 2019, 4(1): 132-141.
[16] LIU L B, HODGINS J. Learning to schedule control fragments for physics-based characters using deep Q-learning[J]. ACM Transactions on Graphics, 2017, 36(3): 1-14.
[17] LYU L H, ZHANG S J, DING D R, et al. Path planning via an improved DQN-based learning policy[J]. IEEE Access, 2019, 7: 67319-67330.
[18] SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. 2nd ed. Cambridge: The MIT Press, 2017: 98-102.
[19] ZHENG Y B, LI B, AN D Y, et al. A multi-agent path planning algorithm based on hierarchical reinforcement learning and artificial potential field[C]//2015 11th International Conference on Natural Computation (ICNC), 2015: 363-369.
[20] WANG J X, ELFWING S, UCHIBE E. Modular deep reinforcement learning from reward and punishment for robot navigation[J]. Neural Networks, 2021, 135: 115-126.
[21] HAN X N, LIU H P, SUN F C, et al. Active object detection with multistep action prediction using deep Q-network[J]. IEEE Transactions on Industrial Informatics, 2019, 15(6): 3723-3731.
[22] PATEL D, HAZAN H, SAUNDERS D J, et al. Improved robustness of reinforcement learning policies upon conversion to spiking neuronal network platforms applied to Atari Breakout game[J]. Neural Networks, 2019, 120: 108-115.
[23] LI X H, ZHAO G, LI B T. Generating optimal path by level set approach for a mobile robot moving in static/dynamic environments[J]. Applied Mathematical Modelling, 2020, 85: 210-230.
[24] GUO T, JIANG N, LI B Y, et al. UAV navigation in high dynamic environments: a deep reinforcement learning approach[J]. Chinese Journal of Aeronautics, 2021, 34(2): 479-489.
[25] SONG Y, LI Y B, LI C H, et al. An efficient initialization approach of Q-learning for mobile robots[J]. International Journal of Control, Automation and Systems, 2012, 10(1): 166-172.

Memo

Received: 2022-09-23.
Funding: National Natural Science Foundation of China (61906021); Jiangsu Province College Students' Innovation and Entrepreneurship Training Program (2020-D-07).
About the authors: GU Yuwan (1982—), female, born in Suzhou, Jiangsu, PhD, lecturer. Corresponding author: XU Shoukun (1972—), E-mail: shoukxu@126.com