Neuroevolution of Hybrid Neural Networks in a Robotic Agent (NRNH-AR)

Carlos Vasquez-Jalpa; Mariko Nakano-Miyatake; Hector Perez-Meana

doi:10.29057/icbi.v10iEspecial4.9070

Carlos Alberto Vazquez-Jalpa, Instituto Politécnico Nacional, https://orcid.org/0000-0002-3911-7906
Mariko Nakano-Miyatake, Instituto Politécnico Nacional, https://orcid.org/0000-0003-1346-7825
Héctor Pérez-Meana, Instituto Politécnico Nacional, https://orcid.org/0000-0002-7786-2050

Keywords: Edge Computing, Deep Reinforcement Learning, Neuroevolution, DDPG, Policy Gradient

Abstract

A robotic agent has been developed that learns from the dynamic environment it navigates, with the goal of finding a specific object. To drive this learning, the Neuroevolution of Hybrid Neural Networks in a Robotic Agent (NRNH-AR) has been created. It combines a convolutional neural network (CNN) to interpret the environment with an artificial neural network (ANN) to perform actions, and is complemented by Deep Deterministic Policy Gradient (DDPG), which combines deep reinforcement learning with policy gradient methods. For the algorithm to work in practice on a physical robot, two further blocks have been considered: the hardware and the mechanics involved, since the agent is trained online on the robot itself to avoid latency problems and bandwidth limitations. This research shows that NRNH-AR makes it possible to implement deep reinforcement learning within a robot as edge computing, without latency problems, while optimizing time and computational cost through evolutionary learning.
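
The evolutionary-learning idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the evolutionary operators, network sizes, or reward used in NRNH-AR, so everything below is an assumption made for illustration: a tiny single-unit policy stands in for the action network, the `fitness` function is a hypothetical toy objective, and the CNN and DDPG components are omitted. The sketch shows only the general pattern of neuroevolution (evaluate a population of parameter vectors, keep the best, mutate them), not the authors' method.

```python
import math
import random


def policy(params, obs):
    """Tiny deterministic policy (assumed): maps a 2-D observation to an action in [-1, 1]."""
    w1, w2, b = params
    return math.tanh(w1 * obs[0] + w2 * obs[1] + b)


def fitness(params):
    """Hypothetical toy objective: negative squared error against a made-up target rule."""
    samples = [(-1.0, 0.5), (0.0, 0.0), (0.5, -0.5), (1.0, 1.0)]
    total = 0.0
    for obs in samples:
        target = math.tanh(obs[0] - obs[1])  # illustrative "ideal" action, not from the article
        total -= (policy(params, obs) - target) ** 2
    return total


def evolve(generations=50, pop_size=20, elite=5, sigma=0.1, seed=0):
    """Truncation selection with Gaussian mutation over flat parameter vectors."""
    rng = random.Random(seed)
    population = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]  # elites survive unchanged
        history.append(fitness(parents[0]))
        children = [
            [p + rng.gauss(0, sigma) for p in rng.choice(parents)]
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    population.sort(key=fitness, reverse=True)
    return population[0], history


if __name__ == "__main__":
    best, history = evolve()
    print("best parameters:", best)
    print("fitness, first vs. last generation:", history[0], history[-1])
```

Because the elites are carried over unchanged, the best fitness never decreases from one generation to the next. A gradient-free loop of this kind needs only forward passes, which is one reason evolutionary search is sometimes considered for on-device training.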
