33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as real-time exploration and the convergence rate still need to be addressed, and many studies have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, methods that add a regularization term, and noise injection, among others. This paper presents an algorithm based on controlled linear perturbation which, according to the numerical results, can significantly reduce unnecessary explorations that are risky in real time. Additionally, the proposed algorithm does not depend on the learning rate $\alpha$, on $\gamma$, or on changes in coefficients. To be effective, however, its parameters should be chosen within the correct range. The proposed algorithm is compared with three reliable algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the cart-pole environment.
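For orientation, the standard Q-learning baseline named in the abstract can be sketched as follows. This is only an illustrative sketch of the well-known tabular update with epsilon-greedy exploration. It is not the proposed controlled linear perturbation method, which the abstract does not specify. The open 9x9 grid without walls, the start and goal cells, the reward of -1 per move, and all hyperparameter values are assumptions made for the example.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
SIZE = 9                                       # 9x9 grid (walls omitted: assumption)
GOAL = (SIZE - 1, SIZE - 1)                    # assumed goal cell


def step(state, action):
    """Move in the grid; bumping into the border leaves the agent in place."""
    row = min(max(state[0] + ACTIONS[action][0], 0), SIZE - 1)
    col = min(max(state[1] + ACTIONS[action][1], 0), SIZE - 1)
    next_state = (row, col)
    return next_state, -1.0, next_state == GOAL  # assumed reward: -1 per move


def q_update(q, state, action, reward, next_state, done, alpha, gamma):
    """Standard Q-learning update: Q <- Q + alpha * (target - Q)."""
    best_next = 0.0 if done else max(q[next_state])
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])


def epsilon_greedy(q, state, epsilon, rng):
    """Random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = q[state]
    return max(range(len(ACTIONS)), key=lambda a: values[a])


def train(episodes=300, alpha=0.5, gamma=0.95, epsilon=0.1, max_steps=200, seed=0):
    """Train a tabular Q-function; all default values are illustrative."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state = (0, 0)  # assumed start cell
        for _ in range(max_steps):
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            q_update(q, state, action, reward, next_state, done, alpha, gamma)
            state = next_state
            if done:
                break
    return q
```

Calling `train()` returns a dictionary that maps each grid cell to its four action values. In this sketch the random branch of `epsilon_greedy` is the exploration step; an exploration scheme such as the one the paper proposes would presumably replace or modify that choice.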
List of archived papers
Application of Artificial Neural Network on Diagnosing Location and Extent of Disk Space Variations in Transformer Windings Using Frequency Response Analysis
Reza Behkam - Hossein Karami - Mahdi Salay Naderi - Gevork Gharehpetian
Deep Learning Meets Explainable AI: A Robust Framework for X-Ray Fracture Detection
Ali Tamizifar - Shakiba Berenjkoub - Mina Amiri
Enhanced Forward Model for Photoacoustic Imaging with Speed of Sound Compensation
Amirreza Jodeiry - Zahra Kavehvash
Total Transfer Capability Improvement Using High Temperature Low Sag Conductors
Seyed Sina Mousavi-Seyedi - Mohammad Reza Rezaei - Mohammad Reza Miveh
A Cost-Effective Solution for Traffic Sign Recognition and Geographic Localization Using a Monocular Camera
Mohadeseh Atyabi - Fardin Ayar - Mahdi Javanmardi
Design and Simulation of Ultra High power X-band Rotary Joint with a Matching Choke
Mohammad Bod - Seyed Mohammad Hashemi
Non-Line-of-Sight imaging using raster scanning at NIR wavelength
Mohammad Roueinfar - Mahdi Salmanian
Evaluating the effect of electric vehicle charging station locations on line flows: An analytical approach
Mohammad Hasan Nikkhah - Mahdi Samadi
Message Overhead Control Using P-Epidemic Routing Method in Resource-Constrained Heterogeneous DTN
Mohammad Yousef Darmani - Shiva Karimi
Higher Derivatives Extremum Seeking with Very Slow/Drifting Sensor
Farzaneh Karimi - Mohsen Mojiri