The 33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi 1
Mohammad Haeri 2
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as real-time exploration and the convergence rate still need to be addressed, and many studies have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, methods that add a regularization term, noise injection, and many others. This paper presents an algorithm based on controlled linear perturbation which, according to the numerical results, can significantly reduce unnecessary explorations that are risky in real time. Additionally, the proposed algorithm does not depend on the learning rate $\alpha$, on $\gamma$, or on changes in coefficients. However, to be effective, its parameters should be chosen within the correct range. The proposed algorithm is compared with three reliable algorithms (standard Q-learning, speedy Q-learning, and noise injection) in a 9x9 maze scenario and in the cart-pole environment.
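For orientation, the standard Q-learning baseline named in the abstract can be sketched as follows. This is a minimal illustration of plain tabular Q-learning with epsilon-greedy exploration, not the proposed controlled linear perturbation algorithm, whose update rule is not given on this page. The open 9x9 grid without walls, the start and goal cells, the reward of -1 per step, and all hyperparameter values are assumptions chosen for illustration only.

```python
import random

# Assumed setup (not from the paper): open 9x9 grid, no interior walls.
SIZE = 9
START = (0, 0)
GOAL = (SIZE - 1, SIZE - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move one cell; moves off the grid leave the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    next_state = (min(max(r + dr, 0), SIZE - 1), min(max(c + dc, 0), SIZE - 1))
    done = next_state == GOAL
    reward = 0.0 if done else -1.0  # assumed reward: -1 per non-terminal step
    return next_state, reward, done


def epsilon_greedy(q, state, epsilon, rng):
    """With probability epsilon pick a random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = q[state]
    return max(range(len(ACTIONS)), key=lambda a: values[a])


def q_update(q, state, action, reward, next_state, done, alpha, gamma):
    """Standard Q-learning update: Q <- Q + alpha * (target - Q)."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def train(episodes=200, alpha=0.5, gamma=0.95, epsilon=0.1,
          max_steps=500, seed=0):
    """Run tabular Q-learning and return the learned Q-table."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            q_update(q, state, action, reward, next_state, done, alpha, gamma)
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    q_table = train()
    print("Q-values at start:", q_table[START])
```

In this baseline, exploration is governed only by the fixed `epsilon`, which is the kind of undirected exploration the abstract describes as risky in real time.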