33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as exploration in real time and the convergence rate remain to be addressed, and many studies have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, methods that add a regularization term, noise injection, and many others. In this paper, an algorithm based on controlled linear perturbation is presented which, according to the numerical results, can significantly reduce unnecessary explorations that are risky in real time. Additionally, the proposed algorithm does not depend on the learning rate α, the discount factor γ, or changes in coefficients. However, to be effective, the parameters of the algorithm should be chosen within the correct range. The proposed algorithm is compared with three established algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the cart-pole environment.
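For orientation, the following is a minimal sketch of the standard tabular Q-learning baseline named in the abstract, run on a 9x9 grid with epsilon-greedy exploration. It is not the proposed controlled linear perturbation algorithm, which the abstract does not specify. The open grid without walls, the start and goal cells, the reward of -1 per step, and all hyperparameter values are assumptions chosen only for illustration.

```python
import random

N = 9                                          # grid side, matching the 9x9 maze size
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
START, GOAL = (0, 0), (N - 1, N - 1)           # hypothetical start and goal cells


def step(state, action):
    """Move in an open grid (no walls: an assumption). Border moves leave the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nxt = (min(max(r + dr, 0), N - 1), min(max(c + dc, 0), N - 1))
    done = nxt == GOAL
    return nxt, (0.0 if done else -1.0), done


def q_update(Q, s, a, reward, s_next, done, alpha, gamma):
    """Standard Q-learning update: Q(s,a) += alpha * (target - Q(s,a))."""
    target = reward if done else reward + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def train(episodes=300, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0, max_steps=500):
    """Train with epsilon-greedy exploration; all default values are illustrative."""
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * len(ACTIONS) for r in range(N) for c in range(N)}
    for _ in range(episodes):
        s = START
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))                         # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])     # exploit
            s_next, reward, done = step(s, a)
            q_update(Q, s, a, reward, s_next, done, alpha, gamma)
            s = s_next
            if done:
                break
    return Q
```

In this baseline the random branch of the action choice is the only source of exploration, and the update depends directly on α and γ. These are the two aspects the abstract says the proposed algorithm handles differently.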