33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as exploration in real time and the convergence rate still need to be addressed, and many studies have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, regularization-based methods, noise injection, and others. This paper presents an algorithm based on controlled linear perturbation which, according to the numerical results, can significantly reduce unnecessary explorations that are risky in real time. Additionally, the proposed algorithm does not depend on the learning rate α, the discount factor γ, or changes in coefficients. To be effective, however, the parameters of the algorithm must be chosen within the correct range. The proposed algorithm is compared with three reliable algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the cart-pole environment.
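For readers unfamiliar with the baseline, the following is a minimal sketch of standard tabular Q-learning with epsilon-greedy exploration, showing where the learning rate α and the discount factor γ enter the update. It is not the controlled linear perturbation algorithm of the paper, whose details are not given in the abstract. The open 9x9 grid without walls, the reward values, the start and goal cells, and all function names are assumptions made for illustration only.

```python
import random

N = 9  # grid side, matching the 9x9 maze size in the abstract; the layout is hypothetical
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START, GOAL = (0, 0), (N - 1, N - 1)  # assumed start and goal cells


def step(state, action):
    """Move on an open grid (walls omitted in this sketch); moves off the edge are clipped."""
    r = min(max(state[0] + ACTIONS[action][0], 0), N - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), N - 1)
    next_state = (r, c)
    reward = 1.0 if next_state == GOAL else -0.01  # assumed reward scheme
    return next_state, reward, next_state == GOAL


def q_update(Q, s, a, r, s_next, done, alpha, gamma):
    """Standard Q-learning update: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    best_next = 0.0 if done else max(Q[s_next])
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])


def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Run epsilon-greedy Q-learning and return the learned Q-table."""
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * len(ACTIONS) for r in range(N) for c in range(N)}
    for _ in range(episodes):
        s = START
        for _ in range(10 * N * N):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # random exploratory action
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])  # greedy action
            s_next, r, done = step(s, a)
            q_update(Q, s, a, r, s_next, done, alpha, gamma)
            s = s_next
            if done:
                break
    return Q


def greedy_path_length(Q, limit=200):
    """Follow the greedy policy from START; return the step count, or None if GOAL is not reached."""
    s, steps = START, 0
    while s != GOAL and steps < limit:
        a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
        s, _, _ = step(s, a)
        steps += 1
    return steps if s == GOAL else None
```

Calling `greedy_path_length(train())` gives a rough check of whether the learned policy reaches the goal. In this baseline, exploration is purely random with probability epsilon, which is the kind of unnecessary exploration the abstract says the proposed method aims to reduce.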
List of papers
List of archived papers
Design and Implementation of a fast flexible and efficient multichannel digital filter for hearing aids
Mohammadsadegh Poushnegar - Mahmoud Tabandeh - Meysam Nesary Moghadam - Farzam Gilani - Ali Aghakasiri
A modified Dempster Shafer approach to classification in surgical skill assessment
Arash Iranfar - Mohammad Soleymannejad - Behzad Moshiri - Hamid D. Taghirad
40Hz Auditory Entrainment Promotes Synchronization Between Frontal and Parietal Regions of the Brain
Mojtaba Lahijanian - Hamid Aghajan
Small Target Detection Using an Enhanced Optimization Based Filter and Trajectory Tracking Via Pattern Matching Algorithm
Seyedeh Mahsa Zakipour Bahambari - Saeed Khankalantary
Displacement Estimation for Ultrasound Elastography based on a Robust Uniform Stretching Method
Zahra Hosseini - Ali Khadem - Mohammadreza Hassannejad Bibalan
Fabrication of a Resistive Ethanol Vapor Sensor Based on a Barium Titanate / Zinc Oxide Heterostructure Doped with Silver Nanoparticles
محسن طاهری پور - نوید یثربی - شیرین نصراصفهانی - محمد حسین شیخی
Evanescent-to-Propagating Wave Conversion Using Continuous High-Order Dielectric Metasurfaces
Hamid Akbari Chelaresi - Pooria Salami - Leila Yousefi
E-RESO: An Enhanced Time Redundancy-based Error Detection Approach for Arithmetic Operations
Sina Shahoveisi - Athena Abdi
An Active Inductor-Based Differential Ring VCO with Wide Tuning Range for UWB Applications
Mahdi Alijani - Mohammadmahdi Javanmardi - Vahid Khodadadi - Adib Abrishamifar
Developing Low Profile Carpet Cloaks using ENZ slabs
Amin Monemian Esfahani - Leila Yousefi