33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as real-time exploration and the convergence rate remain, and many studies have proposed algorithms to address them, including speedy Q-learning, Zap Q-learning, methods that add a regularization term, noise injection, and many others. This paper presents an algorithm based on controlled linear perturbation which, according to the numerical results, significantly reduces unnecessary exploration that is risky in real time. Additionally, the proposed algorithm does not depend on the learning rate $\alpha$, the discount factor $\gamma$, or changes in coefficients. However, to be effective, its parameters must be chosen within the correct range. The proposed algorithm is compared with three established algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the cart-pole environment.
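For orientation, the standard Q-learning baseline named in the abstract can be sketched as follows. This is only an illustrative sketch of the well-known tabular update with epsilon-greedy exploration; it is not the controlled linear perturbation algorithm, whose update rule the abstract does not give. The open 9x9 grid without walls, the start and goal cells, the reward of -1 per step, and all hyperparameter values are assumptions made for the example.

```python
import numpy as np

N = 9                                          # grid side length (9x9, as in the abstract)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
START, GOAL = (0, 0), (N - 1, N - 1)           # assumed start and goal cells


def step(state, action):
    """Move one cell; a move that would leave the grid keeps the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < N and 0 <= nc < N):
        nr, nc = r, c
    next_state = (nr, nc)
    return next_state, -1.0, next_state == GOAL   # assumed reward: -1 per step


def q_update(Q, s, a, reward, s_next, done, alpha, gamma):
    """Standard Q-learning update: Q(s,a) += alpha * (target - Q(s,a))."""
    target = reward if done else reward + gamma * np.max(Q[s_next])
    Q[s + (a,)] += alpha * (target - Q[s + (a,)])


def train(episodes=3000, alpha=0.5, gamma=0.95, epsilon=0.1, max_steps=500, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((N, N, len(ACTIONS)))
    for _ in range(episodes):
        s = START
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = int(rng.integers(len(ACTIONS)))   # random exploratory action
            else:
                a = int(np.argmax(Q[s]))              # greedy action
            s_next, reward, done = step(s, a)
            q_update(Q, s, a, reward, s_next, done, alpha, gamma)
            s = s_next
            if done:
                break
    return Q


def greedy_rollout(Q, max_steps=200):
    """Follow the greedy policy from START; return (reached_goal, steps_taken)."""
    s, steps = START, 0
    while s != GOAL and steps < max_steps:
        s, _, _ = step(s, int(np.argmax(Q[s])))
        steps += 1
    return s == GOAL, steps
```

Calling `greedy_rollout(train())` gives a rough check that the learned table leads to the goal. In this sketch the random exploratory actions are drawn uniformly with probability `epsilon`, which is the kind of undirected exploration the abstract describes as risky in real time.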