The 33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as real-time exploration and the convergence rate still need to be addressed, and many studies have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, methods that add a regularization term, noise injection, and many others. In this paper, an algorithm based on controlled linear perturbation is presented which, according to the numerical results, can significantly reduce unnecessary explorations that are risky in real time. Additionally, the proposed algorithm does not depend on the learning rate $\alpha$, the discount factor $\gamma$, or changes in coefficients. To be effective, however, the parameters of the algorithm should be chosen within the correct range. The proposed algorithm is compared with three reliable algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the cart-pole environment.
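For orientation, the standard Q-learning baseline named in the abstract can be sketched as below. This is a minimal illustration of the textbook tabular update with epsilon-greedy exploration, not the controlled linear perturbation algorithm proposed in the paper, whose details the abstract does not give. The function names, the 81-state by 4-action table for a 9x9 grid, and the numeric values are all assumptions made for the example.

```python
import random


def q_update(Q, s, a, r, s_next, alpha, gamma, done=False):
    """One textbook tabular Q-learning step (baseline only, not the paper's method).

    alpha is the learning rate and gamma the discount factor.
    """
    # Bootstrapped target: reward plus discounted best value of the next state.
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]


def epsilon_greedy(Q, s, epsilon, rng=random):
    """Explore with probability epsilon, otherwise take a greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(Q[s]))
    return Q[s].index(max(Q[s]))


# Hypothetical sizes: 81 states for a 9x9 grid and 4 moves per state.
Q = [[0.0] * 4 for _ in range(81)]

# Illustrative transition: from state 0, action 1 yields reward 1.0 and leads to state 1.
new_value = q_update(Q, s=0, a=1, r=1.0, s_next=1, alpha=0.5, gamma=0.9)
greedy_action = epsilon_greedy(Q, 0, epsilon=0.0)
```

With `epsilon` set to 0 the agent never explores, while larger values trade exploitation for random moves. Unnecessary random moves of this kind are the exploration cost that the abstract says the proposed algorithm reduces.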