33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as exploration in real time and the convergence rate still need to be addressed, and many researchers have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, algorithms based on adding a regularization term, noise injection, and many others. In this paper, an algorithm based on controlled linear perturbation is presented which, according to the numerical results, can significantly reduce unnecessary explorations that are risky in real time. Additionally, the proposed algorithm does not depend on the learning rate α, the discount factor γ, or changes in coefficients. However, to be effective, the parameters of the algorithm should be chosen within the correct range. The proposed algorithm is compared with three established algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9×9 maze scenario and in the cart-pole environment.
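For orientation, the standard Q-learning baseline named in the abstract can be sketched as follows. This is a minimal illustration only, not the proposed controlled linear perturbation algorithm, whose details the abstract does not give. The open 9×9 grid, the reward values, the ε-greedy exploration rule, and all hyperparameters (`alpha`, `gamma`, `epsilon`) are assumptions chosen for the example and are not taken from the paper.

```python
import random

# Hypothetical 9x9 open grid: start at top-left, goal at bottom-right.
# Layout, rewards, and hyperparameters are illustrative assumptions,
# not the maze or settings used in the paper.
SIZE = 9
START, GOAL = (0, 0), (SIZE - 1, SIZE - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move one cell; bumping into the border leaves the state unchanged."""
    r = min(max(state[0] + action[0], 0), SIZE - 1)
    c = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (r, c)
    reward = 1.0 if nxt == GOAL else -0.01
    return nxt, reward, nxt == GOAL


def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Standard tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state = START
        for _ in range(1000):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            nxt, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: no future value once the goal is reached.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_path_length(q, limit=200):
    """Follow the greedy policy from START; return steps to GOAL or None."""
    state = START
    for n in range(1, limit + 1):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        if done:
            return n
    return None
```

Calling `q = train()` and then `greedy_path_length(q)` gives the number of greedy steps from start to goal, or `None` if the learned policy does not reach the goal within the step limit. The shortest possible path in this open grid is 16 moves, so the gap to 16 is a rough indicator of how well the policy has converged.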
List of papers
List of archived papers
Highly Efficient Implementation of Chaotic Systems Utilizing High-Level Synthesis Tools
Mobin Vaziri - Hadi Jahanirad
Posture Stabilization of Tractor-Trailer Wheeled Mobile Robot Using Nonlinear MPC
Kevin Babakhanloo - Khalil Alipour - Bahram Tarvirdizadeh - Majid Sorouri - Mohammad Ghamari
Recurrence Quantification and Machine Learning: A Novel Approach for Parkinson’s Disease Diagnosis from EEG Signals
Asghar Zarei - Alireza Talesh Jafadideh
Smartly, reduce the latency of high-priority vehicles using IoT technology
Mahdi Talebi - Masoud Sabaei
Fatigue Detection in SSVEP-Based BCIs Using Biomarkers: A Comparative Study
Maedeh Azadi Moghadam - Ali Maleki
Photonic Crystal-based Plasmonic Biosensor with Low-cost and High-sensitivity Properties
Mahdieh Ahmadi Motlagh - Mahdieh Bozorgi - Mahmood Rafaei-Booket
Coherent Direction of Arrival Estimation using Multiple Toeplitz Space Time Spatial Smoothing
Sepehr Kouzegaran - Masoumeh Azghani
A Bi-Level Attack-Defense Model for the Forecasting False Data Injection Attacks on the Integrated Energy Systems
Maryam Azimi - Hamed Delkhosh - Mahdi Ghaedi
Optimization and Analysis of Transformer Hot Spot Temperature Under Harmonic Conditions with Different Windings
Mehran Nemati - Hamed Karimi - Alireza Siadatan - Maryam Sepehrinour
Risk-based Expansion planning of Active Distribution Networks in the Presence of Electric Vehicles to improve the Reliability
Ali Razzaghi