33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as exploration in real time and the convergence rate still need to be addressed, and many studies have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, regularization-based algorithms, noise injection, and many others. In this paper, an algorithm based on controlled linear perturbation is presented which, according to the numerical results, can significantly reduce unnecessary exploration that is risky in real time. Additionally, the proposed algorithm does not depend on the learning rate α, on γ, or on changes in coefficients. However, to be effective, the parameters of the algorithm should be chosen within the correct range. The proposed algorithm is compared with three reliable algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the Cart-Pole environment.
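For orientation, the standard Q-learning baseline named in the abstract can be sketched as below. This is only the textbook tabular update with epsilon-greedy exploration, not the controlled linear perturbation algorithm, whose details the abstract does not give. The five-state corridor, the function names, and the values chosen for alpha (learning rate), gamma (discount factor in the standard rule), and epsilon are all illustrative assumptions, with the corridor standing in for the 9x9 maze.

```python
import random

N_STATES = 5  # hypothetical corridor: states 0..4, goal at state 4


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One textbook tabular Q-learning update (baseline only)."""
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Random action with probability epsilon, else a greedy one (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def step(state, action):
    """Assumed toy dynamics: action 0 moves left, action 1 moves right."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=200, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q, state, epsilon, rng)
            nxt, reward, done = step(state, action)
            q_update(q, state, action, reward, nxt)
            state = nxt
    return q


if __name__ == "__main__":
    for s, row in enumerate(train()):
        print(s, [round(v, 3) for v in row])
```

In this sketch the exploration rate epsilon is fixed, so the agent keeps taking random actions even after the greedy policy is already correct. That is the kind of unnecessary exploration the abstract says the proposed method aims to reduce.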
List of archived papers
Delay Independent Controller Design for Delayed Discrete Singular Systems with Input Saturation
Emad Jafari - Tahereh Binazadeh
Enhancing Fetal Brain MRI Segmentation with Adaptive Attention Mechanisms and Residual Blocks
Nazanin Valaee - Vajiheh Sabeti
Comparison of Channel Selection Methods for EEG Signal Classification
Soraya Charkas - MohammadBagher Shamsollahi
Crypto Currency Price Prediction Using Preprocessed Scaled Inputs LSTM Model Enhanced by Improved Gray Wolf Optimization
Amir RabbaniParsa - Mahboobeh Hoshmand - Seyyed Abed Hosseini
Unsupervised Change Detection in SAR Images Using a Six-Branch CNN and Adaptive Window Approach
Abbas Kakoolvand - Maryam Imani - Hassan Ghassemian
Hybrid PI-SOSM Controller for Battery and Supercapacitor Integration in Electric Vehicles
Maede Azimi - Ghasem Rezazadeh - Mohsen Hamzeh
Design and Implementation of a Compact, Modular PFN-Marx Pulse Generator for Producing 25 kV Pulses
محمد حسین رنجبر - محمدجواد گل علی پور
Finite-Time Bipartite Time-Varying Formation tracking for Heterogeneous Nonlinear Multi-Agent Systems
Mohammad Reza Mehrabi Koushki - Javad Askari - Marzieh Kamali
Network-based functional connectivity in MDD with suicide ideation before and after TMS: An fMRI case study
Moslem Khafi - Morteza Fattahi - Hamid Soltanian-Zadeh - Reza Rostami
A Single-Switch Single-Inductor High Step-Up DC-DC Converter with Single-Input and Dual-Output Ports
Ali Nadermohammadi - Saed Mahmoud Alilou - Mohammad Maalandish - Seyed Hossein Hosseini - Mehdi Abapour - Kazem Zare