33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as real-time exploration and the convergence rate still need to be addressed, and many researchers have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, algorithms that add a regularization term, noise injection, and many others. This paper presents an algorithm based on controlled linear perturbation which, according to the numerical results, can significantly reduce unnecessary exploration, which is risky in real-time operation. Additionally, the proposed algorithm does not depend on the learning rate $\alpha$, $\gamma$, or changes in coefficients. To be effective, however, the parameters of the algorithm must be chosen within the correct range. The proposed algorithm is compared with three reliable algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the cart-pole environment.
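For orientation, the standard Q-learning baseline named in the abstract can be sketched as below. This is a minimal illustration of the textbook tabular update with epsilon-greedy exploration, not the controlled linear perturbation method, whose details the abstract does not give. The function names, the toy two-state table, and the default values of `alpha`, `gamma`, and `epsilon` are assumptions made for the example; `gamma` is taken to be the discount factor, as in standard Q-learning.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place.

    q is a dict mapping state -> {action: value}.
    alpha is the learning rate, gamma the discount factor (assumed defaults).
    """
    best_next = max(q[next_state].values())      # greedy value of the next state
    td_target = reward + gamma * best_next       # bootstrapped target
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    actions = list(q[state])
    if rng.random() < epsilon:
        return rng.choice(actions)               # exploratory step
    return max(actions, key=lambda a: q[state][a])


# Hypothetical two-state table, purely illustrative (not the paper's 9x9 maze).
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 1.0},
}
new_value = q_update(q, "s0", "right", reward=0.0, next_state="s1")
greedy_action = epsilon_greedy(q, "s0", epsilon=0.0)
```

In this baseline every exploratory step is drawn uniformly at random, which is the kind of uncontrolled exploration the abstract describes as risky in real-time settings.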