33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as real-time exploration and the convergence rate remain, and many researchers have proposed algorithms to address them, including speedy Q-learning, Zap Q-learning, regularization-based methods, noise injection, and many others. This paper presents an algorithm based on controlled linear perturbation which, according to the numerical results, can significantly reduce unnecessary exploration that is risky in real-time operation. Additionally, the proposed algorithm does not depend on the learning rate α, the discount factor γ, or changes in coefficients. To be effective, however, its parameters must be chosen within the correct range. The proposed algorithm is compared with three established algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the Cart-Pole environment.
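For orientation, the standard Q-learning baseline named in the abstract can be sketched as below. This is a minimal illustration of the textbook tabular update, showing where the learning rate α and the discount factor γ enter; it is not the controlled linear perturbation algorithm, whose details the abstract does not give. The epsilon-greedy action rule is an assumption chosen as a common exploration scheme, and the function names `q_learning_update` and `epsilon_greedy` are hypothetical.

```python
import random


def q_learning_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply one textbook tabular Q-learning update in place.

    q is a table indexed as q[state][action]; alpha is the learning rate
    and gamma is the discount factor.
    """
    # Bootstrapped target: immediate reward plus discounted best next value.
    td_target = reward + gamma * max(q[next_state])
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one.

    Assumption: epsilon-greedy stands in for the baseline exploration rule;
    the abstract does not say which rule the compared algorithms use.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return row.index(max(row))
```

With a table `q = [[0.0, 0.0], [1.0, 2.0]]`, calling `q_learning_update(q, 0, 1, 1.0, 1, alpha=0.5, gamma=0.9)` moves `q[0][1]` halfway toward the target 1.0 + 0.9 * 2.0 = 2.8. Every random action taken by `epsilon_greedy` is an exploratory step of the kind the abstract describes as potentially risky in real-time operation.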