33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as real-time exploration and the convergence rate still need to be addressed, and many studies have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, regularization-based methods, noise injection, and others. This paper presents an algorithm based on controlled linear perturbation which, according to the numerical results, significantly reduces unnecessary exploration that is risky in real time. Additionally, the proposed algorithm does not depend on the learning rate α, the discount factor γ, or changes in coefficients. To be effective, however, its parameters must be chosen within the correct range. The proposed algorithm is compared with three established algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the Cart-Pole environment.
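The abstract does not describe the controlled linear perturbation itself, so it is not reproduced here. For context on the standard Q-learning baseline it is compared against, the following is a minimal sketch of tabular Q-learning with epsilon-greedy exploration. Everything specific in it is an assumption made for illustration and is not taken from the paper: the obstacle-free 9x9 grid, the start and goal cells, the reward values, the hyperparameters, and the helper names `q_update`, `step`, and `train`.

```python
import random

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def q_update(q_sa, reward, next_max, alpha, gamma, done):
    """One standard Q-learning update for a single (state, action) value."""
    target = reward if done else reward + gamma * next_max
    return q_sa + alpha * (target - q_sa)


def step(state, action, n):
    """Move on an n x n grid; borders keep the agent in place (assumed dynamics)."""
    r, c = state
    dr, dc = MOVES[action]
    nxt = (min(max(r + dr, 0), n - 1), min(max(c + dc, 0), n - 1))
    done = nxt == (n - 1, n - 1)      # goal in the bottom-right corner (assumed)
    reward = 1.0 if done else -0.01   # assumed reward values
    return nxt, reward, done


def train(n=9, episodes=200, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Train a Q-table on the hypothetical grid and return it."""
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * 4 for r in range(n) for c in range(n)}
    for _ in range(episodes):
        state = (0, 0)                # start in the top-left corner (assumed)
        for _ in range(10 * n * n):   # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.randrange(4)   # explore: uniformly random action
            else:
                best = max(Q[state])        # exploit: greedy, ties broken randomly
                action = rng.choice([a for a in range(4) if Q[state][a] == best])
            nxt, reward, done = step(state, action, n)
            Q[state][action] = q_update(
                Q[state][action], reward, max(Q[nxt]), alpha, gamma, done
            )
            state = nxt
            if done:
                break
    return Q
```

Calling `train()` returns a dictionary that maps each grid cell to its four action values. In this sketch the random action taken with probability `epsilon` is the only explicit source of exploration, which is the kind of undirected exploration the abstract says the proposed algorithm aims to reduce.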