33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms like Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as real-time exploration and the convergence rate still need to be addressed, and many researchers have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, methods that add a regularization term, noise injection, and many others. This paper presents an algorithm based on controlled linear perturbation which, according to the numerical results, can significantly reduce unnecessary exploration, which is risky in real-time settings. Additionally, the proposed algorithm does not depend on the learning rate $\alpha$, the discount factor $\gamma$, or changes in coefficients. To be effective, however, its parameters must be chosen within the correct range. The proposed algorithm is compared with three reliable algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the cart-pole environment.
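For readers unfamiliar with the baseline, the following is a minimal sketch of standard tabular Q-learning with epsilon-greedy exploration, one of the comparison algorithms named in the abstract. It does not implement the proposed controlled linear perturbation, whose update rule the abstract does not give. The 5-state chain environment, the function names `q_update` and `train_chain`, and all parameter values are assumptions chosen for illustration only; they are not the paper's 9x9 maze or cart-pole setup.

```python
import random


def q_update(q, state, action, reward, next_state, alpha, gamma, done):
    """One standard Q-learning step: Q(s,a) += alpha * (target - Q(s,a))."""
    best_next = 0.0 if done else max(q[next_state])
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])


def train_chain(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
                epsilon=0.2, seed=0):
    """Train on a hypothetical 1-D chain (illustration only).

    Action 0 moves left (staying put at state 0), action 1 moves right.
    Reaching the last state ends the episode with reward 1.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            if rng.random() < epsilon:
                # Exploration: pick a uniformly random action.
                action = rng.randrange(2)
            else:
                # Exploitation: greedy action, ties broken at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(2) if q[state][a] == best])
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else 0.0
            q_update(q, state, action, reward, next_state,
                     alpha, gamma, done)
            state = next_state
            if done:
                break
    return q


q_table = train_chain()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(range(2), key=lambda a: q_table[s][a])
                 for s in range(len(q_table) - 1)]
print(greedy_policy)
```

In this sketch the random actions taken with probability `epsilon` are the kind of undirected exploration the abstract describes as risky in real-time settings; both `alpha` and `gamma` enter the update directly through `q_update`.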