33rd International Conference on Electrical Engineering
Better Exploration In Single-Agent Q-Learning Using Controlled Linear Perturbation
Authors:
Sadredin Hokmi (1), Mohammad Haeri (2)
1- Sharif University of Technology
2- Sharif University of Technology
Keywords:
Q-learning, Exploration, Controlled linear perturbation, Convergence rate, Maze, Cart-Pole
Abstract:
Reinforcement learning algorithms, especially model-free algorithms such as Q-learning, have shown reliable results in finding optimal solutions for many real-time applications. However, challenges such as real-time exploration and the convergence rate still need to be addressed, and many studies have proposed algorithms to tackle them, including speedy Q-learning, Zap Q-learning, methods that add a regularization term, noise injection, and others. This paper presents an algorithm based on controlled linear perturbation which, according to the numerical results, can significantly reduce unnecessary explorations that are risky in real time. Additionally, the proposed algorithm does not depend on the learning rate $\alpha$, the discount factor $\gamma$, or changes in coefficients. To be effective, however, its parameters must be chosen within the correct range. The proposed algorithm is compared with three reliable algorithms: standard Q-learning, speedy Q-learning, and noise injection. The comparisons were conducted in a 9x9 maze scenario and in the cart-pole environment.
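For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of standard tabular Q-learning with epsilon-greedy exploration. It is not the controlled linear perturbation algorithm, whose details the abstract does not give. The function names `q_update` and `epsilon_greedy`, the toy two-state table, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place.

    q is a list of per-state lists of action values (hypothetical layout).
    Returns the updated value Q(state, action).
    """
    best_next = max(q[next_state])            # greedy value of the next state
    td_target = reward + gamma * best_next    # one-step bootstrapped target
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy action."""
    n_actions = len(q[state])
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q[state][a])


# Toy table with 2 states and 2 actions (illustrative values only).
q = [[0.0, 0.0], [0.0, 1.0]]
new_value = q_update(q, state=0, action=1, reward=1.0, next_state=1)
greedy_action = epsilon_greedy(q, state=0, epsilon=0.0)
```

In this baseline, exploration comes only from the random branch of `epsilon_greedy`, which is the kind of undirected exploration the abstract describes as risky in real time.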