32nd International Conference on Electrical Engineering
A novel clustering-based over-sampling technique for imbalanced data sets
Authors:
Behzad Mirzaei (1), Hossein Nezamabadi-pour (2), Javad Mahmoodi (3)
1- Shahid Bahonar University of Kerman
2- Shahid Bahonar University of Kerman
3- Shahid Bahonar University of Kerman
Keywords:
Imbalanced data, Clustering, K-means algorithm, Over-sampling, Preprocessing methods
Abstract:
Classifying imbalanced data is one of the most challenging problems in machine learning. The problem arises when samples are distributed unevenly among the classes, so that one class (the minority or positive class) has far fewer samples than the other (the majority or negative class). Classical classifiers are poorly suited to data sets of this nature. To address this shortcoming, we present a novel clustering-based over-sampling technique. First, the k-means clustering algorithm is used to cluster the minority class samples. Then, sparse clusters, that is, those containing fewer samples, are chosen. Finally, the nearest neighbor of each cluster center is used to create synthetic samples for the minority class. During over-sampling, the roulette wheel selection operator is applied so that clusters are selected according to probabilities. The C4.5 decision tree classifier is used in our experiments, and the F-measure serves as the evaluation criterion. According to the results, our method outperforms six other methods over fifteen imbalanced data sets.
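The three steps in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how cluster probabilities are computed or how a synthetic sample is formed, so the inverse-size roulette weights and the SMOTE-style interpolation between a cluster center and its nearest member are assumptions, and the names `kmeans` and `oversample` are hypothetical.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its closest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


def oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (assumed procedure)."""
    rng = random.Random(seed)
    centers, labels = kmeans(minority, k, rng)
    clusters = [[p for p, lab in zip(minority, labels) if lab == c]
                for c in range(k)]
    # Drop clusters that ended up empty.
    pairs = [(ctr, mem) for ctr, mem in zip(centers, clusters) if mem]
    # ASSUMPTION: sparser (smaller) clusters get a larger roulette-wheel slice.
    weights = [1.0 / len(mem) for _, mem in pairs]
    synthetic = []
    for _ in range(n_new):
        # Roulette wheel selection: pick one cluster proportionally to weight.
        center, members = rng.choices(pairs, weights=weights, k=1)[0]
        # Nearest neighbor of the cluster center among its own members.
        neighbor = min(members, key=lambda p: math.dist(p, center))
        # ASSUMPTION: interpolate between center and neighbor, SMOTE-style.
        gap = rng.random()
        synthetic.append([c + gap * (x - c) for c, x in zip(center, neighbor)])
    return synthetic
```

Calling `oversample(minority, n_new=10, k=3)` on a list of minority-class feature vectors returns ten new vectors, which would then be appended to the training set before fitting a classifier such as C4.5.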