32nd International Conference on Electrical Engineering
A novel clustering-based over-sampling technique for imbalanced data sets
Authors:
Behzad Mirzaei (1)
Hossein Nezamabadi-pour (2)
Javad Mahmoodi (3)
1- Shahid Bahonar University of Kerman
2- Shahid Bahonar University of Kerman
3- Shahid Bahonar University of Kerman
Keywords:
Imbalanced data, Clustering, K-means algorithm, Over-sampling, Preprocessing methods
Abstract:
One of the most challenging problems in machine learning is the classification of imbalanced data. This problem arises when samples are distributed unevenly among the classes, so that one class (the minority or positive class) has far fewer samples than the other (the majority or negative class). Classical classifiers are poorly suited to data sets of this nature. To address this shortcoming of classifiers under class imbalance, we present a novel clustering-based over-sampling technique in this paper. First, the k-means clustering algorithm is used to cluster the minority class samples. Then, sparse clusters, i.e., those containing fewer samples, are chosen. Finally, we use the nearest neighbor of each cluster center to create synthetic samples for the minority class. To select clusters based on probabilities, we apply the roulette wheel selection operator during over-sampling. The C4.5 decision tree classifier is used in our experiments, and the F-measure is the criterion used to evaluate the methods. According to the results, our method outperforms six other methods on fifteen imbalanced data sets.
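The three steps named in the abstract (k-means on the minority class, probabilistic selection of sparse clusters, and synthesis from the nearest neighbor of a cluster center) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so several details here are assumptions rather than the authors' method: the wheel weight of a cluster is taken as the inverse of its size, each synthetic sample is a random interpolation between a cluster center and the member sample closest to it, and the names `kmeans`, `oversample_minority`, `n_new`, `k`, and `seed` are hypothetical.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assign every sample to its closest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


def oversample_minority(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (illustrative sketch only)."""
    X_min = np.asarray(X_min, dtype=float)
    rng = np.random.default_rng(seed)
    centers, labels = kmeans(X_min, k, rng)

    sizes = np.bincount(labels, minlength=k)
    candidates = np.flatnonzero(sizes > 0)  # ignore empty clusters
    # Assumed weighting: a sparser cluster gets a larger slice of the wheel.
    wheel = np.cumsum(1.0 / sizes[candidates])

    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        # Roulette wheel selection: spin once, take the slice that is hit.
        spin = rng.random() * wheel[-1]
        slot = min(int(np.searchsorted(wheel, spin, side="right")), len(wheel) - 1)
        j = candidates[slot]

        # Nearest neighbor of the chosen cluster center among its members.
        members = X_min[labels == j]
        nearest = members[np.linalg.norm(members - centers[j], axis=1).argmin()]

        # Assumed synthesis rule: random point on the center-neighbor segment.
        gap = rng.random()
        synthetic[i] = centers[j] + gap * (nearest - centers[j])
    return synthetic
```

Calling `oversample_minority(X_min, n_new=20, k=2)` on an array of minority-class rows returns 20 new rows that can be appended to the training set before fitting a classifier. Because every synthetic point lies between a cluster center and a real sample, the new rows stay inside the region already occupied by the minority class.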
List of papers
List of archived papers
Techno-Economic Dispatch of Distributed Energy Resources for Optimal Grid-Connected Operation of a Microgrid
Selma Cheshmeh khavar - Arya Abdolahi
Designing Of Type-2 Fuzzy Formation Controller For A Class Of Nonlinear Multiagent System Using JAYA Algorithm
Arvin Attar - Mohammad Ali Badamchizadeh - Sehraneh Ghaemi
Floquet model of spatiotemporally modulated graphene-based structures
Mahsa Valizadeh - Leila Yousefi - MirFaez Miri
A Dynamic Dijkstra-Based Method for Optimal Routing in Urban Traffic Networks
طه واجدسمیعی - منیره عبدوس
Enhancing SCGAN’s Disentangled Representation Learning with Contrastive SSIM Similarity Constraints
Iman Yazdanpanah - Ali Eslamian
High efficiency Continuous class J/B power amplifier design with 130% Fractional Bandwidth
Sara Aghajani - Mahmoud Kamarei - Marzieh Chegini
Higher-order semi-blind source separation approaches using Canonical Polyadic (CP) decomposition
Mohammad Jalilpour Monesi - Sepideh Hajipour Sardouie
Using a Novel Connection Triangle as a Classifier to Discriminate between Different Faults in the Frequency Response Analysis
Mohammad Hamed Samimi
Identifying Singular 2-D Systems Using 1-D Methods
Masoud Shafiee - Kamyar Azarakhsh
Design Of Observer-Based Nonlinear Controller For Tracking Maximum Power Point In The Solar Cell
Kobra Siahi - Mohammad Reza Arvan - Vahid Behnamgol - Mahdi Mosayebi