32nd International Conference on Electrical Engineering
A novel clustering-based over-sampling technique for imbalanced data sets
Authors:
Behzad Mirzaei (1)
Hossein Nezamabadi-pour (2)
Javad Mahmoodi (3)
1- Shahid Bahonar University of Kerman
2- Shahid Bahonar University of Kerman
3- Shahid Bahonar University of Kerman
Keywords:
Imbalanced data, Clustering, K-means algorithm, Over-sampling, Preprocessing methods
Abstract:
Classifying imbalanced data is one of the most challenging problems in machine learning. The problem arises when samples are distributed unevenly among the classes, so that one class (the minority or positive class) has far fewer samples than the other (the majority or negative class). Classical classifiers are ill-suited to data sets of this nature. To address this shortcoming, we present a novel clustering-based over-sampling technique. First, the k-means clustering algorithm is used to cluster the minority class samples. Then, sparse clusters, those containing fewer samples, are chosen. Finally, we use the nearest neighbor of each cluster center to create synthetic samples for the minority class. During over-sampling, we apply the roulette wheel selection operator so that clusters are selected probabilistically. The C4.5 decision tree classifier is used in our experiments, and the F-measure is the evaluation criterion. According to the results, our method outperforms six other methods on fifteen imbalanced data sets.
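The abstract gives only the outline of the procedure, so the following is a minimal sketch of one way the three steps might fit together, not the authors' implementation. Several details are assumptions made for illustration: the function names `kmeans` and `oversample` are hypothetical, the roulette wheel weights are taken to be inversely proportional to cluster size, and each synthetic sample is produced by SMOTE-style interpolation between a cluster center and its nearest member.

```python
import math
import random


def kmeans(points, k, iters=20, rng=random):
    """Plain Lloyd's k-means; returns (centers, clusters), empty clusters dropped."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members (keep it if empty).
        centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
    kept = [(c, m) for c, m in zip(centers, clusters) if m]
    return [c for c, _ in kept], [m for _, m in kept]


def oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic minority samples under the assumptions stated above."""
    rng = random.Random(seed)
    centers, clusters = kmeans(minority, k, rng=rng)
    # ASSUMPTION: weights inversely proportional to cluster size, so that
    # sparse clusters are drawn more often by the roulette wheel.
    weights = [1.0 / len(members) for members in clusters]
    synthetic = []
    for _ in range(n_new):
        # Roulette wheel selection: draw one cluster index with the given weights.
        i = rng.choices(range(len(clusters)), weights=weights)[0]
        center = centers[i]
        neighbor = min(clusters[i], key=lambda p: math.dist(p, center))
        # ASSUMPTION: interpolate between the center and its nearest member.
        t = rng.random()
        synthetic.append(tuple(c + t * (n - c) for c, n in zip(center, neighbor)))
    return synthetic
```

Calling `oversample(minority, n_new=5, k=2)` on a list of equal-length numeric tuples returns five new tuples of the same dimension. Because each one lies on the segment between a cluster mean and one of that cluster's members, the synthetic samples stay inside the convex hull of the minority class.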