The 32nd International Conference on Electrical Engineering
A novel clustering-based over-sampling technique for imbalanced data sets
Authors:
Behzad Mirzaei (1), Hossein Nezamabadi-pour (2), Javad Mahmoodi (3)
1- Shahid Bahonar University of Kerman
2- Shahid Bahonar University of Kerman
3- Shahid Bahonar University of Kerman
Keywords:
Imbalanced data, Clustering, K-means algorithm, Over-sampling, Preprocessing methods
Abstract:
One of the most challenging problems in machine learning is the classification of imbalanced data. This problem arises when the samples are distributed unevenly among the classes, so that one class (the minority or positive class) has far fewer samples than the other (the majority or negative class). Classical classifiers are ill-suited to data sets of this nature. To address this shortcoming in class-imbalance situations, we present a novel clustering-based over-sampling technique in this paper. First, the k-means clustering algorithm is used to cluster the minority class samples. Then, sparse clusters, that is, those containing fewer samples, are chosen. Finally, we use the nearest neighbor of each cluster center to create synthetic samples for the minority class. To select clusters based on probabilities, we apply the roulette wheel selection operator during over-sampling. The C4.5 decision tree classifier is used in our experiments, and the F-measure is used to evaluate the methods. According to the results, our method outperforms six other methods on fifteen imbalanced data sets.
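The steps in the abstract can be sketched roughly as follows. This is only an illustrative reading, not the authors' implementation: the abstract does not give the selection probabilities or the sample-generation formula, so the sketch assumes that a cluster's roulette-wheel probability is inversely proportional to its size, and that each synthetic sample is a random interpolation between a cluster center and its nearest member (in the style of SMOTE). The function names `kmeans`, `assign`, and `cluster_oversample` and all parameters are hypothetical.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest center for every row of X (squared Euclidean)."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means, initialised from k distinct data points."""
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:          # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign(X, centers)

def cluster_oversample(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (assumed scheme, see lead-in)."""
    rng = np.random.default_rng(seed)
    centers, labels = kmeans(X_min, k, rng)
    sizes = np.bincount(labels, minlength=k)
    # Roulette wheel weights: sparser clusters get a larger slice (assumption).
    # Empty clusters get weight zero so they are never selected.
    weights = np.where(sizes > 0, 1.0 / np.maximum(sizes, 1), 0.0)
    probs = weights / weights.sum()
    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        j = rng.choice(k, p=probs)        # spin the wheel
        members = X_min[labels == j]
        # Nearest minority sample to the chosen cluster center.
        nearest = members[((members - centers[j]) ** 2).sum(axis=1).argmin()]
        # Interpolate between the center and that neighbor (assumption).
        gap = rng.random()
        synthetic[i] = centers[j] + gap * (nearest - centers[j])
    return synthetic

# Toy minority class: a dense group and a sparse group.
X_min = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.1, 0.1],
                  [0.3, 0.2], [0.2, 0.3], [5.0, 5.0], [5.5, 4.5]])
new_points = cluster_oversample(X_min, n_new=5, k=3, seed=0)
print(new_points.shape)  # (5, 2)
```

Because every synthetic point lies on a segment between a cluster center and one of that cluster's members, the new samples stay inside the region already occupied by the minority class. Any real use would append them to the training set before fitting the classifier.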