32nd International Conference on Electrical Engineering
A novel clustering-based over-sampling technique for imbalanced data sets
Authors:
Behzad Mirzaei (1), Hossein Nezamabadi-pour (2), Javad Mahmoodi (3)
1- Shahid Bahonar University of Kerman
2- Shahid Bahonar University of Kerman
3- Shahid Bahonar University of Kerman
Keywords:
Imbalanced data, Clustering, K-means algorithm, Over-sampling, Preprocessing methods
Abstract:
Classifying imbalanced data is one of the most challenging problems in machine learning. The problem arises when samples are distributed unevenly among the classes, so that one class (the minority or positive class) has far fewer samples than the other (the majority or negative class). Classical classifiers are poorly suited to data sets of this nature. To address this shortcoming, we present a novel clustering-based over-sampling technique. First, the k-means algorithm clusters the minority class samples. Then, sparse clusters, those containing fewer samples, are chosen. Finally, the nearest neighbor of each cluster center is used to create synthetic samples for the minority class. During over-sampling, the roulette wheel selection operator is applied so that clusters are selected according to probabilities. The experiments use the C4.5 decision tree classifier, and methods are evaluated with the F-measure criterion. According to the results, our method outperforms six other methods over fifteen imbalanced data sets.
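The three steps in the abstract (cluster the minority class, favor sparse clusters, synthesize near cluster centers) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the selection probabilities or the exact synthesis rule, so the sketch assumes a roulette-wheel weight of 1 / cluster size and a SMOTE-style random interpolation between a cluster center and its nearest member. The names `kmeans` and `cluster_oversample` and the parameters `n_new`, `k`, and `seed` are hypothetical.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means on X; returns (centers, labels)."""
    # Start from k distinct samples chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign every sample to its closest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    # Final assignment so labels agree with the returned centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)


def cluster_oversample(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (illustrative sketch only)."""
    X_min = np.asarray(X_min, dtype=float)
    rng = np.random.default_rng(seed)
    centers, labels = kmeans(X_min, k, rng)

    # ASSUMPTION: sparser clusters get a larger slice of the roulette wheel
    # (weight = 1 / size). Empty clusters get weight 0.
    sizes = np.bincount(labels, minlength=k)
    weights = np.where(sizes > 0, 1.0 / np.maximum(sizes, 1), 0.0)
    probs = weights / weights.sum()

    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        j = rng.choice(k, p=probs)  # roulette wheel selection of a cluster
        members = X_min[labels == j]
        # Nearest neighbor of the cluster center among the cluster's members.
        nearest = members[np.linalg.norm(members - centers[j], axis=1).argmin()]
        # ASSUMPTION: interpolate at a random point between center and neighbor.
        gap = rng.random()
        synthetic[i] = centers[j] + gap * (nearest - centers[j])
    return synthetic
```

A call such as `cluster_oversample(X_min, n_new=50, k=3)` returns an array of shape `(50, n_features)` that could be stacked onto the minority class before training a classifier. Because every synthetic point lies on a segment between a cluster center and one of its members, the new samples stay inside the region already covered by the minority class. A cluster with a single member yields a copy of that member.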
List of papers
List of archived papers
Analysis of the RCS of Luneburg Reflector in Bistatic Mode
Mohammad Amin Abdollahi - Gholamreza Moradi
An Ensemble Model for Sleep Stages Classification
Sahar Hassanzadeh Mostafaei - Jafar Tanha - Amir Sharafkhaneh - Zohair Hassanzadeh Mostafaei - Mohammed Hussein Ali Al-jaf - Alireza Fakhim babaei
Evaluation Study of Different Integration Methods of LCC Compensation Network for Various Types of Magnetic Structures of Wireless Power Transfer
Nima Rasekh - Navid Rasekh - Mojtaba Mirsalim
Design and Simulation of Ultra High power X-band Rotary Joint with a Matching Choke
Mohammad Bod - Seyed mohammad Hashemi
Application of Statistical Techniques and Machine Learning in Forecasting Distribution Network Load: A Real Case Study on the Iranian Power System
Hossein Jafari - Mohammad Sadegh Sepasian - Fatemeh Teimori
Multi-Agent Systems for Quadcopter under Nonlinear Dynamics and Actuator Modeling with MPC and LQR Controller
Navid Mohammadi - Saeed Khankalantary
A Third-Order Noise-Shaping SAR ADC With Optimized NTF Zeros for IOT Applications
Mansoure Yousefirad - Mohammad Yavari
ZYNQ Based Real-Time Data Logger with 256 kSPS Sampling using Ethernet Interface
Alireza Eteghad - Ataollah Panahgholi - Esmaeil Najafiaghdam
Developing a superlens with High Resolution using Quantum Dot Nano-Particles
Amin Monemian Esfahani - Leila Yousefi
A Bi-Level Attack-Defense Model for the Forecasting False Data Injection Attacks on the Integrated Energy Systems
Maryam Azimi - Hamed Delkhosh - Mahdi Ghaedi