32nd International Conference on Electrical Engineering
A novel clustering-based over-sampling technique for imbalanced data sets
Authors:
Behzad Mirzaei (1)
Hossein Nezamabadi-pour (2)
Javad Mahmoodi (3)
1- Shahid Bahonar University of Kerman
2- Shahid Bahonar University of Kerman
3- Shahid Bahonar University of Kerman
Keywords:
Imbalanced data, Clustering, K-means algorithm, Over-sampling, Preprocessing methods
Abstract:
One of the most challenging problems in machine learning is the classification of imbalanced data. This problem arises when samples are distributed unevenly among the classes, so that one class (the minority or positive class) has far fewer samples than the other (the majority or negative class). Classical classifiers are poorly suited to data sets of this nature. To address this shortcoming in class-imbalance situations, we present a novel clustering-based over-sampling technique in this paper. First, the k-means clustering algorithm is used to cluster the minority class samples. Then, sparse clusters containing fewer samples are chosen. Finally, we use the nearest neighbor of each cluster center to create synthetic samples for the minority class. To select clusters probabilistically, we apply the roulette wheel selection operator during over-sampling. The C4.5 decision tree classifier is used in our experiments, and methods are evaluated with the F-measure criterion. According to the results, our method outperforms six other methods over fifteen imbalanced data sets.
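The three steps named in the abstract (k-means on the minority class, roulette wheel selection of sparse clusters, synthetic samples from the nearest neighbor of a cluster center) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the selection probabilities or the generation formula, so the code assumes a weight of 1 / cluster size and SMOTE-style linear interpolation between a cluster center and its nearest minority sample. All function names (`kmeans`, `roulette_pick`, `oversample`) and parameters are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans(points, k, iters=20, rng=random):
    """Plain Lloyd's k-means. Returns (centers, clusters); empty clusters are dropped."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[idx].append(p)
        clusters = [c for c in clusters if c]
        centers = [mean(c) for c in clusters]
    return centers, clusters

def roulette_pick(weights, rng=random):
    """Roulette wheel selection: index i is drawn with probability weights[i] / sum(weights)."""
    r = rng.uniform(0, sum(weights))
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1  # guard against floating-point round-off

def oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (assumed scheme, see lead-in)."""
    rng = random.Random(seed)
    centers, clusters = kmeans(minority, k, rng=rng)
    # ASSUMPTION: sparser clusters get a larger slice of the wheel (1 / size).
    weights = [1.0 / len(c) for c in clusters]
    synthetic = []
    for _ in range(n_new):
        i = roulette_pick(weights, rng)
        center = centers[i]
        # Nearest minority sample to the chosen cluster center.
        neighbor = min(clusters[i], key=lambda p: sq_dist(p, center))
        # ASSUMPTION: interpolate between center and neighbor with a random gap in [0, 1).
        gap = rng.random()
        synthetic.append(tuple(c + gap * (n - c) for c, n in zip(center, neighbor)))
    return synthetic

# Toy minority class: one dense group and two sparse groups.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0),
            (5.0, 5.0), (5.2, 4.9), (9.0, 1.0), (8.8, 1.2)]
synthetic = oversample(minority, n_new=6, k=3, seed=0)
print(synthetic)
```

Because each synthetic point lies on the segment between a cluster mean and one of that cluster's members, it stays inside the region already occupied by the minority class. A cluster with a single sample simply yields copies of that sample under this assumed scheme.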