31st International Conference on Electrical Engineering
Vision Transformer and Parallel Convolutional Neural Network for Speech Emotion Recognition
Authors:
Saber Hashemi (1), Mohammad Asgari (2)
1- IRIB University
2- IRIB University
Keywords:
speech emotion recognition, vision transformer, convolutional neural network, attention mechanism
Abstract:
The vision transformer (ViT) is a recent approach to image processing tasks. It splits an image into patches and converts them into a sequence of vectors, a form suited to the transformer architecture. This paper applies ViT to speech emotion recognition. Unlike the standard ViT, which splits the image into square patches, we use time frames as patches. Alongside the frame-based ViT, which contributes its ability to learn global features, we use a parallel convolutional neural network that extracts local features and exploits the two-dimensional structure of the input. Mel-frequency cepstral coefficients (MFCCs) extracted from the audio files serve as input to the proposed network. On the RAVDESS dataset, this model achieves an unweighted accuracy of 79.2%.
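The difference between square patches and time-frame patches can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `frame_patches` and `square_patches`, the 40 x 200 MFCC matrix, and the patch size of 8 are all assumptions chosen for illustration, and the linear projection, position embeddings, and transformer layers that would follow are omitted.

```python
import numpy as np


def frame_patches(mfcc: np.ndarray) -> np.ndarray:
    """Treat each time frame (column) of an MFCC matrix as one token.

    mfcc has shape (n_coeffs, n_frames); the result has shape
    (n_frames, n_coeffs), i.e. one vector per time frame.
    """
    return mfcc.T.copy()


def square_patches(image: np.ndarray, p: int) -> np.ndarray:
    """Standard ViT-style patching for comparison.

    Splits a 2-D array into non-overlapping p x p squares (row-major
    order) and flattens each square into a vector of length p * p.
    """
    h, w = image.shape
    if h % p or w % p:
        raise ValueError("image dimensions must be divisible by p")
    return (
        image.reshape(h // p, p, w // p, p)
        .transpose(0, 2, 1, 3)  # group the two block indices together
        .reshape(-1, p * p)
    )


# Hypothetical input: 40 MFCC coefficients over 200 time frames.
mfcc = np.arange(40 * 200, dtype=np.float32).reshape(40, 200)

frames = frame_patches(mfcc)       # one token per time frame
squares = square_patches(mfcc, 8)  # one token per 8 x 8 square

print(frames.shape)   # (200, 40)
print(squares.shape)  # (125, 64)
```

With frame patches, every token spans the full set of coefficients at one instant, so the token sequence follows the time axis of the utterance; with square patches, each token covers only a band of coefficients over a short time window.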
List of papers
List of archived papers
Non-contact Radar Technology and Machine Learning for Automated Sleep Apnea-Hypopnea Syndrome Detection
Saman Faridsoltani - Mohaddeseh Sadeghi - Zahra Rahmani - Somayyeh Chamaani
LSTM and Markov-Based Mobility Prediction for Multi-access Edge Computing
Hadi Ghavaminejad - Nasser Yazdani - Golboo Rashidi
Kalman Filter Fusion Based on Interactive Multiple Model for Target Tracking in Wireless Sensor Networks
Zahra Zamani - Behrouz Safarinejadian
Analytical Model for Estimating the Range of Troposcatter Active Radar
Mahdi Shiri - Mohammadreza Edalatzadeh
Analysis of Atrial Fibrillatory Waves for AF Classification Using Wavelet Leaders
Sara Mihandoost
Adaptive Smooth Super Twisting Sliding Mode Control for Parkinson's Tremor Treatment
Reyhaneh Valibeik - Fatemeh Jahangiri - Mostafa Abedi
Photonic Crystal-based Plasmonic Biosensor with Low-cost and High-sensitivity Properties
Mahdieh Ahmadi Motlagh - Mahdieh Bozorgi - Mahmood Rafaei-Booket
Family of Multifunctional Controllable Converters for Grid, Battery, and PV-Powered EV Charging Station Applications
Homayon Soltani Gohari - Amir Safaeinasab - Karim Abbaszadeh
A Non-Isolated Common Ground Dual-Input DC-DC Converter with a High Voltage Gain for Photovoltaic Power Generation Systems
Hamed Abdi - Naghi Rostami - Ebrahim Babaei
A Novel Tunable LC Filter For Ultra High Frequency Applications
Davoud Razaghpour - Mir Majid Ghasemi - Amir Fathi