Iranian Conference on Electrical Engineering

31st International Conference on Electrical Engineering

Vision Transformer and Parallel Convolutional Neural Network for Speech Emotion Recognition

Authors:

Saber Hashemi¹ Mohammad Asgari²

1- IRIB University 2- IRIB University

Keywords:

speech emotion recognition, vision transformer, convolutional neural network, attention mechanism

Abstract:

Vision transformer (ViT) is a recent approach to image processing tasks. It splits an image into patches and converts them into a sequence of vectors suitable for the transformer architecture. This paper applies the ViT method to speech emotion recognition. Unlike the original ViT, which splits the image into square patches, we use time frames as patches. Alongside the frame-based ViT, which learns global features, we use a parallel convolutional neural network that extracts local features and exploits the two-dimensional structure of the input. Mel-Frequency Cepstral Coefficients (MFCCs) extracted from the audio files serve as input to the proposed network. On the RAVDESS dataset, this model achieves an unweighted accuracy of 79.2%.
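The idea of using time frames as patches can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 40-coefficient by 300-frame input size, the embedding width of 64, and the random projection weights (standing in for learned parameters) are all assumptions made for illustration.

```python
import numpy as np

def frames_to_tokens(mfcc, frames_per_patch=1):
    """Split an MFCC matrix (n_coeffs, n_frames) into time-frame patches.

    Each patch spans every coefficient and `frames_per_patch` consecutive
    frames, and is flattened into one token vector.
    """
    n_coeffs, n_frames = mfcc.shape
    n_patches = n_frames // frames_per_patch           # drop any ragged tail
    trimmed = mfcc[:, :n_patches * frames_per_patch]
    # (n_coeffs, n_patches, frames_per_patch) -> (n_patches, n_coeffs * frames_per_patch)
    patches = trimmed.reshape(n_coeffs, n_patches, frames_per_patch)
    return patches.transpose(1, 0, 2).reshape(n_patches, -1)

def embed_tokens(tokens, embed_dim, rng):
    """Linear projection plus positional embedding.

    Random weights are placeholders for parameters a real model would learn.
    """
    n_patches, patch_dim = tokens.shape
    projection = rng.standard_normal((patch_dim, embed_dim)) * 0.02
    positions = rng.standard_normal((n_patches, embed_dim)) * 0.02
    return tokens @ projection + positions

rng = np.random.default_rng(0)
mfcc = rng.standard_normal((40, 300))    # hypothetical input: 40 MFCCs, 300 frames
tokens = frames_to_tokens(mfcc, frames_per_patch=1)
sequence = embed_tokens(tokens, embed_dim=64, rng=rng)
print(tokens.shape, sequence.shape)
```

With `frames_per_patch=1`, each token is exactly one column of the MFCC matrix, so the transformer attends across time steps rather than across square image regions. The resulting `sequence` is what a transformer encoder would consume.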

