31st International Conference on Electrical Engineering
Vision Transformer and Parallel Convolutional Neural Network for Speech Emotion Recognition
Authors:
Saber Hashemi (1), Mohammad Asgari (2)
1- IRIB University
2- IRIB University
Keywords:
speech emotion recognition, vision transformer, convolutional neural network, attention mechanism
Abstract:
Vision transformer (ViT) is a recent approach to image processing tasks. It splits an image into patches and converts them into a sequence of vectors, a form suited to the transformer architecture. This paper applies ViT to speech emotion recognition. Unlike the original ViT, which splits the image into square patches, we use time frames as patches. Alongside the frame-based ViT, which is used for its ability to learn global features, we use a convolutional neural network that extracts local features and exploits the two-dimensional structure of the input. Mel-frequency cepstral coefficients (MFCCs) extracted from the audio files serve as input to the proposed network. On the RAVDESS dataset, this model achieves an unweighted accuracy of 79.2%.
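The sketch below illustrates two points from the abstract: how time frames can serve as patches in place of square patches, and how an unweighted accuracy can be computed. It is not the authors' implementation. The function names, the toy 4 x 6 MFCC matrix, and the patch size are hypothetical. Unweighted accuracy is assumed here to follow the common convention in speech emotion recognition, the mean of per-class recalls, which the abstract does not define.

```python
from typing import List, Sequence

# Rows are MFCC coefficients, columns are time frames (assumed layout).
Matrix = List[List[float]]


def frame_patches(mfcc: Matrix) -> Matrix:
    """One token per time frame: each column of the matrix becomes a vector."""
    n_frames = len(mfcc[0])
    return [[row[t] for row in mfcc] for t in range(n_frames)]


def square_patches(mfcc: Matrix, size: int) -> Matrix:
    """Square-patch tokens: non-overlapping size x size blocks, flattened row by row."""
    n_rows, n_cols = len(mfcc), len(mfcc[0])
    tokens = []
    for r in range(0, n_rows - size + 1, size):
        for c in range(0, n_cols - size + 1, size):
            tokens.append(
                [mfcc[r + i][c + j] for i in range(size) for j in range(size)]
            )
    return tokens


def unweighted_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recalls, so every class counts equally (assumed definition)."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Toy input: 4 coefficients x 6 frames, with entry value 10 * coefficient + frame.
mfcc = [[float(10 * k + t) for t in range(6)] for k in range(4)]
frame_tokens = frame_patches(mfcc)       # 6 tokens, each spanning all coefficients
square_tokens = square_patches(mfcc, 2)  # 6 tokens, each a local 2 x 2 block
```

In this toy case each frame token spans every coefficient at a single time step, while each square token covers only a local 2 x 2 region. In a ViT-style model, either kind of token would then be linearly projected and combined with positional information before entering the transformer encoder.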