31st International Conference on Electrical Engineering
Vision Transformer and Parallel Convolutional Neural Network for Speech Emotion Recognition
Authors:
Saber Hashemi (1), Mohammad Asgari (2)
1- IRIB University
2- IRIB University
Keywords:
speech emotion recognition, vision transformer, convolutional neural network, attention mechanism
Abstract:
Vision transformer (ViT) is a recent approach to image processing tasks. It splits an image into patches and converts them into a sequence of vectors suitable for the transformer architecture. This paper applies ViT to speech emotion recognition. Unlike the standard ViT, which splits the image into square patches, we use time frames as patches. Alongside the frame-based ViT, which learns global features, we use a parallel convolutional neural network (CNN) that extracts local features and exploits the two-dimensional structure of the input. Mel-frequency cepstral coefficients (MFCCs) extracted from the audio files serve as input to the proposed network. On the RAVDESS dataset, this model achieves an unweighted accuracy of 79.2%.
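The difference between square patches and time-frame patches can be illustrated with a minimal sketch. This is not the authors' implementation: the MFCC matrix is random placeholder data, and the sizes (40 coefficients, 200 frames, embedding width 64), the function names, and the random projection and positional terms are all assumptions chosen only to show the tensor shapes involved.

```python
import numpy as np


def frames_as_patches(mfcc):
    """Treat each time frame (one column of the MFCC matrix) as one token."""
    # mfcc: (n_coeffs, n_frames) -> tokens: (n_frames, n_coeffs)
    return mfcc.T.copy()


def square_patches(image, patch):
    """Standard ViT-style split into non-overlapping patch x patch squares."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "sizes must divide evenly"
    # Group rows and columns into blocks, then flatten each block to a vector.
    blocks = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return blocks.reshape(-1, patch * patch)


def embed(tokens, weight, pos):
    """Linear projection plus positional term (both random here)."""
    return tokens @ weight + pos


rng = np.random.default_rng(0)

# Hypothetical input: 40 MFCCs over 200 frames (placeholder values).
mfcc = rng.standard_normal((40, 200))

tokens = frames_as_patches(mfcc)                    # shape (200, 40)

d_model = 64                                        # assumed embedding width
weight = rng.standard_normal((40, d_model)) * 0.02  # stand-in for learned weights
pos = rng.standard_normal((200, d_model)) * 0.02    # stand-in for learned positions
sequence = embed(tokens, weight, pos)               # shape (200, 64)

# For contrast: 8 x 8 square patches of the same matrix.
squares = square_patches(mfcc, 8)                   # shape (125, 64)
```

With frame patches, the sequence length equals the number of time frames and each token carries the full set of coefficients for one instant, so attention operates across time. With square patches, each token mixes a few coefficients over a few frames.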
List of papers
List of archived papers
Optimization of a three-phase Induction Motor for Electric Vehicles Based on Hooke-Jeeves Optimization Method
Arash Mousaei - Naghi Rostami - Mohammad Bagher Bannae Sharifian
A Single-Switch Single-Inductor High Step-Up DC-DC Converter with Single-Input and Dual-Output Ports
Ali Nadermohammadi - Saed Mahmoud Alilou - Mohammad Maalandish - Seyed Hossein Hosseini - Mehdi Abapour - Kazem Zare
Design and Control of a Novel Multi-port Bidirectional Buck-Boost Converter Suitable for Hybrid Electric Vehicle Charging Stations
Amir Safaeinasab - Homayon Soltani Gohari - Karim Abbaszadeh
Employing Integrated Quantum Photonic Computers for Gaussian Boson Sampling
Mehrdad Ghasemi - Hassan Kaatuzian - Houshyar Noshad - Mahmood Hassani - Mobin Motaharifar - Mahdi NoroozOliaei
Adaptive Control of Telerehabilitation Systems in The Framework of Multi-Agent Systems
Mohammadreza Sheykh - Heidar Ali Talebi - Iman Sharifi
Unsupervised Change Detection in SAR Images Using a Six-Branch CNN and Adaptive Window Approach
Abbas Kakoolvand - Maryam Imani - Hassan Ghassemian
An Enhanced Chaotic System Based Color Image Encryption using DNA Encoding
Mobin Vaziri - Mohammad Mehdi Rahimifar - Hadi Jahanirad
Enhanced Forward Model for Photoacoustic Imaging with Speed of Sound Compensation
Amirreza Jodeiry - Zahra Kavehvash
Joint Energy and Throughput Optimization in Energy Harvesting Cognitive Sensor Networks
Morteza Sharifi - Mahmood Mohassel Feghhi
Beam arrangement comparison for two-beam ultrasonic flowmeter
Rouzbeh Choupan - Moosa Ayati