31st International Conference on Electrical Engineering
Vision Transformer and Parallel Convolutional Neural Network for Speech Emotion Recognition
Authors:
Saber Hashemi (1)
Mohammad Asgari (2)
1- IRIB University
2- IRIB University
Keywords:
speech emotion recognition, vision transformer, convolutional neural network, attention mechanism
Abstract:
The vision transformer (ViT) is a recent approach to image processing tasks. It splits an image into patches and converts them into a sequence of vectors, which suits the transformer structure. This paper applies the ViT method to speech emotion recognition. Unlike the original ViT, which splits the image into square patches, we use time frames as patches. Alongside the frame-based ViT, which we use for its ability to learn global features, we use a convolutional neural network in parallel. The convolutional neural network extracts local features and focuses on the two-dimensional structure of the input. Mel-frequency cepstral coefficients (MFCCs) extracted from the audio files are the input to the proposed neural network. With this model, we achieved an unweighted accuracy of 79.2% on the RAVDESS dataset.
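The difference between square patches and time-frame patches can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `square_patches` and `frame_patches`, the toy 4 x 6 matrix, and the 2 x 2 patch size are all assumptions made for illustration, and the real model would embed the resulting tokens before the transformer.

```python
def square_patches(image, patch):
    """Image-style tokenization: flatten non-overlapping patch x patch squares."""
    rows, cols = len(image), len(image[0])
    if rows % patch or cols % patch:
        raise ValueError("image size must be divisible by patch size")
    tokens = []
    for r in range(0, rows, patch):
        for c in range(0, cols, patch):
            tokens.append(
                [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            )
    return tokens


def frame_patches(mfcc):
    """Frame-based tokenization: one token per time frame (one matrix column)."""
    n_frames = len(mfcc[0])
    if any(len(row) != n_frames for row in mfcc):
        raise ValueError("all coefficient rows must have the same number of frames")
    return [[row[t] for row in mfcc] for t in range(n_frames)]


# Toy stand-in for an MFCC matrix (hypothetical values):
# 4 coefficients (rows) x 6 time frames (columns).
mfcc = [[10 * k + t for t in range(6)] for k in range(4)]

frame_tokens = frame_patches(mfcc)             # each token spans all coefficients at one time step
square_tokens = square_patches(mfcc, patch=2)  # each token spans a 2 x 2 time-frequency region

print(len(frame_tokens), len(frame_tokens[0]))
print(len(square_tokens), len(square_tokens[0]))
```

In this sketch, every frame token holds the full set of coefficients for one instant, so the token sequence follows the time axis of the utterance, whereas a square patch mixes neighboring frames and covers only part of the coefficient range.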
List of papers
Improving Wind Turbines Blades Damage detection by using YOLO BoF and BoS
Reza Mohammadi - Saeed Sharifian
Determining the Optimal Line Arrangement to Reduce Collar Erosion of Concrete Poles Caused by Wind Stresses
میثم پوراحمدی نخلی - حمیدرضا فیروزآبادی
Vessel Segmentation in Retinal Images Using an Adaptive Thresholding Method Based on Local and Global Information
زهرا نورانی آتشگاه - محمد آراسته - آیدا فولادی وندا
Design, Prototyping and Performance Analysis of a Barometric-Based Soft Force Sensor
Mohammad Reza SheykhAzimi - Mohammad Reza Nayeri - Mehdi Tale Masouleh - Ahmad Kalhor
Sensitive RSNs to Schizophrenia; A graph parameter approach
Shirin Karimian - Farzaneh Keyvanfard - Abbas Nasiraei Moghaddam
Design and Implementation of a Data-Driven Controller for a Two-Wheeled Self-Balancing Robot
Mohammad Akhavan - Haniye Parvahan - Mojtaba Nouri Manzar
Identifying Influential Nodes in Complex Networks by Multiple Attributes Model
Shima Esfandiari - Mostafa Fakhrahmad
The Theory of A Novel Circuit Design For An Identification Processor Using 0.35µm CMOS Technology
Rouhollah Mohammadinasr - Kheirollah Hadidi - Farhad Piri
Design and Application of a Five-Level Cross-Switched Inverter in Low-Voltage Distribution System Voltage Compensation
Mohammad Farhadi-kangarlu - Yousef Neyshabouri - Asra Sotudeh
On the Correction of the Boundary Deformation Errors in Microwave Imaging With Spatial Priors
Seyyed Mohammad Hosseini - Amir Ahmad Shishegar