31st International Conference on Electrical Engineering
Vision Transformer and Parallel Convolutional Neural Network for Speech Emotion Recognition
Authors:
Saber Hashemi (1)
Mohammad Asgari (2)
1- IRIB University
2- IRIB University
Keywords:
speech emotion recognition, vision transformer, convolutional neural network, attention mechanism
Abstract:
The vision transformer (ViT) is a recent approach to image processing tasks. It splits an image into patches and converts them into a sequence of vectors, a form that suits the transformer architecture. This paper applies the ViT method to speech emotion recognition. Unlike the original ViT, which splits the image into square patches, we use time frames as patches. Alongside the frame-based ViT, which we use for its ability to learn global features, we employ a convolutional neural network that extracts local features and focuses on the two-dimensional structure of the input. Mel-frequency cepstral coefficients (MFCCs) extracted from the audio files serve as input to the proposed network. On the RAVDESS dataset, this model achieves an unweighted accuracy of 79.2%.
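The difference between square patches and time-frame patches can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the MFCC matrix size (40 coefficients by 200 frames), the embedding width of 64, and the random projection matrix are all assumptions chosen for illustration, and random data stands in for real MFCC features.

```python
import numpy as np


def frame_patches(mfcc):
    """Treat each time frame as one token.

    mfcc has shape (n_coeffs, n_frames); the result has shape
    (n_frames, n_coeffs), i.e. one row per time frame.
    """
    return mfcc.T.copy()


def square_patches(image, p):
    """Split a 2-D array into non-overlapping p x p patches, as in the
    original ViT, and flatten each patch into a vector of length p * p."""
    h, w = image.shape
    if h % p != 0 or w % p != 0:
        raise ValueError("image dimensions must be divisible by the patch size")
    blocks = image.reshape(h // p, p, w // p, p)   # (row block, row, col block, col)
    blocks = blocks.transpose(0, 2, 1, 3)          # (row block, col block, row, col)
    return blocks.reshape(-1, p * p)


# Hypothetical sizes: 40 MFCC coefficients, 200 time frames.
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((40, 200))

tokens = frame_patches(mfcc)                  # (200, 40): one token per frame
projection = rng.standard_normal((40, 64)) * 0.02  # stand-in for a learned linear embedding
embedded = tokens @ projection                # (200, 64): sequence fed to the transformer

# For comparison, square patching of a small 4 x 4 array with p = 2.
demo = np.arange(16).reshape(4, 4)
demo_patches = square_patches(demo, 2)        # (4, 4): four patches of four values each
```

With frame patches, the sequence length equals the number of time frames and each token spans the full coefficient axis, so the transformer attends across time rather than across a two-dimensional grid of patches.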