29th Iranian Conference on Electrical Engineering
PAVID-CVs: Persian Audio-Visual Database of CV syllables
Authors:
Mahsa Hedayatipour (1), Yasser Shekofteh (2), Mohsen Ebrahimi Moghaddam (3)
1- Shahid Beheshti University
2- Shahid Beheshti University
3- Shahid Beheshti University
Keywords:
Visual Speech Recognition, Lip Reading, CV syllables, Visyllable, Audio-Visual Database, Persian/Farsi Language.
Abstract:
Lip-reading is a visual speech recognition process. In this process, recognizing the smaller units of speech can serve as the basis for recognizing larger units of a language, such as words. In this paper, we introduce a Persian (Farsi) Audio-Visual Database of CV syllables, named PAVID-CVs. It is a set of isolated two-phoneme visyllables, together with isolated words related to those visyllables and composed only of Persian CV syllables, intended for lip-reading or audio-visual speech recognition tasks such as isolated word recognition. Because the data are tagged, the dataset can be used with machine learning-based methods. We explain the steps of preparing the database, which contains about 30 hours of data from 40 speakers. Initial experiments use hidden Markov models (HMMs) as visyllable classifiers. These models are then used for visual recognition of 6 Persian words with different numbers of syllables, and an accuracy of 47.37% is obtained in a speaker-independent experiment.
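As a rough illustration of HMM-based classification of the kind the abstract mentions, the sketch below scores an observation sequence against one HMM per class with the forward algorithm and picks the most likely class. It is a toy under stated assumptions, not the authors' setup: the discrete lip-shape symbols, the two class labels, and all probabilities in `MODELS` are hypothetical, and the paper's actual features and model topology are not given here.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm: log P(obs | model) for a discrete-emission HMM.

    start[i]    : P(first state = i)
    trans[i][j] : P(next state = j | current state = i)
    emit[i][k]  : P(symbol k | state i)
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    return logsumexp(alpha)


def classify(obs, models):
    """Return the label whose HMM gives obs the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))


# Hypothetical two-state, left-to-right models over three quantized
# lip-shape symbols (0 = closed, 1 = half-open, 2 = open). All numbers
# are made up for illustration.
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
MODELS = {
    "ba": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]),
    "sa": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]),
}

print(classify([0, 0, 2, 2], MODELS))  # prints: ba
```

In a word-level setting, the same maximum-likelihood rule could in principle be applied to concatenated syllable models, though how the paper combines visyllable models into words is not described in this abstract.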