29th Iranian Conference on Electrical Engineering
PAVID-CVs: Persian Audio-Visual Database of CV syllables
Authors:
Mahsa Hedayatipour (1), Yasser Shekofteh (2), Mohsen Ebrahimi Moghaddam (3)
1- Shahid Beheshti University
2- Shahid Beheshti University
3- Shahid Beheshti University
Keywords:
Visual Speech Recognition, Lip Reading, CV syllables, Visyllable, Audio-Visual Database, Persian/Farsi Language.
Abstract:
Lip-reading is a visual speech recognition process in which recognizing the smaller units of speech can serve as the basis for recognizing larger units of a language, such as words. In this paper, we introduce PAVID-CVs, a Persian (Farsi) Audio-Visual Database of CV syllables. It is a set of isolated two-phoneme visyllables and related isolated words that contain only Persian CV syllables, intended for lip-reading and audio-visual speech recognition tasks such as isolated word recognition. Because the data is tagged, the dataset can be used with machine learning-based methods. We describe the steps taken to prepare the database, which contains about 30 hours of data from 40 speakers. In initial experiments, hidden Markov models (HMMs) were used as visyllable classifiers. These models were then used for visual recognition of 6 Persian words with different numbers of syllables, and an accuracy of 47.37% was obtained in a speaker-independent experiment.
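As a rough illustration of what using HMMs as a visyllable classifier can look like, the sketch below scores an observation sequence under one HMM per syllable and picks the most likely one. It is not the authors' implementation: it assumes discrete observations (hypothetical quantized lip-shape codes), whereas a real system would likely model continuous visual features, and the syllable labels, the function names `log_likelihood` and `classify`, and all probabilities are made up for the example.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n_states = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob

def classify(obs, models):
    """Return the label whose HMM gives obs the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))

# Hypothetical lip-shape codes: 0 = closed, 1 = open, 2 = spread.
# Each toy model is (start, transition, emission) for a 2-state left-to-right
# HMM: state 0 stands for the consonant, state 1 for the vowel.
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
models = {
    "ba": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "bi": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]),
}

print(classify([0, 0, 1, 1, 1], models))  # closed lips, then open
print(classify([0, 2, 2, 2], models))     # closed lips, then spread
```

In this toy setup the two syllables share the same consonant model and differ only in the vowel state, so the decision rests on the lip shapes observed after the closure.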