29th Iranian Conference on Electrical Engineering
PAVID-CVs: Persian Audio-Visual Database of CV syllables
Authors:
Mahsa Hedayatipour (1), Yasser Shekofteh (2), Mohsen Ebrahimi Moghaddam (3)
1- Shahid Beheshti University
2- Shahid Beheshti University
3- Shahid Beheshti University
Keywords:
Visual Speech Recognition, Lip Reading, CV syllables, Visyllable, Audio-Visual Database, Persian/Farsi Language.
Abstract:
Lip-reading is the process of recognizing speech from visual information. In this process, recognizing smaller units of speech can serve as the basis for recognizing larger units of a language, such as words. In this paper, we introduce a Persian (Farsi) Audio-Visual Database of CV syllables, named PAVID-CVs. It consists of isolated two-phoneme visyllables and isolated words related to those visyllables, which contain only Persian CV syllables, and it is intended for lip-reading or audio-visual speech recognition tasks such as isolated word recognition. Because the data are annotated with useful tag information, the dataset is suitable for machine learning-based methods. We describe the steps taken to prepare the database, which contains about 30 hours of data from 40 speakers. In initial experiments, hidden Markov models (HMMs) were used as visyllable classifiers. These models were then used for visual recognition of 6 Persian words with different numbers of syllables, achieving an accuracy of 47.37% in a speaker-independent experiment.