29th Iranian Conference on Electrical Engineering
PAVID-CVs: Persian Audio-Visual Database of CV syllables
Authors:
Mahsa Hedayatipour (1), Yasser Shekofteh (2), Mohsen Ebrahimi Moghaddam (3)
1- Shahid Beheshti University
2- Shahid Beheshti University
3- Shahid Beheshti University
Keywords:
Visual Speech Recognition, Lip Reading, CV syllables, Visyllable, Audio-Visual Database, Persian/Farsi Language.
Abstract:
Lip-reading is a visual speech recognition process in which recognizing smaller units of speech can serve as the basis for recognizing larger units of a language, such as words. In this paper, we introduce PAVID-CVs, a Persian (Farsi) Audio-Visual Database of CV syllables. It consists of isolated two-phoneme visyllables and isolated words related to those visyllables, all of which contain only Persian CV syllables, and it is intended for lip-reading or audio-visual speech recognition tasks such as isolated word recognition. Because the data are tagged, the dataset can be used with machine learning-based methods. We explain the steps taken to prepare the database, which contains about 30 hours of data from 40 speakers. Initial experiments use hidden Markov models (HMMs) as a visyllable classifier. These models were then used for visual recognition of 6 Persian words with different numbers of syllables, and an accuracy of 47.37% was obtained in a speaker-independent experiment.
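To illustrate the general idea of using one HMM per visyllable as a classifier, here is a minimal sketch. It is not the authors' implementation: the abstract does not describe the visual features, model topology, or toolkit. The sketch assumes discrete observation symbols (hypothetical quantized lip-shape codes 0, 1, 2), two-state left-to-right models, and hand-picked toy parameters for two hypothetical classes labeled "ba" and "si". Each model scores a sequence with the forward algorithm in the log domain, and the highest-scoring label wins.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm for a discrete-observation HMM, in the log domain.

    obs   : list of observation symbol indices
    start : start[s]    = P(first state is s)
    trans : trans[p][s] = P(next state is s | current state is p)
    emit  : emit[s][o]  = P(symbol o | state s)
    """
    n = len(start)
    alpha = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + _log(trans[p][s]) for p in range(n)])
            + _log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(obs, models):
    """Return the label whose HMM gives the sequence the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))


# Toy, hand-picked parameters (assumptions for illustration, not trained values).
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
MODELS = {
    "ba": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]),
    "si": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.8, 0.1], [0.8, 0.1, 0.1]]),
}

if __name__ == "__main__":
    print(classify([0, 0, 2, 2], MODELS))
    print(classify([1, 1, 0, 0], MODELS))
```

In practice the parameters would be estimated from training data rather than written by hand, and real visual features are usually continuous, which would call for Gaussian or mixture emission densities instead of the discrete table assumed here.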