29th Iranian Conference on Electrical Engineering
PAVID-CVs: Persian Audio-Visual Database of CV syllables
Authors:
Mahsa Hedayatipour (1), Yasser Shekofteh (2), Mohsen Ebrahimi Moghaddam (3)
1- Shahid Beheshti University
2- Shahid Beheshti University
3- Shahid Beheshti University
Keywords:
Visual Speech Recognition, Lip Reading, CV syllables, Visyllable, Audio-Visual Database, Persian/Farsi Language.
Abstract:
Lip-reading is a visual speech recognition process in which recognizing smaller units of speech can serve as the basis for recognizing larger units of a language, such as words. In this paper, we introduce a Persian (Farsi) Audio-Visual Database of CV syllables, named PAVID-CVs. It consists of isolated two-phoneme visyllables and isolated words related to those visyllables, which include only Persian CV syllables, and it is intended for lip-reading or audio-visual speech recognition tasks such as isolated word recognition. Because the data is tagged, the dataset can be used with machine learning-based methods. We explain the steps of preparing the database, which contains about 30 hours of data from 40 speakers. In initial experiments, hidden Markov models (HMMs) are used as a visyllable classifier. These models are then used for visual recognition of 6 Persian words with different numbers of syllables, and an accuracy of 47.37% is obtained in a speaker-independent experiment.
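As a rough illustration of what using HMMs as a visyllable classifier can look like, the sketch below scores an observation sequence against one HMM per class with the forward algorithm and picks the most likely class. It is not the authors' implementation. The discrete observation symbols (standing in for quantized lip shapes), the two-state left-to-right topology, the labels "ba" and "su", and all probabilities are assumptions made up for this example; the abstract does not specify the features or model configuration.

```python
import math


def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm: log P(obs | model) for a discrete-observation HMM."""
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return logsumexp(alpha)


def classify(obs, models):
    """Return the label whose HMM gives the observation the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))


# Hypothetical toy models: (start probabilities, transitions, emissions).
# Symbols (assumed): 0 = closed lips, 1 = open lips, 2 = rounded lips.
left_to_right = [[0.6, 0.4], [0.0, 1.0]]
models = {
    "ba": ([1.0, 0.0], left_to_right, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "su": ([1.0, 0.0], left_to_right, [[0.1, 0.7, 0.2], [0.1, 0.1, 0.8]]),
}

print(classify([0, 0, 1, 1], models))  # closed lips, then open
print(classify([1, 2, 2], models))     # open lips, then rounded
```

In a real system the per-class models would be trained on visual features extracted from the lip region, and word recognition would be built on top of the visyllable models; both steps are outside this sketch.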