29th Iranian Conference on Electrical Engineering
PAVID-CVs: Persian Audio-Visual Database of CV syllables
Authors:
Mahsa Hedayatipour (1), Yasser Shekofteh (2), Mohsen Ebrahimi Moghaddam (3)
1- Shahid Beheshti University
2- Shahid Beheshti University
3- Shahid Beheshti University
Keywords:
Visual Speech Recognition, Lip Reading, CV syllables, Visyllable, Audio-Visual Database, Persian/Farsi Language.
Abstract:
Lip-reading is a form of visual speech recognition. Recognizing the smaller units of speech can serve as the basis for recognizing larger units of a language, such as words. In this paper, we introduce PAVID-CVs, a Persian (Farsi) Audio-Visual Database of CV syllables. It consists of isolated two-phoneme visyllables and isolated words related to these visyllables that contain only Persian CV syllables, and it is intended for lip-reading and audio-visual speech recognition tasks such as isolated word recognition. Because the data are tagged, the database can be used with machine learning-based methods. We describe the steps taken to prepare the database, which contains about 30 hours of data from 40 speakers. In initial experiments, hidden Markov models (HMMs) were used as visyllable classifiers. These models were then used for visual recognition of 6 Persian words with different numbers of syllables, and an accuracy of 47.37% was obtained in a speaker-independent experiment.
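The abstract does not state the visual features, HMM topology, or toolkit used, so the following is only a minimal sketch of the general idea of using one HMM per visyllable as a classifier. It assumes, purely for illustration, that each video frame has already been quantized to a discrete lip-shape symbol (0 = closed, 1 = open, 2 = spread) and that two hypothetical visyllable classes, "ba" and "mi", each have a hand-set two-state left-to-right model. Each model is scored with the forward algorithm in the log domain, and the label with the highest log-likelihood is returned.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(model, observations):
    """Forward algorithm: log P(observations | model) for a discrete HMM.

    model is a (start, trans, emit) tuple of plain probability lists.
    """
    start, trans, emit = model
    n = len(start)
    # Initialization with the first observation.
    alpha = [safe_log(start[s]) + safe_log(emit[s][observations[0]])
             for s in range(n)]
    # Recursion over the remaining observations.
    for obs in observations[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][obs])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(models, observations):
    """Return the label whose HMM assigns the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(models[label], observations))


# Hypothetical toy models (not taken from the paper): two-state left-to-right
# HMMs that share the start and transition probabilities and differ only in
# the lip shape the second state prefers.
START = [1.0, 0.0]
TRANS = [[0.6, 0.4], [0.0, 1.0]]
models = {
    "ba": (START, TRANS, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),  # closed, then open
    "mi": (START, TRANS, [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]),  # closed, then spread
}

if __name__ == "__main__":
    print(classify(models, [0, 0, 1, 1]))
    print(classify(models, [0, 2, 2]))
```

In a realistic setting, the models would be trained on continuous visual features from the tagged recordings rather than set by hand, and a word recognizer could concatenate or chain the per-visyllable models; those steps are outside what this sketch covers.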