31st International Conference on Electrical Engineering
Improving CycleGAN-VC2 Voice Conversion by Learning MCD-Based Evaluation and Optimization
Authors:
Majid Behdad (1), Davood Gharavian (2)
1- Shahid Beheshti University
2- Shahid Beheshti University
Keywords:
CycleGAN-VC, perceptual evaluation, perceptual optimization, MetricGAN, Mel-Cepstral distance, speech quality assessment, NISQA tool
Abstract:
Voice conversion (VC) systems, which convert the voice of a source speaker in an utterance to that of a target speaker, have many applications, so improving their quality is important. One approach to improving VC quality that has not yet attracted enough attention is optimizing the discriminators of a GAN-based VC system. In this paper, we take CycleGAN-VC2 as the baseline and introduce a modified version of a Mel-scale, human-hearing-related objective evaluation metric, the Modified Mel-Cepstral Distance (MMCD), to help the discriminators learn to distinguish real from fake data. MMCD takes values between 0 and 1 so that it can be used in the discriminators' loss functions. The main goal is to force the discriminators to learn the behavior of the MMCD metric in their judgements. In conventional CycleGAN-VC2, by contrast, the discriminators act as classifiers that decide which data is real and which is fake, without regard to perceptual reference measures such as an MCD-based score that varies continuously from zero to one. Experimental results show improved output speech quality as measured by MCD, even though the baseline VC system is trained on non-parallel data and uses no time alignment during training. Larger improvements could therefore be anticipated in parallel VC systems.
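The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the MMCD formula, so the code below makes several assumptions: it uses the standard Mel-Cepstral Distance in dB on time-aligned mel-cepstral frames, a hypothetical exponential squashing (`bounded_score`, with an arbitrary `scale`) to obtain a value between 0 and 1, and a MetricGAN-style least-squares discriminator target. None of these names or choices come from the paper; they only show how a bounded metric could replace the hard real/fake label.

```python
import numpy as np


def mel_cepstral_distance(ref, conv):
    """Mean MCD in dB between two time-aligned mel-cepstral sequences.

    Both inputs have shape (frames, dims). The 0th (energy) coefficient
    is excluded, as is conventional. Time alignment is assumed here.
    """
    ref = np.asarray(ref, dtype=float)
    conv = np.asarray(conv, dtype=float)
    if ref.shape != conv.shape:
        raise ValueError("sequences must be time-aligned with equal shape")
    diff = ref[:, 1:] - conv[:, 1:]
    per_frame = (10.0 / np.log(10.0)) * np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    return float(np.mean(per_frame))


def bounded_score(mcd_db, scale=10.0):
    """Hypothetical mapping of MCD into (0, 1]; 1 means identical.

    This is an illustrative stand-in, NOT the paper's MMCD definition.
    """
    return float(np.exp(-mcd_db / scale))


def metric_discriminator_loss(d_real, d_fake, fake_score):
    """MetricGAN-style least-squares loss (assumed form).

    Real samples are pushed toward 1, converted samples toward their
    metric score instead of a hard 0 label.
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    fake_score = np.asarray(fake_score, dtype=float)
    return float(np.mean((d_real - 1.0) ** 2) + np.mean((d_fake - fake_score) ** 2))
```

In this sketch the discriminator output for a converted utterance is regressed toward a continuous score, so its judgement tracks the metric rather than a binary class.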