31st International Conference on Electrical Engineering
Improving CycleGAN-VC2 Voice Conversion by Learning MCD-Based Evaluation and Optimization
Authors:
Majid Behdad (1)
Davood Gharavian (2)
1- Shahid Beheshti University
2- Shahid Beheshti University
Keywords:
CycleGAN-VC, perceptual evaluation, perceptual optimization, MetricGAN, Mel-cepstral distance, speech quality assessment, NISQA tool
Abstract:
Voice conversion (VC) systems, which convert the voice of a source speaker in an utterance to that of a target speaker, have various applications, and improving their quality is important. One approach to improving VC quality that has not yet attracted enough attention is optimizing the discriminators of a GAN-based VC system. In this paper, we chose CycleGAN-VC2 as the baseline and implemented a modified version of a Mel-scale, human-hearing-related objective evaluation metric, the Modified Mel-Cepstral Distance (MMCD), to help the discriminators better learn to judge between real and fake data. We developed and implemented the new MMCD metric, which takes values between 0 and 1, so that it can be used in the discriminators' loss functions. The main goal is to force the discriminators to learn the behavior of the MMCD metric in their judgments. In conventional CycleGAN-VC2, by contrast, the discriminators act as classifiers that decide which data is real and which is fake, without regard to perceptual references or measures such as an MCD-based score that varies continuously from zero to one. Experimental results show improvements in output speech quality as measured by MCD, even though our baseline VC system is trained on non-parallel data and uses no time alignment during training. Greater improvements could therefore be anticipated in parallel VC systems.
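The abstract does not define MMCD, so the following is only a minimal sketch of the general idea, not the authors' method. It computes the standard mel-cepstral distortion (MCD, in dB) over frames that are assumed to be already time-aligned. It then squashes the result into the range (0, 1] with a hypothetical exponential mapping, and uses that score as a regression target for a discriminator output, in the spirit of MetricGAN. The names `bounded_score`, `scale_db` and `discriminator_metric_loss` are illustrative assumptions.

```python
import math


def mel_cepstral_distortion(ref_frames, conv_frames):
    """Mean MCD in dB over time-aligned frames of mel-cepstral coefficients.

    Each frame is a sequence of coefficients. The 0th (energy) coefficient
    is skipped, as is common practice when reporting MCD.
    """
    if len(ref_frames) != len(conv_frames) or not ref_frames:
        raise ValueError("inputs must be non-empty with equal frame counts")
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        squared = sum((r - c) ** 2 for r, c in zip(ref[1:], conv[1:]))
        total += scale * math.sqrt(squared)
    return total / len(ref_frames)


def bounded_score(mcd_db, scale_db=10.0):
    """Hypothetical mapping of MCD (dB) into (0, 1]; 1.0 means identical.

    This is an assumption for illustration, not the paper's MMCD.
    """
    return math.exp(-mcd_db / scale_db)


def discriminator_metric_loss(d_out, target_score):
    """Squared error pulling a discriminator output toward the metric score."""
    return (d_out - target_score) ** 2


# Usage: one reference frame and one converted frame, three coefficients each.
ref = [[0.0, 1.0, 0.0]]
conv = [[0.0, 0.0, 0.0]]
mcd = mel_cepstral_distortion(ref, conv)
score = bounded_score(mcd)
loss = discriminator_metric_loss(0.9, score)
```

Under this kind of objective the discriminator is asked to predict a continuous quality score instead of a hard real/fake label. Because the abstract states that training uses non-parallel data without time alignment, the alignment assumed here is a simplification.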