31st International Conference on Electrical Engineering
Improving CycleGAN-VC2 Voice Conversion by Learning MCD-Based Evaluation and Optimization
Authors:
Majid Behdad (1), Davood Gharavian (2)
1- Shahid Beheshti University
2- Shahid Beheshti University
Keywords:
CycleGAN-VC, perceptual evaluation, perceptual optimization, MetricGAN, Mel-cepstral distance, speech quality assessment, NISQA tool
Abstract:
Voice conversion (VC) systems, which convert the voice of a source speaker in a speech utterance into that of a target speaker, have various applications, so improving their quality is important. One approach to improving VC quality that has not yet attracted enough attention is to optimize the discriminators of a GAN-based VC system. In this paper, we choose CycleGAN-VC2 as the baseline and implement a modified version of a mel-scale, human-hearing-related objective evaluation metric, the Modified Mel-Cepstral Distance (MMCD), to help the discriminators learn to judge between real and fake data. We developed MMCD so that it lies between 0 and 1 and can be used in the discriminators' loss functions. The main goal is to force the discriminators to learn the behavior of the MMCD metric in their judgments. In conventional CycleGAN-VC2, by contrast, the discriminators act as classifiers that decide which data is real and which is fake, without regard to perceptual references or to measures such as an MCD score that vary continuously from zero to one. Experimental results show improved output speech quality as measured by MCD, even though the baseline VC system is trained on non-parallel data and uses no time alignment during training. Larger improvements could therefore be anticipated in parallel VC systems.
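The abstract does not give the exact definition of MMCD, so the following is only a minimal sketch of the general idea, not the authors' method. It computes the standard mel-cepstral distance (MCD, in dB) between two sequences that are assumed to be already time-aligned, maps it into (0, 1] with a hypothetical exponential mapping (`bounded_mcd_score` and its `scale` parameter are assumptions for illustration), and shows a MetricGAN-style least-squares loss in which a discriminator output is regressed toward that score instead of a hard real/fake label.

```python
import math


def mel_cepstral_distance(mcep_a, mcep_b):
    """Mean frame-wise MCD in dB between two mel-cepstral sequences.

    Each input is a list of frames, each frame a list of coefficients.
    The sequences are assumed to be time-aligned and of equal length.
    The 0th (energy) coefficient of every frame is skipped, as is usual for MCD.
    """
    if len(mcep_a) != len(mcep_b) or len(mcep_a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    factor = 10.0 / math.log(10.0)
    total = 0.0
    for frame_a, frame_b in zip(mcep_a, mcep_b):
        if len(frame_a) != len(frame_b):
            raise ValueError("frames must have the same number of coefficients")
        squared = sum((x - y) ** 2 for x, y in zip(frame_a[1:], frame_b[1:]))
        total += factor * math.sqrt(2.0 * squared)
    return total / len(mcep_a)


def bounded_mcd_score(mcd_db, scale=10.0):
    """Hypothetical mapping of MCD to (0, 1]; not the paper's MMCD definition.

    An MCD of 0 dB maps to 1.0, and larger distances decay toward 0.
    """
    return math.exp(-mcd_db / scale)


def metric_discriminator_loss(d_outputs, target_scores):
    """Least-squares loss that makes the discriminator regress a metric score."""
    if len(d_outputs) != len(target_scores) or len(d_outputs) == 0:
        raise ValueError("outputs and targets must be non-empty and of equal length")
    errors = [(d - t) ** 2 for d, t in zip(d_outputs, target_scores)]
    return sum(errors) / len(errors)


# Toy data: three frames with four coefficients each (values are made up).
reference = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]
converted = [[1.0, 1.0, 1.0, 1.0] for _ in range(3)]

mcd = mel_cepstral_distance(reference, converted)
score = bounded_mcd_score(mcd)
loss = metric_discriminator_loss([0.5], [score])
```

In this sketch a real sample compared with itself receives a target of 1.0, while a converted sample receives a target that shrinks as its MCD to the reference grows. Whether the paper's MMCD uses this direction or this mapping is not stated in the abstract.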