31st International Conference on Electrical Engineering
Improving CycleGAN-VC2 Voice Conversion by Learning MCD-Based Evaluation and Optimization
Authors:
Majid Behdad 1
Davood Gharavian 2
1- Shahid Beheshti University
2- Shahid Beheshti University
Keywords:
CycleGAN-VC, perceptual evaluation, perceptual optimization, MetricGAN, Mel-Cepstral distance, speech quality assessment, NISQA tool
Abstract:
Voice conversion (VC) systems, which convert the voice of a source speaker in a speech utterance to that of a target speaker, have many applications, and improving their quality is important. One approach to improving VC quality that has not yet attracted enough attention is to optimize the discriminators of a GAN-based VC system. In this paper, we take CycleGAN-VC2 as the baseline and implement a modified version of a Mel-scale, human-hearing-related objective evaluation metric, the Modified Mel-Cepstral Distance (MMCD), to help the discriminators learn to judge between real and fake data. We developed and implemented the new MMCD metric, which takes values between 0 and 1, for use in the discriminators' loss functions. The main goal is to force the discriminators to learn the behavior of the MMCD metric in their judgements. In conventional CycleGAN-VC2, by contrast, the discriminators work as classifiers that decide which data is real and which is fake, without any attention to perceptual references or to measures such as an MCD-based score that varies continuously from zero to one. Experimental results show improvements in the quality of the output speech as measured by MCD, even though the baseline VC system is trained on non-parallel data and uses no time alignment during training. Greater improvements could therefore be anticipated in parallel VC systems.
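To make the idea concrete, the following is a minimal sketch, not the authors' implementation. It computes the conventional mel-cepstral distortion (MCD) in dB, maps it to a bounded score, and uses that score as a regression target in a MetricGAN-style least-squares discriminator loss. The abstract does not give the MMCD formula, so `bounded_score`, its `scale` parameter, and the convention that 1 means "identical to the reference" are assumptions made only for illustration. The sketch also assumes the two frame sequences are already the same length and aligned.

```python
import math


def mcd_db(ref_frames, conv_frames):
    """Mean mel-cepstral distortion (dB) over paired frames.

    Each frame is a sequence of mel-cepstral coefficients. The 0th
    (energy) coefficient is skipped, as is conventional for MCD.
    Assumes both sequences are already aligned frame by frame.
    """
    if not ref_frames or len(ref_frames) != len(conv_frames):
        raise ValueError("need two non-empty, equal-length frame sequences")
    const = (10.0 / math.log(10.0)) * math.sqrt(2.0)
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        squared_diff = sum((a - b) ** 2 for a, b in zip(ref[1:], conv[1:]))
        total += const * math.sqrt(squared_diff)
    return total / len(ref_frames)


def bounded_score(mcd, scale=10.0):
    """Hypothetical mapping of MCD to (0, 1]; NOT the paper's MMCD.

    Returns 1.0 for identical features and decays toward 0 as MCD grows.
    """
    return math.exp(-mcd / scale)


def discriminator_loss(d_real, d_fake, score_fake):
    """MetricGAN-style least-squares loss for one real/fake pair.

    The discriminator output is pushed toward 1 for real data and toward
    the metric score (rather than a hard 0) for converted data.
    """
    return (d_real - 1.0) ** 2 + (d_fake - score_fake) ** 2


# Usage with toy 3-coefficient frames (values are made up).
reference = [[0.0, 1.0, 0.0], [0.0, 0.5, 0.5]]
converted = [[0.0, 0.8, 0.1], [0.0, 0.4, 0.6]]
score = bounded_score(mcd_db(reference, converted))
loss = discriminator_loss(d_real=0.9, d_fake=0.3, score_fake=score)
```

The point of the design is that the target for converted speech is a continuous perceptual score rather than a binary label, so the discriminator is trained to approximate the metric instead of only separating real from fake.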