Cross-lingual speaker transfer for Cambodian based on feature disentangler and time-frequency attention adaptive normalization

Yuanzhang Yang, Linqin Wang, Shengxiang Gao, Zhengtao Yu, Ling Dong
International Journal of Web Information Systems, Vol. 20, No. 2, pp. 113-128

This paper aims to disentangle linguistic features from speaker timbre features using resource-rich Chinese and English data, and thereby achieve cross-lingual speaker transfer for Cambodian.

This study introduces a novel approach: it constructs a cross-lingual feature disentangler and combines it with time-frequency attention adaptive normalization, so that the speaker timbre of Cambodian speech can be converted to that of Chinese or English speakers without altering the underlying Cambodian speech content.

Because multi-speaker corpora for Cambodian are scarce, conventional methods have performed poorly on Cambodian speaker voice transfer.

The originality of this study lies in the effectiveness of the disentanglement process and precise control over speaker timbre feature transfer.
