--- about ---


Yen-ting Chien (簡延庭)

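Birthplace: Kaohsiung City

English name: Chris

Date of birth: December 13, 1984

Education:

Kaohsiung Municipal Hsiao-Kang Senior High School (高雄市市立小港高中)
Ming Chuan University, Department of Computer Science and Information Engineering (銘傳大學資訊工程學系)
National Taiwan University of Science and Technology, Graduate Institute of Computer Science and Information Engineering (國立台灣科技大學資訊工程研究所)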

Thesis research


Advisor: Prof. Hung-Yan Gu (古鴻炎)

Chinese title: 基於HMM模型之歌聲合成與音色轉換

English title: HMM Based Singing Voice Synthesis and Timbre Conversion

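Chinese abstract (translated):

This thesis attempts to combine an HMM spectrum model with a GMM timbre conversion model to build a Mandarin singing voice synthesis system with singer timbre conversion. For the analysis of spectral coefficients, we use STRAIGHT to obtain more accurate spectral envelopes and pitch information, and then convert the spectral envelope of each frame into DCC coefficients. We then developed our own programs to train the HMM spectrum model and the GMM timbre conversion model. In the synthesis stage, however, both models ran into the problem of over-smoothed spectra, so we studied a segmental variance approach to adjust the generated frame-level spectral coefficients and reduce the over-smoothing. For singer timbre conversion, we studied four methods: basic timbre conversion, third-order GMM conversion, relative amplitude conversion with GMM, and relative amplitude conversion without GMM. After building the system, we ran listening tests on synthesized singing voice files. The first listening test showed that an HMM model trained on a singing corpus synthesizes a more resonant singing voice than one trained on a speech corpus. The timbre conversion listening test showed that singing converted by the basic timbre conversion method was better than that of the other methods in both timbre and sound quality.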

Abstract:

In this thesis, we attempt to combine an HMM spectrum model with a GMM-based timbre conversion model to construct a Mandarin singing voice synthesis system that supports singer timbre conversion. For spectral analysis, we use STRAIGHT to obtain more accurate spectral envelopes and pitch information, and convert the spectral envelope of each frame into discrete cepstrum coefficients (DCC). Next, we develop programs to train the HMM spectrum model and the GMM-based timbre conversion model. In the synthesis stage, both models suffer from over-smoothed spectral envelopes. We therefore study segmental variance matching as a way to adjust the generated DCC, which alleviates the over-smoothing. For singer timbre conversion, we study four methods: the basic timbre conversion method, the third-order GMM conversion method, the relative amplitude conversion method with GMM, and the relative amplitude conversion method without GMM. After implementing the system, we use synthesized singing voice files to conduct listening tests. The first set of tests shows that the HMM model trained on the singing corpus synthesizes a more resonant singing voice than the one trained on the speaking corpus. The listening tests for timbre conversion show that the basic timbre conversion method outperforms the other methods in both timbre similarity and singing voice quality.
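
As a rough illustration of the variance-matching idea mentioned in the abstract, the Python sketch below rescales one coefficient trajectory within a segment around its mean so that its variance reaches a chosen target. It is not the thesis implementation: the function name `match_segment_variance`, the target variance, and the sample values are all assumptions made for this example, and the thesis may compute and apply the variances differently.

```python
from statistics import fmean, pvariance


def match_segment_variance(track, target_var, eps=1e-12):
    """Rescale one coefficient track (per-frame values within a segment)
    around its mean so that its population variance equals target_var.

    Hypothetical helper for illustration only.
    """
    mean = fmean(track)
    var = pvariance(track, mu=mean)
    if var < eps:
        # A flat track has no spread to rescale; return it unchanged.
        return list(track)
    scale = (target_var / var) ** 0.5
    return [mean + scale * (x - mean) for x in track]


# Assumed over-smoothed trajectory of a single coefficient over 5 frames.
smoothed = [0.9, 1.0, 1.1, 1.0, 0.9]
# Assumed target variance, e.g. one measured from natural recordings.
sharpened = match_segment_variance(smoothed, target_var=0.04)
```

Scaling around the mean keeps the average value of the coefficient in the segment unchanged and only widens (or narrows) its excursions, which is the sense in which an over-smoothed trajectory gets "sharpened" here.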

Singing voice synthesis audio samples