Chung-Wei Chen (陳忠緯)


Basic Information

 

Birthplace

Taichung County

 

Education

Taichung County Hsinshe Senior High School (台中縣立新社高中)

Ming Chuan University (private), Department of Computer Science and Information Engineering

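National Taiwan University of Science and Technology, Graduate Institute of Computer Science and Information Engineering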

 

Interests

Basketball, comics, movies, video games

 

 


Thesis Research

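· Chinese abstract (English translation): This thesis studies a method for generating pitch contours in English speech synthesis. The first stage of pitch contour generation predicts the tone class of each English syllable. We propose a two-tier algorithm for tone prediction: the first tier uses dynamic programming to find the best state sequence of tone combinations, and the second tier estimates the local tone probability of each syllable. We study three methods for estimating local tone probabilities: a weighted estimation method, a PPM estimation method, and a neural network estimation method. In the second stage, the predicted tones and other contextual data are fed into a neural network to generate the pitch contour of each syllable. We then set volume, duration, and pauses with a rule-based approach, and synthesize the speech signal with the HNM synthesis method. A preliminary English speech synthesis system has been built and used for within-system listening tests, in which we found that the higher the tone prediction accuracy, the more natural the synthesized speech. We also ran between-system listening tests; the results show that the naturalness of our system's synthesized speech is still clearly below that of Festival HTS.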

· English abstract: In this thesis, a pitch contour generation method for English speech synthesis is studied. The first phase of pitch contour generation predicts the tone class of each syllable. We propose a two-tier algorithm to predict the syllable tone classes of a sentence. The first tier finds the best sequence of tone-class combined states with a dynamic programming algorithm. The second tier estimates the local probability of each tone combined state of a syllable. We study three local probability estimation methods: the weighted method, the PPM-based method, and the artificial neural network (ANN) based method. In the second phase, we feed the predicted tones and other contextual information into another ANN to generate a pitch contour for each syllable. We then use heuristic rules to set the volume, duration, and pause of each syllable, and synthesize the speech signal with the harmonic-plus-noise model (HNM). With these components, we have built a preliminary English speech synthesis system and used it to conduct listening tests. Within-system tests show that the higher the accuracy of syllable tone prediction, the more natural the synthesized speech. Inter-system listening tests have also been conducted; the results show that the naturalness of our system is still significantly lower than that of the Festival HTS system.
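
The two-tier idea in the abstract (a dynamic programming search over tone states, driven by per-syllable local probabilities) can be illustrated with a generic Viterbi-style search. This is only a minimal sketch under stated assumptions, not the thesis's actual algorithm: the tone labels "H" and "L", the transition table, the numeric probabilities, and the function name best_tone_sequence are all hypothetical, and the local probabilities are simply given as input rather than estimated by the weighted, PPM, or ANN methods.

```python
import math


def _log(p):
    # Log-probability that tolerates zero probabilities.
    return math.log(p) if p > 0 else float("-inf")


def best_tone_sequence(local_probs, trans_probs):
    """Viterbi-style search for the most probable tone sequence.

    local_probs: one dict per syllable, {tone: local probability}
                 (hypothetical stand-in for the second-tier estimates).
    trans_probs: {(prev_tone, tone): probability} (hypothetical).
    """
    if not local_probs:
        return []
    tones = list(local_probs[0])
    # score[t] = best log-probability of any path ending in tone t.
    score = {t: _log(local_probs[0][t]) for t in tones}
    back = []  # back[i][t] = best predecessor of tone t at syllable i + 1
    for probs in local_probs[1:]:
        new_score, pointers = {}, {}
        for t in tones:
            prev = max(tones, key=lambda p: score[p] + _log(trans_probs[(p, t)]))
            new_score[t] = score[prev] + _log(trans_probs[(prev, t)]) + _log(probs[t])
            pointers[t] = prev
        back.append(pointers)
        score = new_score
    # Trace back from the best final tone.
    path = [max(tones, key=lambda t: score[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical three-syllable example with two made-up tone classes.
local = [{"H": 0.9, "L": 0.1}, {"H": 0.4, "L": 0.6}, {"H": 0.8, "L": 0.2}]
uniform = {(a, b): 0.5 for a in "HL" for b in "HL"}
sticky = {("H", "H"): 0.9, ("H", "L"): 0.1, ("L", "H"): 0.1, ("L", "L"): 0.9}
print(best_tone_sequence(local, uniform))
print(best_tone_sequence(local, sticky))
```

With uniform transitions the search reduces to picking the most probable tone per syllable; with the "sticky" table the middle syllable is pulled toward its neighbors, which is the kind of sequence-level effect a dynamic programming tier adds on top of local estimates.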