Hi, I'm 姜愷威!

 

This is my personal page, a record of my life during my time in the speech lab.

 

Thesis Work

Advisor: Dr. 古鴻炎

Topic: Improved Pitch-contour Generation Methods Combining ANN, Global Variance and Real-contour Selection

 

Abstract

In this thesis, we propose an improved method for generating syllable pitch contours that combines an artificial neural network (ANN), global variance (GV) adjustment, and real-contour selection. The method alleviates the over-smoothing of ANN-generated pitch contours and improves the naturalness of the synthesized pitch. In the training stage, to deal with pitch-detection errors, we examine the types of errors that occur and correct the erroneous pitch values with a program developed for this purpose. Each syllable's pitch contour is then transformed into discrete cosine transform (DCT) coefficients, which are used to train the ANN model and the GV parameters; the DCT coefficient vectors are also stored by context class. In the generation stage, the contextual information of a sentence is given as input, and the ANN first predicts the DCT coefficients that represent each syllable's pitch contour. The predicted coefficients are then adjusted by GV matching in each DCT dimension to alleviate the over-smoothing mentioned above. To further improve naturalness, the GV-adjusted DCT vector is used to select a real pitch contour from the stored pool for the requested context class, and the selected contour becomes the final output. For objective evaluation, we measure the variance ratio (VR) under several option settings; in general, a larger GV adjustment weight yields a higher VR. In addition, subjective listening tests show that GV adjustment with an appropriate weight improves the naturalness of the generated pitch contours, and that adding the real-contour selection step improves it further.
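The generation steps described in the abstract can be sketched roughly as follows. This is only a minimal illustration, not the thesis code: the function names, the linear blending of the GV scale by `weight`, the Euclidean distance used for selection, and the definition of VR as generated variance divided by natural variance are all assumptions made for the example. The ANN itself is left out; its predicted DCT vectors are simply passed in as an array.

```python
import numpy as np

def dct_matrix(n_points, n_coef):
    """Orthonormal DCT-II basis; each row is one cosine basis vector."""
    k = np.arange(n_coef)[:, None]
    t = np.arange(n_points)[None, :]
    basis = np.cos(np.pi * (t + 0.5) * k / n_points) * np.sqrt(2.0 / n_points)
    basis[0] /= np.sqrt(2.0)
    return basis

def contour_to_dct(contour, n_coef):
    """Represent one syllable pitch contour by its first n_coef DCT coefficients."""
    contour = np.asarray(contour, dtype=float)
    return dct_matrix(len(contour), n_coef) @ contour

def dct_to_contour(coefs, n_points):
    """Rebuild an n_points pitch contour from DCT coefficients."""
    coefs = np.asarray(coefs, dtype=float)
    return dct_matrix(n_points, len(coefs)).T @ coefs

def gv_adjust(pred, target_var, weight):
    """Stretch each DCT dimension around its sentence mean toward target_var.

    pred: (n_syllables, n_coef) DCT vectors predicted for one sentence.
    weight: 0 keeps the prediction, 1 matches target_var exactly
            (linear blending of the scale is an assumption of this sketch).
    """
    pred = np.asarray(pred, dtype=float)
    mean = pred.mean(axis=0)
    var = pred.var(axis=0)
    full_scale = np.sqrt(target_var / np.maximum(var, 1e-12))
    scale = (1.0 - weight) + weight * full_scale
    return mean + scale * (pred - mean)

def select_real_contour(query, pool):
    """Return the stored real DCT vector closest to query (Euclidean distance)."""
    pool = np.asarray(pool, dtype=float)
    dist = np.linalg.norm(pool - query, axis=1)
    return pool[np.argmin(dist)]

def variance_ratio(generated, natural):
    """Per-dimension variance of generated vectors over that of natural ones."""
    return np.var(generated, axis=0) / np.var(natural, axis=0)
```

In this sketch, a sentence would be processed by calling `gv_adjust` on the predicted vectors, then `select_real_contour` once per syllable against the pool for that syllable's context class, and finally `dct_to_contour` to obtain the pitch values.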

 

Audio Samples

 

 

About Me

Birthplace: Taipei City


Cycling

Road cycling lets me combine exercise with travel and challenge myself.

Board Games

Learning the rules of different games trains my logical thinking and planning skills.

Piano

Classical and jazz.

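Taipei Municipal Heti Elementary School (台北市立河堤國小)

1995 – 2001

Taipei Municipal Jinhua Junior High School (臺北市立金華國中)

2001 – 2004

Taipei Municipal Chenggong High School (台北市立成功高中)

2004 – 2007

Soochow University, Department of Computer Science (私立東吳大學資訊科學系)

2007 – 2011

National Taiwan University of Science and Technology, Department of Computer Science and Information Engineering, graduate program (國立台灣科技大學資訊工程所)

2011 – 2014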

Photos (click to see the full-size photo)

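  • The campus gate at dusk
  • Daily life in the lab 1
  • Daily life in the lab 2
  • The lab's Friday-evening get-together
  • Group dinner 1
  • Group dinner 2
  • Shopping at Costco
  • School jacket
  • Pizza delivery
  • My first overnight stay at school
  • Server room fire and a building-wide power outage
  • Oral defense slides finished
  • Photo with my advisor after the oral defense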

Contact Get in touch

  • www.facebook.com/keiwei.chiang
  • +886 926276376
  • darkmn0131@hotmail.com