Voice dubbing, or timbre conversion, here means obtaining many distinct timbres from a single voice source through conversion processing. In dramatic works, different actors are usually dubbed with different timbres, so many dubbing artists are required. To reduce the cost of dubbing the actors, it would be useful if a computer could convert a single voice timbre into many distinct timbres. Therefore, a semiautomatic voice-dubbing (i.e., timbre conversion) system is studied and developed in this thesis. It is called semiautomatic because the emotion (e.g., anger, sadness, happiness) perceived in the converted voice is still controlled by the person who provides the original voice.
In this thesis, a method for timbre conversion is proposed. Timbre conversion is accomplished by providing independent control of four factors: fundamental frequency, vocal tract length modification, voice source, and the internal ratio of the vocal tract. Among these four factors, the internal ratio of the vocal tract is newly studied here. As a result of this work, a working timbre-conversion system has been built. It can be used for online real-time timbre conversion under some constraints. In addition, according to our perception test results, it can indeed convert a single voice timbre into many distinct timbres.
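To make the idea of independent control more concrete, the following is a minimal sketch of two of the four factors, fundamental frequency and vocal tract length, under simple assumed models. It is not the method developed in the thesis. The function names `scale_f0` and `warp_envelope`, the per-frame F0 contour, and the linear frequency-axis warping of a spectral envelope are all assumptions chosen for illustration. The voice source and the internal ratio of the vocal tract are not modeled here.

```python
import numpy as np


def scale_f0(f0_hz, ratio):
    """Scale a per-frame F0 contour (hypothetical helper).

    Frames marked unvoiced with 0 Hz are left at 0.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    return np.where(f0 > 0, f0 * ratio, 0.0)


def warp_envelope(envelope, length_ratio):
    """Approximate a vocal tract length change (assumed linear warping).

    The frequency axis of a spectral envelope is rescaled so that a
    formant at bin p moves to bin p / length_ratio. A length_ratio
    above 1 therefore imitates a longer tract with lower formants.
    """
    env = np.asarray(envelope, dtype=float)
    bins = np.arange(len(env), dtype=float)
    # Output bin k reads the input envelope at bin k * length_ratio.
    # np.interp clamps positions past the last bin to the last value.
    return np.interp(bins * length_ratio, bins, env)


# Toy envelope with a single formant-like peak at bin 100 (made-up data).
bins = np.arange(257)
toy_envelope = np.exp(-0.5 * ((bins - 100) / 5.0) ** 2)

lowered = warp_envelope(toy_envelope, 1.25)   # longer tract, lower formant
raised_f0 = scale_f0([0.0, 100.0, 200.0], 1.5)  # pitch raised independently
```

Because the two functions act on separate parameters, the pitch can be raised while the formants are lowered, or the reverse, which is the sense in which the controls are independent in this sketch.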