王如江 (Ju-chiang Wang)
- Advisor: Dr. 古鴻炎
- Chinese title: 基於歌聲表情分析與單元選擇之國語歌聲合成研究
- English title: Mandarin Singing Voice Synthesis Based on Singing Expression Analysis and Unit Selection
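- Chinese abstract (translated): This thesis studies the analysis of singing
expression parameters, then applies syllable unit selection and HNM (Harmonic
plus Noise Model) signal synthesis to build a Mandarin singing voice synthesis
system that can imitate a real singer's expression. We re-recorded Mandarin
triple-syllable utterances and wrote an automatic labeling program. For the
expression analysis, we recorded songs sung by different people and analyzed
them to extract each note's expression parameters, such as pitch contour,
loudness, duration, and waveform envelope. In the synthesis stage, the
analyzed parameter values control, for each synthesized note, the time
positions of the consonant-vowel boundary and the ASR boundaries, the pitch
contour, the waveform envelope, and the legato processing; that is, the
expression parameters are combined into HNM to synthesize the singing voice
signal. We also ran listening tests. The results show that using the
expression parameters does improve the quality of the synthesized singing
voice, and that the synthesized voice can imitate a real singer's expression
to a considerable degree.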
- English abstract: In this thesis, we study the analysis of the expression
parameters of the singing voice. Syllable unit selection and HNM (Harmonic
plus Noise Model) based signal synthesis are then integrated to build a
Mandarin singing voice synthesis system. The system can mimic a singer's
vocal expression by using parameters analyzed from that singer's recorded
song. We re-recorded Mandarin triple-syllable utterances and developed an
automatic sub-syllable segment boundary detection system. To analyze the
expression parameters, we recorded songs sung by different people and
analyzed them to obtain each note's parameters, such as pitch contour,
loudness, duration, and waveform envelope. In the synthesis stage, the
expression parameters obtained from a source note are used to determine the
time positions of the consonant, vowel, and ASR (Attack, Sustain, and
Release) segments, and to plan the pitch contour and waveform envelope of the
synthesized note. That is, the expression parameters are incorporated into
the HNM-based mechanism to synthesize singing voice signals. In addition, we
conducted perception tests. The results show that using the expression
parameters does improve the quality of the synthesized singing voice, and
that the synthesized voice can mimic a person's singing expression with a
high degree of similarity.
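
As an illustration only, the idea of shaping a synthesized note with a pitch contour and Attack/Sustain/Release segments can be sketched in Python. This is a toy harmonic-plus-noise generator, not the HNM implementation described in the thesis. The function names `asr_envelope` and `synthesize_note`, the piecewise-linear envelope, the 1/k harmonic amplitudes, the full-band uniform noise, and every numeric value are assumptions made for this sketch.

```python
import math
import random


def asr_envelope(n_samples, attack_frac=0.1, release_frac=0.2):
    """Piecewise-linear Attack/Sustain/Release gain curve with values in [0, 1].

    The segment fractions are hypothetical; in the thesis the segment
    boundaries come from analysis of a recorded source note.
    """
    n_attack = int(n_samples * attack_frac)
    n_release = int(n_samples * release_frac)
    n_sustain = n_samples - n_attack - n_release
    attack = [i / n_attack for i in range(n_attack)]            # ramps up from 0
    sustain = [1.0] * n_sustain                                 # holds at 1
    release = [1.0 - i / n_release for i in range(n_release)]   # ramps down from 1
    return attack + sustain + release


def synthesize_note(pitch_contour_hz, sample_rate=16000, n_harmonics=8,
                    noise_level=0.01, seed=0):
    """Sum harmonics that track a per-sample pitch contour, then add noise.

    Simplifications (assumptions of this sketch): fixed 1/k harmonic
    amplitudes and full-band uniform noise, both shaped by the ASR envelope.
    """
    rng = random.Random(seed)
    envelope = asr_envelope(len(pitch_contour_hz))
    phase = 0.0
    samples = []
    for f0, gain in zip(pitch_contour_hz, envelope):
        # Accumulate phase so the pitch can change smoothly from sample to sample.
        phase += 2.0 * math.pi * f0 / sample_rate
        harmonic = sum(
            math.sin(k * phase) / k
            for k in range(1, n_harmonics + 1)
            if k * f0 < sample_rate / 2  # skip harmonics above the Nyquist frequency
        )
        noise = noise_level * rng.uniform(-1.0, 1.0)
        samples.append(gain * (harmonic / n_harmonics + noise))
    return samples


# Usage: a half-second note near 220 Hz with a shallow 5.5 Hz vibrato.
sr = 16000
n = sr // 2
contour = [220.0 + 3.0 * math.sin(2.0 * math.pi * 5.5 * i / sr) for i in range(n)]
note = synthesize_note(contour, sample_rate=sr)
```

Accumulating phase sample by sample, rather than computing `sin(2*pi*f0*t)` directly, keeps the waveform continuous when the pitch contour varies, which is the property a vibrato or glide between notes relies on.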