Utilize este identificador para referenciar este registo: http://hdl.handle.net/10400.8/540
Título: Singing voice resynthesis using concatenative-based techniques
Autor: Fonseca, Nuno
Palavras-chave: Resynthesis
Singing
Voice
Data de Defesa: 2011
Resumo: Singing has an important role in our life, and although synthesizers have been trying to replicate every musical instrument for decades, is was only during the last nine years that commercial singing synthesizers started to appear, allowing the ability to merge music and text, i.e., singing. These solutions may present realistic results on some situations, but they require time consuming processes and experienced users. The goal of this research work is to develop, create or adapt techniques that allow the resynthesis of the singing voice, i.e., allow the user to directly control a singing voice synthesizer using his/her own voice. The synthesizer should be able to replicate, as close as possible, the same melody, same phonetic sequence, and the same musical performance. Initially, some work was developed trying to resynthesize piano recordings with evolutionary approaches, using Genetic Algorithms, where a population of individuals (candidate solutions) representing a sequence of music notes evolved over time, tries to match an original audio stream. Later, the focus would return to the singing voice, exploring techniques as Hidden Markov Models, Neural Network Self Organized Maps, among others. Finally, a Concatenative Unit Selection approach was chosen as the core of a singing voice resynthesis system. By extracting energy, pitch and phonetic information (MFCC, LPC), and using it within a phonetic similarity Viterbi-based Unit Selection System, a sequence of internal sound library frames is chosen to replicate the original audio performance. Although audio artifacts still exist, preventing its use on professional applications, the concept of a new audio tool was created, that presents high potential for future work, not only in singing voice, but in other musical or speech domains.
Descrição: Dissertação submetida à Faculdade de Engenharia da Universidade do Porto para satisfação parcial dos requisitos do grau de doutor em Engenharia Informática.
Peer review: yes
URI: http://hdl.handle.net/10400.8/540
Aparece nas colecções:Teses de Doutoramento

Ficheiros deste registo:
Ficheiro Descrição TamanhoFormato 
PhD thesis v.029 PostFinal.pdf4,85 MBAdobe PDFVer/Abrir


FacebookTwitterDeliciousLinkedInDiggGoogle BookmarksMySpace
Formato BibTex MendeleyEndnote Degois 

Todos os registos no repositório estão protegidos por leis de copyright, com todos os direitos reservados.