Singing voice resynthesis using concatenative-based techniques

Fonseca, Nuno

http://hdl.handle.net/10400.8/540

Use this identifier to cite this record.

Name:	Description:	Size:	Format:
PhD thesis v.029 PostFinal.pdf		4.74 MB	Adobe PDF

Authors

Fonseca, Nuno

Abstract

Singing has an important role in our lives, and although synthesizers have been trying to replicate every musical instrument for decades, it was only during the last nine years that commercial singing synthesizers started to appear, making it possible to merge music and text, i.e., singing. These solutions may produce realistic results in some situations, but they require time-consuming processes and experienced users. The goal of this research work is to develop, create or adapt techniques that allow the resynthesis of the singing voice, i.e., that allow users to directly control a singing voice synthesizer with their own voice. The synthesizer should replicate, as closely as possible, the same melody, the same phonetic sequence, and the same musical performance. Initially, some work was carried out on resynthesizing piano recordings with evolutionary approaches, using Genetic Algorithms, in which a population of individuals (candidate solutions), each representing a sequence of music notes, evolves over time to match an original audio stream. Later, the focus returned to the singing voice, exploring techniques such as Hidden Markov Models and Self-Organizing Map neural networks, among others. Finally, a Concatenative Unit Selection approach was chosen as the core of a singing voice resynthesis system. By extracting energy, pitch and phonetic information (MFCC, LPC) and using it within a phonetic-similarity, Viterbi-based Unit Selection system, a sequence of frames from an internal sound library is chosen to replicate the original audio performance. Although audio artifacts still exist, preventing its use in professional applications, the concept of a new audio tool was created, one that shows high potential for future work, not only for the singing voice but also in other musical or speech domains.
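
The Viterbi-based unit selection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names (`select_units`, `join_cost`), the Euclidean target cost, the concatenation cost, and the `join_weight` parameter are all assumptions made for illustration. Each frame is represented by a generic feature vector, which in a real system might combine energy, pitch and MFCC or LPC coefficients.

```python
from math import dist  # Euclidean distance, Python 3.8+


def join_cost(library, i, j):
    """Hypothetical concatenation cost between library frames i and j.

    Consecutive frames (j == i + 1) join for free, since they were
    recorded next to each other; otherwise use their feature distance.
    """
    return 0.0 if j == i + 1 else dist(library[i], library[j])


def select_units(targets, library, join_weight=1.0):
    """Choose one library frame index per target frame via Viterbi search.

    targets: feature vectors extracted from the input performance
    library: feature vectors of the internal sound library frames
    """
    n = len(library)
    # Initial cost: target cost of each library frame for the first frame.
    cost = [dist(targets[0], unit) for unit in library]
    back = []  # back[t][j] = best predecessor of unit j at step t + 1

    for target in targets[1:]:
        new_cost, pointers = [], []
        for j, unit in enumerate(library):
            # Best predecessor: accumulated cost plus weighted join cost.
            best_i = min(
                range(n),
                key=lambda i: cost[i] + join_weight * join_cost(library, i, j),
            )
            total = cost[best_i] + join_weight * join_cost(library, best_i, j)
            new_cost.append(total + dist(target, unit))
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path back from the best final unit.
    j = min(range(n), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path


# Toy example with one-dimensional features (made-up values).
library = [(0.0,), (1.0,), (2.0,), (5.0,)]
targets = [(0.1,), (0.9,), (2.2,)]
print(select_units(targets, library))
```

The search is quadratic in the library size for each target frame, so a practical system would likely prune candidates before this step; that pruning is omitted here for brevity.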

Description

Dissertation submitted to the Faculdade de Engenharia da Universidade do Porto in partial fulfilment of the requirements for the degree of Doctor in Informatics Engineering.

Keywords

Resynthesis; Singing Voice

URI

http://hdl.handle.net/10400.8/540

Collections

ESTG - Teses de Doutoramento de docentes
