SpeechToText: An open-source software for automatic detection and transcription of voice recordings in digital forensics

Negrão, Miguel; Domingues, Patrício

http://hdl.handle.net/10400.8/14877

Use this identifier to cite or link to this record.

Name:	Description:	Size:	Format:
1-s2.0-S2666281721001311-main.pdf		1.05 MB	Adobe PDF

Authors

Negrão, Miguel

Domingues, Patrício

Abstract

Voice is the most natural way for humans to communicate with each other and, more recently, to interact with voice-controlled digital machines. Although text is predominant on digital platforms, voice and video are becoming increasingly important, with communication applications supporting voice messages and videos. This matters for digital forensic examinations, as content held in voice format can contain evidence relevant to an investigation. In this paper, we present the open-source SpeechToText software, which relies on state-of-the-art Voice Activity Detection (VAD) and Automatic Speech Recognition (ASR) modules to detect voice content and then transcribe it to text. This allows voice content to be integrated into the regular flow of a digital forensic investigation, with the transcribed audio indexed by text search engines. Although SpeechToText can be run independently, it also provides a Jython-based software module for the well-known Autopsy software. The paper also analyzes the availability, storage location and audio format of voice-recorded content in 14 popular Android applications featuring voice recordings. SpeechToText achieves 100% accuracy for detecting voice in unencrypted audio/video files, a word error rate (WER) of 27.2% when transcribing English voice messages by non-native speakers, and a WER of 7.80% for the test-clean set of LibriSpeech. It achieves a real-time factor of 0.15 for the detection and transcription process on a mid-range laptop, meaning that 1 min of speech is processed in roughly 9 s.
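
The two metrics quoted in the abstract can be illustrated with a minimal Python sketch. It is not taken from the SpeechToText code base: the function names `word_error_rate` and `real_time_factor` are hypothetical, and the sketch assumes the common definitions, namely WER as the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words, and the real-time factor as processing time divided by audio duration. Text normalization is reduced here to lowercasing and whitespace splitting, which may differ from what the paper uses.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: normalization is only lowercasing and whitespace splitting.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# The abstract's figure: 9 s of processing for 60 s of speech gives 0.15.
rtf = real_time_factor(9.0, 60.0)
```

Under these definitions a WER of 27.2% corresponds to roughly 27 word-level errors per 100 reference words, and a real-time factor of 0.15 means processing takes 15% of the audio's duration.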

Description

Article number: 301223

Keywords

Android applications; Automatic speech recognition; Automatic speech transcription; Digital forensics; Voice recordings

URI

http://hdl.handle.net/10400.8/14877

Citation

Miguel Negrão, Patricio Domingues, SpeechToText: An open-source software for automatic detection and transcription of voice recordings in digital forensics, Forensic Science International: Digital Investigation, Volume 38, 2021, 301223, ISSN 2666-2817, https://doi.org/10.1016/j.fsidi.2021.301223.

Research projects

Research Center in Informatics and Communications

Publisher

Elsevier

DOI

10.1016/j.fsidi.2021.301223

Collections

CIIC - Peer-reviewed journal articles
ESTG - Articles in international journals

CC license

No CC license
