Publication

SpeechToText: An open-source software for automatic detection and transcription of voice recordings in digital forensics

datacite.subject.fos: Engenharia e Tecnologia
datacite.subject.fos: Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
datacite.subject.sdg: 09:Indústria, Inovação e Infraestruturas
dc.contributor.author: Negrão, Miguel
dc.contributor.author: Domingues, Patrício
dc.date.accessioned: 2025-12-04T15:19:33Z
dc.date.available: 2025-12-04T15:19:33Z
dc.date.issued: 2021-09 [en_US]
dc.date.updated: 2025-12-03T10:18:27Z
dc.description: Article number - 301223
dc.description.abstract: Voice is the most natural way for humans to communicate with each other, and more recently, to interact with voice-controlled digital machines. Although text is predominant in digital platforms, voice and video are becoming increasingly important, with communication applications supporting voice messages and videos. This is relevant for digital forensic examinations, as content held in voice format can hold relevant evidence for the investigation. In this paper, we present the open-source SpeechToText software, which relies on state-of-the-art Voice Activity Detection (VAD) and Automatic Speech Recognition (ASR) modules to detect voice content, and then to transcribe it to text. This allows integrating voice content into the regular flow of a digital forensic investigation, with transcribed audio indexed by text search engines. Although SpeechToText can be run independently, it also provides a Jython-based software module for the well-known Autopsy software. The paper also analyzes the availability, storage location and audio format of voice-recorded content in 14 popular Android applications featuring voice recordings. SpeechToText achieves 100% accuracy for detecting voice in unencrypted audio/video files, a word error rate (WER) of 27.2% when transcribing English voice messages by non-native speakers and a WER of 7.80% for the test-clean set of LibriSpeech. It achieves a real-time factor of 0.15 for the detection and transcription process on a medium-range laptop, meaning that 1 min of speech is processed in roughly 9 s. [eng]
dc.description.sponsorship: This work was partially supported by CIIC under the FCT project UIDB-04524-2020, and by FCT/MCTES and EU funds under the project UIDB/EEA/50008/2020.
dc.description.version: N/A
dc.identifier.citation: Miguel Negrão, Patricio Domingues, SpeechToText: An open-source software for automatic detection and transcription of voice recordings in digital forensics, Forensic Science International: Digital Investigation, Volume 38, 2021, 301223, ISSN 2666-2817, https://doi.org/10.1016/j.fsidi.2021.301223.
dc.identifier.doi: 10.1016/j.fsidi.2021.301223 [en_US]
dc.identifier.issn: 2666-2817 [en_US]
dc.identifier.slug: cv-prod-2635014
dc.identifier.uri: http://hdl.handle.net/10400.8/14877
dc.language.iso: eng
dc.peerreviewed: yes
dc.publisher: Elsevier
dc.relation: Research Center in Informatics and Communications
dc.relation.hasversion: https://www.sciencedirect.com/science/article/abs/pii/S2666281721001311
dc.rights.uri: N/A
dc.subject: Android applications
dc.subject: Automatic speech recognition
dc.subject: Automatic speech transcription
dc.subject: Digital forensics
dc.subject: Voice recordings
dc.title: SpeechToText: An open-source software for automatic detection and transcription of voice recordings in digital forensics [eng]
dc.type: journal article [en_US]
dspace.entity.type: Publication
oaire.awardTitle: Research Center in Informatics and Communications
oaire.awardURI: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F04524%2F2020/PT
oaire.citation.title: Forensic Science International: Digital Investigation [en_US]
oaire.citation.volume: 38 [en_US]
oaire.fundingStream: 6817 - DCRRNI ID
oaire.version: http://purl.org/coar/version/c_970fb48d4fbd8a85
person.familyName: Cerdeira Negrão
person.familyName: Domingues
person.givenName: Miguel
person.givenName: Patrício
person.identifier.ciencia-id: 3B1A-36E2-B96B
person.identifier.ciencia-id: AA15-6185-C477
person.identifier.orcid: 0000-0002-6540-3164
person.identifier.orcid: 0000-0002-6207-6292
person.identifier.rid: ABH-7711-2020
person.identifier.scopus-author-id: 13411315400
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.name: Fundação para a Ciência e a Tecnologia
rcaap.cv.cienciaid: 3B1A-36E2-B96B | Miguel Cerdeira Marreiros Negrão
rcaap.rights: openAccess [en_US]
relation.isAuthorOfPublication: 27585414-d859-4dc0-9cac-72e03a4407a5
relation.isAuthorOfPublication: b88ada5f-0d8b-4e55-ab0a-62aa82ea1388
relation.isAuthorOfPublication.latestForDiscovery: 27585414-d859-4dc0-9cac-72e03a4407a5
relation.isProjectOfPublication: 67435020-fe0d-4b46-be85-59ee3c6138c7
relation.isProjectOfPublication.latestForDiscovery: 67435020-fe0d-4b46-be85-59ee3c6138c7

Files

Original bundle
Name: 1-s2.0-S2666281721001311-main.pdf
Size: 1.05 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.33 KB
Format: Item-specific license agreed upon to submission