Publication

Mispronunciation Detection in Children's Reading of Sentences

dc.contributor.author: Proença, Jorge
dc.contributor.author: Lopes, Carla Alexandra
dc.contributor.author: Tjalve, Michael
dc.contributor.author: Stolcke, Andreas
dc.contributor.author: Candeias, Sara
dc.contributor.author: Perdigão, Fernando
dc.date.accessioned: 2018-07-23T16:49:29Z
dc.date.available: 2018-07-23T16:49:29Z
dc.date.issued: 2018-03-28
dc.description.abstract: This work proposes an approach to automatically parse children’s reading of sentences by detecting word pronunciations and extra content, and to classify words as correctly or incorrectly pronounced. This approach can be directly helpful for automatic assessment of reading level or for automatic reading tutors, where a correct reading must be identified. We propose a first segmentation stage to locate candidate word pronunciations based on allowing repetitions and false starts of a word’s syllables. A decoding grammar based solely on syllables allows silence to appear during a word pronunciation. At a second stage, word candidates are classified as mispronounced or not. The feature that best classifies mispronunciations is found to be the log-likelihood ratio between a free phone loop and a word spotting model in the very close vicinity of the candidate segmentation. Additional features are combined in multi-feature models to further improve classification, including: normalizations of the log-likelihood ratio, derivations from phone likelihoods, and Levenshtein distances between the correct pronunciation and recognized phonemes through two phoneme recognition approaches. Results show that most extra events were detected (close to 2% word error rate achieved) and that using automatic segmentation for mispronunciation classification approaches the performance of manual segmentation. Although the log-likelihood ratio from a spotting approach is already a good metric to classify word pronunciations, the combination of additional features provides a relative reduction of the miss rate of 18% (from 34.03% to 27.79% using manual segmentation and from 35.58% to 29.35% using automatic segmentation, at constant 5% false alarm rate).
dc.description.version: info:eu-repo/semantics/publishedVersion
dc.identifier.doi: 10.1109/TASLP.2018.2820429
dc.identifier.uri: http://hdl.handle.net/10400.8/3353
dc.language.iso: eng
dc.peerreviewed: yes
dc.publisher: Institute of Electrical and Electronics Engineers
dc.relation.publisherversion: https://ieeexplore.ieee.org/document/8327486/
dc.subject: Speech analysis
dc.subject: Mispronunciation detection
dc.subject: Children’s reading
dc.subject: Automatic reading annotation
dc.title: Mispronunciation Detection in Children's Reading of Sentences
dc.type: journal article
dspace.entity.type: Publication
oaire.citation.endPage: 1219
oaire.citation.issue: 7
oaire.citation.startPage: 1207
oaire.citation.title: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING
oaire.citation.volume: 26
rcaap.rights: openAccess
rcaap.type: article
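
Illustrative note on the abstract: one of the listed features is a Levenshtein distance between the correct pronunciation and the recognized phonemes, and the headline result is a relative reduction of the miss rate. The Python sketch below shows one plausible way to compute both quantities. It is not the authors' implementation: the function names and the example phoneme symbols are assumptions made for illustration, and only the four miss-rate figures come from the record.

```python
from typing import Sequence


def levenshtein(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Edit distance between a reference and a recognized phoneme sequence.

    Counts the minimum number of substitutions, insertions and deletions,
    each with unit cost (an assumption; the paper may weight them differently).
    """
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from the reference
                            curr[j - 1] + 1,      # insert h into the reference
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of an error rate, as a fraction of the baseline."""
    return (baseline - improved) / baseline


# Hypothetical phoneme strings: canonical pronunciation vs. recognizer output.
canonical = ["k", "a", "z", "a"]
recognized = ["k", "a", "s", "a"]
distance = levenshtein(canonical, recognized)

# Miss rates quoted in the abstract (percent, at 5% false alarm rate).
manual = relative_reduction(34.03, 27.79)     # manual segmentation
automatic = relative_reduction(35.58, 29.35)  # automatic segmentation
```

With the quoted figures, the relative reductions come out at roughly 18.3% for manual segmentation and 17.5% for automatic segmentation, consistent with the "18%" stated in the abstract.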

Files

Original bundle
Name: publicado.pdf
Size: 2.45 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.32 KB
Format: Item-specific license agreed upon to submission