Publication

Self-Similarity Matrices and Localized Attention for Chorus Recognition: A Data-Efficient Music Information Retrieval Approach

File: Project_2230396_Jose_Mena_MCD_c_f.pdf (3.15 MB, Adobe PDF)

Abstract

This project presents an efficient approach to chorus recognition in English song lyrics that achieves state-of-the-art performance with significantly fewer resources than existing methods. We developed a Bidirectional Long Short-Term Memory (BiLSTM) model with localized attention mechanisms, trained on only 780 songs, compared to the 25,000+ songs typically used in Music Information Retrieval research. Our approach addresses class imbalance through comprehensive stabilization techniques and leverages nine feature views capturing structural, semantic, and rhythmic patterns via self-similarity matrices. Through systematic experimentation, we demonstrate that chorus detection relies primarily on local contextual patterns rather than global structural awareness, with head self-similarity features (line beginnings) proving most critical for segmentation. The BiLSTM + Attention model achieves 78.2% Macro F1 at the line level, matching the performance of Watanabe & Goto (2020), who trained on 100,000+ songs, and significantly exceeding the 67.4% F1 of Fell et al. (2018), who trained on 25,000 songs. For boundary detection, the model achieves 59.6% F1 for exact boundaries and 74.7% F1 with a ±2-line tolerance. The research demonstrates that strategic data curation, comprehensive feature engineering, and targeted optimization can compete effectively with resource-intensive approaches, showing that local pattern recognition outperforms complex global modeling strategies in specialized domains such as lyric analysis.

Keywords

Lyric segmentation; Chorus detection; Attention mechanisms; Self-similarity matrices; Local pattern recognition
