Browsing by Author "Luz, Gustavo Rocha"
- Integrating Classification in Image Captioning Tasks: A Study
  Publication. Luz, Gustavo Rocha; Grilo, Carlos Fernando de Almeida; Miragaia, Rolando Lúcio Germano; Ribeiro, José Carlos Bregieiro; Távora, Luís Miguel de Oliveira Pegado de Noronha e
  Image captioning combines computer vision and natural language processing to generate descriptive text for images. This dissertation evaluates whether integrating image classification into captioning models improves the quality of generated descriptions. Experiments were conducted with LSTM and Bidirectional LSTM (BiLSTM) architectures, using CNN-based feature extractors on the FLOWERS dataset. Each configuration was trained 35 times with controlled random seeds to ensure consistency and reproducibility. Although all standard evaluation metrics were computed, the focus was on METEOR and SPICE for their balanced view of linguistic and semantic quality. ResNet50 yielded the best overall results among the CNNs. The inclusion of classification labels showed mixed outcomes: in the Base Case, it increased variability; in BiLSTM models, it led to better METEOR scores and more consistent results. Further tests with varied classification accuracy showed limited impact on caption quality. The model remained robust, with no significant drop in performance observed down to 80% accuracy, and top performance recorded at 95% and 90% classification accuracy. These findings suggest that classification can enhance performance under favorable conditions, especially when paired with BiLSTM architectures, which is valuable for real-world settings where classification errors are expected. In summary, the results underscore the subtle but meaningful role of classification in image captioning and offer guidance for building more robust multimodal systems.