Browse by author "Miragaia, Rolando"
- Applying deep learning to real-time UAV-based forest monitoring: Leveraging multi-sensor imagery for improved results
  Publication. Marques, Tomás; Carreira, Samuel; Miragaia, Rolando; Ramos, João; Pereira, António
  Rising global fire incidents necessitate effective solutions, with forest surveillance emerging as a crucial strategy. This paper proposes a complete solution using technology that integrates visible and infrared spectrum images through Unmanned Aerial Vehicles (UAVs) for enhanced detection of people and vehicles in forest environments. Unlike existing computer vision models relying on single-sensor imagery, this approach overcomes limitations posed by limited spectrum coverage, particularly addressing challenges in low-light conditions, fog, or smoke. The developed 4-channel model uses both types of images to take advantage of the strengths of each one simultaneously. This article presents the development and implementation of a solution for forest monitoring ranging from the transmission of images captured by a UAV to their analysis with an object detection model without human intervention. This model consists of a new version of the YOLOv5 (You Only Look Once) architecture. After the model analyzes the images, the results can be observed on a web platform on any device, anywhere in the world. For the model training, a dataset with thermal and visible images from the aerial perspective was captured with a UAV. From the development of this proposal, a new 4-channel model was created, presenting a substantial increase in precision and mAP (Mean Average Precision) metrics compared to traditional SOTA (state-of-the-art) models that only make use of red, green, and blue (RGB) images. Allied with the increase in precision, we confirmed the hypothesis that our model would perform better in conditions unfavorable to RGB images, identifying objects in situations with low light and reduced visibility with partial occlusions. With the model’s training using our dataset, we observed a significant increase in the model’s performance for images in the aerial perspective. This study introduces a modular system architecture featuring key modules: multisensor image capture, transmission, processing, analysis, and results presentation. Powered by an innovative object detection deep-learning model, these components collaborate to enable real-time, efficient, and distributed forest monitoring across diverse environments.
- Artificial intelligence applied to the stone manufacturing industry: A systematic literature review
  Publication. Santos Silva, Alexandre; Antunes, Carolina; Miragaia, Rolando; Costa, Rogério Luís C.; Silva, Fernando; Ribeiro, José
  Natural stone has long been used in construction, as its properties provide functional and visual value, and the natural stone market currently holds significant importance in the global economy. It is important to consider integrating new technologies in the production chain to aid the industry in moving forward, increasing profit margins and reducing wasted material. This article reviews recent trends in using Artificial Intelligence and Machine Learning techniques in the industry between 2017 and 2024, following a methodology for Systematic Literature Reviews in computer science. It was found that extensive research has been conducted on the subject of tile classification, with solid solutions proposed, achieving results that can be considered robust enough for industrial application. Other subjects comprise tasks regarding stone cutting and defect detection, as well as variable prediction, and quarry activity monitoring. Some authors propose solutions to integrate new technologies into the complete production chain. While more research needs to be done on specific subjects, this review provides a solid first step to future research.
- Cartesian genetic programming applied to pitch estimation of piano notes
  Publication. Inacio, Tiago; Miragaia, Rolando; Reis, Gustavo; Grilo, Carlos; Fernandez, Francisco
  Pitch Estimation, also known as Fundamental Frequency (F0) estimation, has been a popular research topic for many years, and is still investigated nowadays. This paper presents a novel approach to the problem of Pitch Estimation, using Cartesian Genetic Programming (CGP). We take advantage of evolutionary algorithms, in particular CGP, to evolve mathematical functions that act as classifiers. These classifiers are used to identify piano notes' pitches in an audio signal. For a first approach, the obtained results are very promising: our error rate outperforms two of three state-of-the-art pitch estimators.
- CGP4Matlab - A Cartesian Genetic Programming MATLAB Toolbox for Audio and Image Processing
  Publication. Miragaia, Rolando; Jorge dos Reis, Gustavo Miguel; Fernandéz, Francisco; Inácio, Tiago; Grilo, Carlos
  This paper presents and describes CGP4Matlab, a powerful toolbox for running Cartesian Genetic Programming within MATLAB. This toolbox is particularly suited for signal processing and image processing problems. The implementation of CGP4Matlab, which can be freely downloaded, is described. Some encouraging results on the problem of pitch estimation of musical piano notes achieved using this toolbox are also presented. Pitch estimation of audio signals is a very hard problem for which no generic and robust solution has yet been found. Owing to the high flexibility of CGP4Matlab, we managed to apply a new Cartesian Genetic Programming-based approach to the problem of pitch estimation. The obtained results are comparable with the state-of-the-art algorithms. © Springer International Publishing AG, part of Springer Nature 2018.
- Driving Behavior Classification Using a ConvLSTM
  Publication. Pingo, Alberto; Castro, João; Loureiro, Paulo; Mendes, Silvio; Bernardino, Anabela; Miragaia, Rolando; Husyeva, Iryna
  This work explores the classification of driving behaviors using a hybrid deep learning model that combines Convolutional Neural Networks (CNNs) with Long Short-Term Memory (LSTM) networks (ConvLSTM). Sensor data are collected from a smartphone application and undergo a preprocessing pipeline, including data normalization, labeling, and feature extraction, to enhance the model’s performance. By capturing temporal and spatial dependencies within driving patterns, the proposed ConvLSTM model effectively differentiates between normal and aggressive driving behaviors. The model is trained and evaluated against traditional stacked LSTM and Bidirectional LSTM (BiLSTM) architectures, demonstrating superior accuracy and robustness. Experimental results confirm that the preprocessing techniques improve classification performance, ensuring high reliability in driving behavior recognition. The novelty of this work lies in a simple data preprocessing methodology combined with the specific application scenario. By enhancing data quality before feeding it into the AI model, we improve classification accuracy and robustness. The proposed framework not only optimizes model performance but also demonstrates practical feasibility, making it a strong candidate for real-world deployment.
- Explainable prototype-based image classification using adaptive feature extractors in medical images
  Publication. Vasconcellos, Nicolas; Tavora, Luis M. N.; Miragaia, Rolando; Grilo, Carlos; Thomaz, Lucas
  Prototype-based classifiers are a category of Explainable Artificial Intelligence methods that use representative samples from the data, called prototypes, to classify new inputs based on a similarity criterion. However, these methods often rely on pre-trained Convolutional Neural Networks as feature extractors, which may not be adapted to the specific type of data being used and are thus not suited to identifying the most representative prototypes. In this paper, we propose a method named Explainable Prototype-based Image Classification, a cluster-oriented training strategy that enhances the performance and explainability of prototype-based classifiers. Our method uses a novel loss function, called Cluster Density Error, to fine-tune the feature extractor and preserve the most representative feature vectors in the latent space. We also use a Principal Component Analysis-based approach to reduce the dimensionality and complexity of the feature vectors. We conduct experiments on four medical image datasets and compare the results with those from different prototype-based classifiers and state-of-the-art non-explainable learning methods. The proposed method demonstrated superior explainable capabilities and comparable classification performance to the compared methods. Specifically, the proposed method achieved up to 95.01% accuracy and 0.992 AUC using only 43 prototypes. This translated to an improvement in accuracy and AUC score of 21.54% and 9.06%, respectively, and a substantial reduction in the number of prototypes by 98.38%.
- INTU-AI: Digitalization of Police Interrogation Supported by Artificial Intelligence
  Publication. Garcia, José António; Grilo, Carlos; Domingues, Patrício; Miragaia, Rolando
  Traditional police interrogation processes remain largely time-consuming and reliant on substantial human effort for both analysis and documentation. Intuition Artificial Intelligence (INTU-AI) is a Windows application designed to digitalize the administrative workflow associated with police interrogations, while enhancing procedural efficiency through the integration of AI-driven emotion recognition models. The system employs a multimodal approach that captures and analyzes emotional states using three primary vectors: Facial Expression Recognition (FER), Speech Emotion Recognition (SER), and Text-based Emotion Analysis (TEA). This triangulated methodology aims to identify emotional inconsistencies and detect potential suppression or concealment of affective responses by interviewees. INTU-AI serves as a decision-support tool rather than a replacement for human judgment. By automating bureaucratic tasks, it allows investigators to focus on critical aspects of the interrogation process. The system was validated in practical training sessions with inspectors and with a 12-question questionnaire. The results indicate a strong acceptance of the system in terms of its usability, existing functionalities, practical utility of the program, user experience, and open-ended qualitative responses.
- Multi Pitch Estimation of Piano Music using Cartesian Genetic Programming with Spectral Harmonic Mask
  Publication. Miragaia, Rolando; Reis, Gustavo; Fernandéz de Vega, Francisco; Chávez, Francisco
  Piano note recognition, or pitch estimation of piano notes, has been a popular research topic for many years, and is still investigated nowadays. It is a fundamental task in automatic music transcription (extracting the musical score from an acoustic signal). We take advantage of Cartesian Genetic Programming (CGP) to evolve mathematical functions that act as independent classifiers for piano notes. These classifiers are then used to identify the presence of piano notes in polyphonic audio signals. This paper describes our technique and the latest improvements made in our research. The main feature is the introduction of spectral harmonic masks in the binarization process used to measure the fitness values, which has improved the classification rate by 10% in the mean F-measure. Our system architecture is also described to show the feasibility of its parallelization, which will reduce the computing time.
- Plum Ripeness Analysis in Real Environments Using Deep Learning with Convolutional Neural Networks
  Publication. Miragaia, Rolando; Chávez, Francisco; Díaz, Josefa; Vivas, Antonio; Prieto, Maria Henar; Moñino, Maria José
  Digitization and technological transformation in agriculture are no longer something of the future, but of the present. Many crops are being managed using sophisticated sensors that allow farmers to know the status of their crops at all times. This modernization also allows for better quality harvests as well as significant cost savings. In this study, we present a tool based on Deep Learning that analyses images of different varieties of plums to identify the variety and its ripeness status. The novelty of the system lies in the conditions under which the designed algorithm can work. An uncontrolled photographic acquisition method has been implemented: the user can take a photograph with any device (smartphone, camera, etc.) directly in the field, regardless of light conditions, focus, etc. The robustness of the system allows us to differentiate, with 92.83% effectiveness, three varieties of plums through images taken directly in the field, with values above 94% when the ripening stage of each variety is analyzed independently. We have worked with three varieties of plums, Red Beaut, Black Diamond and Angeleno, with different ripening cycles. This has allowed us to obtain a robust classification system that will allow users to differentiate between these varieties and subsequently determine the ripening stage of the particular variety.
- Synthetic image generation for effective deep learning model training for ceramic industry applications
  Publication. Gaspar, Fábio; Carreira, Daniel; Rodrigues, Nuno; Miragaia, Rolando; Costa, Paulo; Pereira, António; Ribeiro, José
  In the rapidly evolving field of machine learning engineering, access to large, high-quality, and well-balanced labeled datasets is indispensable for accurate product classification. This necessity holds particular significance in sectors such as the ceramics industry, in which effective production line activities are paramount and deep learning classification mechanisms are particularly relevant for streamlining processes; but real-world image samples are scarce and difficult to obtain, hindering dataset building and consequently model training and deployment. This paper presents a novel approach for dataset building in the context of the ceramic industry, which involves employing synthetic images for building or complementing datasets for image classification problems. The proposed methodology was implemented in CeramicFlow, an innovative computer graphics rendering pipeline designed to create synthetic images by employing computer-aided design models of ceramic objects and incorporating domain randomization techniques. As a result, a fully synthetic image dataset named Synthetic CeramicNet was created and validated in real-world ceramic classification problems. The results demonstrate that synthetic images provide an adequate basis for datasets and can significantly reduce reliance on real-world data when developing deep learning approaches for image classification problems in the ceramic industry. Furthermore, the proposed approach can potentially be applied to other industrial fields.
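
The last entry above mentions domain randomization as part of a synthetic image rendering pipeline. As a rough illustration of the general idea only, the sketch below draws random scene settings for each render so that a balanced dataset covers varied lighting, viewpoints, and backgrounds. It is not CeramicFlow's implementation: the `SceneParams` fields, the value ranges, the helper names, and the class labels ("plate", "mug", "bowl") are all hypothetical assumptions, and no actual rendering is performed.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneParams:
    """Hypothetical per-render settings (assumed, not CeramicFlow's real parameters)."""
    light_intensity: float       # relative brightness multiplier
    camera_azimuth_deg: float    # camera angle around the object
    camera_elevation_deg: float  # camera height angle above the table plane
    object_rotation_deg: float   # rotation of the CAD model about its vertical axis
    background_id: int           # index into an assumed pool of backgrounds


def sample_scene_params(rng: random.Random, n_backgrounds: int = 10) -> SceneParams:
    """Draw one random scene configuration; all ranges are illustrative assumptions."""
    return SceneParams(
        light_intensity=rng.uniform(0.5, 1.5),
        camera_azimuth_deg=rng.uniform(0.0, 360.0),
        camera_elevation_deg=rng.uniform(10.0, 80.0),
        object_rotation_deg=rng.uniform(0.0, 360.0),
        background_id=rng.randrange(n_backgrounds),
    )


def build_render_plan(class_names, renders_per_class, seed=0):
    """Return (class_name, SceneParams) pairs with the same count for every class."""
    rng = random.Random(seed)  # seeded so the plan is reproducible
    return [
        (name, sample_scene_params(rng))
        for name in class_names
        for _ in range(renders_per_class)
    ]


# Hypothetical class labels; a real pipeline would pass each entry to a renderer.
plan = build_render_plan(["plate", "mug", "bowl"], renders_per_class=4, seed=42)
for class_name, params in plan[:2]:
    print(class_name, params)
```

Seeding the generator keeps a render plan reproducible, and sampling the same number of configurations per class is one simple way to obtain the well-balanced dataset the abstract calls for.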
