Browse by author "Miragaia, Rolando"
Showing 1 - 10 of 13
- Applying deep learning to real-time UAV-based forest monitoring: Leveraging multi-sensor imagery for improved results
  Publication. Marques, Tomás; Carreira, Samuel; Miragaia, Rolando; Ramos, João; Pereira, António
  Rising global fire incidents necessitate effective solutions, with forest surveillance emerging as a crucial strategy. This paper proposes a complete solution using technology that integrates visible and infrared spectrum images through Unmanned Aerial Vehicles (UAVs) for enhanced detection of people and vehicles in forest environments. Unlike existing computer vision models relying on single-sensor imagery, this approach overcomes limitations posed by limited spectrum coverage, particularly addressing challenges in low-light conditions, fog, or smoke. The developed 4-channel model uses both types of images to take advantage of the strengths of each one simultaneously. This article presents the development and implementation of a solution for forest monitoring ranging from the transmission of images captured by a UAV to their analysis with an object detection model without human intervention. This model consists of a new version of the YOLOv5 (You Only Look Once) architecture. After the model analyzes the images, the results can be observed on a web platform on any device, anywhere in the world. For the model training, a dataset with thermal and visible images from the aerial perspective was captured with a UAV. From the development of this proposal, a new 4-channel model was created, presenting a substantial increase in precision and mAP (Mean Average Precision) metrics compared to traditional SOTA (state-of-the-art) models that only make use of red, green, and blue (RGB) images. Allied with the increase in precision, we confirmed the hypothesis that our model would perform better in conditions unfavorable to RGB images, identifying objects in situations with low light and reduced visibility with partial occlusions. With the model’s training using our dataset, we observed a significant increase in the model’s performance for images in the aerial perspective. This study introduces a modular system architecture featuring key modules: multisensor image capture, transmission, processing, analysis, and results presentation. Powered by an innovative object detection deep-learning model, these components collaborate to enable real-time, efficient, and distributed forest monitoring across diverse environments.
- Artificial intelligence applied to the stone manufacturing industry: A systematic literature review
  Publication. Santos Silva, Alexandre; Antunes, Carolina; Miragaia, Rolando; Costa, Rogério Luís C.; Silva, Fernando; Ribeiro, José
  Natural stone has long been used in construction, as its properties provide functional and visual value, and the natural stone market currently holds significant importance in the global economy. It is important to consider integrating new technologies in the production chain to aid the industry in moving forward, increasing profit margins and reducing wasted material. This article reviews recent trends in using Artificial Intelligence and Machine Learning techniques in the industry between 2017 and 2024, following a methodology for Systematic Literature Reviews in computer science. It was found that extensive research has been conducted on the subject of tile classification, with solid solutions proposed, achieving results that can be considered robust enough for industrial application. Other subjects comprise tasks regarding stone cutting and defect detection, as well as variable prediction, and quarry activity monitoring. Some authors propose solutions to integrate new technologies into the complete production chain. While more research needs to be done on specific subjects, this review provides a solid first step to future research.
- Cartesian genetic programming applied to pitch estimation of piano notes
  Publication. Inacio, Tiago; Miragaia, Rolando; Reis, Gustavo; Grilo, Carlos; Fernandez, Francisco
  Pitch Estimation, also known as Fundamental Frequency (F0) estimation, has been a popular research topic for many years, and is still investigated nowadays. This paper presents a novel approach to the problem of Pitch Estimation, using Cartesian Genetic Programming (CGP). We take advantage of evolutionary algorithms, in particular CGP, to evolve mathematical functions that act as classifiers. These classifiers are used to identify piano notes' pitches in an audio signal. For a first approach, the obtained results are very promising: our error rate outperforms two of three state-of-the-art pitch estimators.
- CGP4Matlab - A Cartesian Genetic Programming MATLAB Toolbox for Audio and Image Processing
  Publication. Miragaia, Rolando; Jorge dos Reis, Gustavo Miguel; Fernandéz, Francisco; Inácio, Tiago; Grilo, Carlos
  This paper presents and describes CGP4Matlab, a powerful toolbox that allows running Cartesian Genetic Programming within MATLAB. This toolbox is particularly suited for signal processing and image processing problems. The implementation of CGP4Matlab, which can be freely downloaded, is described. Some encouraging results on the problem of pitch estimation of musical piano notes achieved using this toolbox are also presented. Pitch estimation of audio signals is a very hard problem with still no generic and robust solution found. Due to the high flexibility of CGP4Matlab, we managed to apply a new Cartesian Genetic Programming based approach to the problem of pitch estimation. The obtained results are comparable with the state-of-the-art algorithms. © Springer International Publishing AG, part of Springer Nature 2018.
- Driving Behavior Classification Using a ConvLSTM
  Publication. Pingo, Alberto; Castro, João; Loureiro, Paulo; Mendes, Silvio; Bernardino, Anabela; Miragaia, Rolando; Husyeva, Iryna
  This work explores the classification of driving behaviors using a hybrid deep learning model that combines Convolutional Neural Networks (CNNs) with Long Short-Term Memory (LSTM) networks (ConvLSTM). Sensor data are collected from a smartphone application and undergo a preprocessing pipeline, including data normalization, labeling, and feature extraction, to enhance the model’s performance. By capturing temporal and spatial dependencies within driving patterns, the proposed ConvLSTM model effectively differentiates between normal and aggressive driving behaviors. The model is trained and evaluated against traditional stacked LSTM and Bidirectional LSTM (BiLSTM) architectures, demonstrating superior accuracy and robustness. Experimental results confirm that the preprocessing techniques improve classification performance, ensuring high reliability in driving behavior recognition. The novelty of this work lies in a simple data preprocessing methodology combined with the specific application scenario. By enhancing data quality before feeding it into the AI model, we improve classification accuracy and robustness. The proposed framework not only optimizes model performance but also demonstrates practical feasibility, making it a strong candidate for real-world deployment.
- Evolving a Multi-Classifier System for Multi-Pitch Estimation of Piano Music and Beyond: An Application of Cartesian Genetic Programming
  Publication. Miragaia, Rolando; Fernández, Francisco; Reis, Gustavo; Inácio, Tiago
  This paper presents a new method with a set of desirable properties for multi-pitch estimation of piano recordings. We propose a framework based on a set of classifiers to analyze audio input and to identify piano notes present in a given audio signal. Our system’s classifiers are evolved using Cartesian genetic programming: we take advantage of Cartesian genetic programming to evolve a set of mathematical functions that act as independent classifiers for piano notes. Two significant improvements are described: the use of a harmonic mask for better fitness values and a data augmentation process for improving the training stage. The proposed approach achieves competitive results using F-measure metrics when compared to state-of-the-art algorithms. Then, we go beyond piano and show how it can be directly applied to other musical instruments, achieving even better results. Our system’s architecture is also described to show the feasibility of its parallelization and its implementation as a real-time system. Our methodology is also a white-box optimization approach that allows for clear analysis of the solutions found and for researchers to learn and test improvements based on the new findings.
- Evolving a multi-classifier system with cartesian genetic programming for multi-pitch estimation of polyphonic piano music
  Publication. Miragaia, Rolando; Vega, Francisco Fernandez de; Reis, Gustavo
  This paper presents a new method for multi-pitch estimation on piano recordings. We propose a framework based on a set of classifiers to analyze the audio input and identify the piano notes present in the given audio signal. Our system's classifiers were evolved using Cartesian Genetic Programming: we take advantage of Cartesian Genetic Programming to evolve a set of mathematical functions that act as independent classifiers for piano notes. Our latest improvements are also presented, including test results using F-measure metrics. Our system architecture is also described to show the feasibility of its parallelization and implementation as a real-time system. The proposed approach achieved competitive results when compared to the state of the art.
- Explainable prototype-based image classification using adaptive feature extractors in medical images
  Publication. Vasconcellos, Nicolas; Tavora, Luis M. N.; Miragaia, Rolando; Grilo, Carlos; Thomaz, Lucas
  Prototype-based classifiers are a category of Explainable Artificial Intelligence methods that use representative samples from the data, called prototypes, to classify new inputs based on a similarity criterion. However, these methods often rely on pre-trained Convolutional Neural Networks as feature extractors, which may not be adapted for the specific type of data being used, thus not suited for identifying the most representative prototypes. In this paper, we propose a method named Explainable Prototype-based Image Classification, a cluster-oriented training strategy that enhances the performance and explainability of prototype-based classifiers. Our method uses a novel loss function, called Cluster Density Error, to fine-tune the feature extractor and preserve the most representative feature vectors in the latent space. We also use a Principal Component Analysis-based approach to reduce the dimensionality and complexity of the feature vectors. We conduct experiments on four medical image datasets and compare the results with those from different prototype-based classifiers and state-of-the-art non-explainable learning methods. The proposed method demonstrated superior explainable capabilities and comparable classification performance to the compared methods. Specifically, the proposed method achieved up to 95.01% accuracy and 0.992 AUC using only 43 prototypes. This translated to an improvement in accuracy and AUC score of 21.54% and 9.06%, respectively, and a substantial reduction in the number of prototypes by 98.38%.
- INTU-AI: Digitalization of Police Interrogation Supported by Artificial Intelligence
  Publication. Garcia, José António; Grilo, Carlos; Domingues, Patrício; Miragaia, Rolando
  Traditional police interrogation processes remain largely time-consuming and reliant on substantial human effort for both analysis and documentation. Intuition Artificial Intelligence (INTU-AI) is a Windows application designed to digitalize the administrative workflow associated with police interrogations, while enhancing procedural efficiency through the integration of AI-driven emotion recognition models. The system employs a multimodal approach that captures and analyzes emotional states using three primary vectors: Facial Expression Recognition (FER), Speech Emotion Recognition (SER), and Text-based Emotion Analysis (TEA). This triangulated methodology aims to identify emotional inconsistencies and detect potential suppression or concealment of affective responses by interviewees. INTU-AI serves as a decision-support tool rather than a replacement for human judgment. By automating bureaucratic tasks, it allows investigators to focus on critical aspects of the interrogation process. The system was validated in practical training sessions with inspectors and with a 12-question questionnaire. The results indicate a strong acceptance of the system in terms of its usability, existing functionalities, practical utility of the program, user experience, and open-ended qualitative responses.
- Multi Pitch Estimation of Piano Music using Cartesian Genetic Programming with Spectral Harmonic Mask
  Publication. Miragaia, Rolando; Reis, Gustavo; Fernandéz de Vega, Francisco; Chávez, Francisco
  Piano note recognition, or pitch estimation of piano notes, has been a popular research topic for many years, and is still investigated nowadays. It is a fundamental task during the process of automatic music transcription (extracting the musical score from an acoustic signal). We take advantage of Cartesian Genetic Programming (CGP) to evolve mathematical functions that act as independent classifiers for piano notes. These classifiers are then used to identify the presence of piano notes in polyphonic audio signals. This paper describes our technique and the latest improvements made in our research. The main feature is the introduction of spectral harmonic masks in the binarization process for measuring the fitness values, which improved the classification rate by 10% in the mean F-measure result. Our system architecture is also described to show the feasibility of its parallelization, which will reduce the computing time.
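
Several of the pitch-estimation abstracts above report results as an F-measure computed after a binarization step. As a rough illustration only, the sketch below shows how real-valued outputs of a single note classifier might be thresholded and scored against reference labels. The function names `binarize` and `f_measure`, the fixed threshold of 0.5, and the sample values are assumptions made for this example; they are not taken from the papers, and the spectral harmonic mask itself is not modeled here.

```python
def binarize(outputs, threshold=0.5):
    """Turn real-valued classifier outputs into 0/1 note decisions.

    The threshold value is an assumption chosen for illustration.
    """
    return [1 if value >= threshold else 0 for value in outputs]


def f_measure(predicted, reference):
    """Return (precision, recall, F-measure) for two equal-length 0/1 sequences."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    pairs = list(zip(predicted, reference))
    true_pos = sum(1 for p, r in pairs if p == 1 and r == 1)
    false_pos = sum(1 for p, r in pairs if p == 1 and r == 0)
    false_neg = sum(1 for p, r in pairs if p == 0 and r == 1)
    # Guard against division by zero when nothing is predicted or nothing is present.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical per-frame outputs of one note classifier and the true labels.
    outputs = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]
    reference = [1, 0, 0, 1, 1, 0]
    print(f_measure(binarize(outputs), reference))
```

In a multi-classifier setting, one such score could be computed per note and then averaged, which is one plausible reading of a "mean F-measure"; the papers should be consulted for the exact evaluation protocol.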
