ESTG - Mestrado em Engenharia Eletrotécnica - Energia e Automação
Browsing ESTG - Mestrado em Engenharia Eletrotécnica - Energia e Automação by advisor "Coelho, Paulo Jorge Simões"
- Automatização de uma Máquina para a Indústria Vidreira
  Publication . Lopes, Gustavo Carreira Vidigueira; Coelho, Paulo Jorge Simões
- Comunicações RFID para Identificação e Controlo de Acessos
  Publication . Peixoto, Rui Miguel Baptista; Gomes, Hugo Miguel Cravo; Coelho, Paulo Jorge Simões
  The problem of identifying people, although not a new topic, is in constant development. Technological progress over the last two decades, especially in microcontroller electronics, has brought a range of solutions, both in the methods used to identify people and in their applications. This project contributes the development of an identification device that can be used to record student attendance in the classroom. Several identification technologies were used to address the needs and requirements of each of its users: the lecturer and the students. Two main areas of work were defined: the study and selection of the identification technologies to use, and the forms of user interface. For the identification technologies, the aim was to diversify the methods: RFID and NFC were used for identification by means of identifying objects, and fingerprint recognition was used for biometric identification. For the user interface, a smartphone application was developed for entering data on students and course units. A graphical interface was also built on the device itself for viewing all the data entered and for navigating between the system's options. In addition, hardware was developed to accommodate all the connections and external devices, along with all the software needed to make the resources compatible and interconnected.
- Deep Learning applied to Visual Speech Recognition
  Publication . Santos, Carlos Manuel Simões dos; Coelho, Paulo Jorge Simões; Cunha, António Manuel Trigueiros da Silva
  Visual Speech Recognition (VSR), or Automatic Lip-Reading (ALR), the artificial process used to infer visemes, words, or sentences from video inputs, is efficient yet far from being a day-to-day tool. With the evolution of deep learning models and the proliferation of databases (DB), vocabularies increase in quality and quantity. Large DBs feed end-to-end deep learning (DL) models that extract speech solely from visual recognition of the speaker’s lip movements. However, producing a large DB requires substantial resources that are unavailable to most ALR researchers, which hinders progress on a larger scale. This dissertation contributes to the development of ALR by diversifying the training data on which DL depends. This includes producing a new DB, in the Portuguese language, capable of state-of-the-art (SOTA) performance. As DL only shows SOTA performance if trained on a large DB, whose resources are outside the scope of this dissertation, a knowledge-leveraging method emerges as a necessary subsequent objective. A large DB and a SOTA model are selected and used as templates, from which a smaller DB (LusaPt) is created, comprising 100 phrases by 10 speakers, uttering 50 typical Portuguese digits and words, recorded and processed with day-to-day equipment. After being pre-trained on the SOTA DB, the new model is then fine-tuned on the new DB. To validate LusaPt, the performance of the new model is compared with that of the SOTA model. Results reveal that, if the same video is repeatedly submitted to the same model, the same prediction is obtained. Tests also show a clear increase in the word recognition rate (WRR), from 0% when inferring with the SOTA model with no further training on the new DB, to over 95% when inferring with the new model.
  Besides showing a “powerful belief” of the SOTA model in its predictions, this work also validates the new DB and its creation methodology. It reinforces that the transfer learning process is efficient in learning a new language, and therefore new words. Another contribution is to demonstrate that, with day-to-day equipment and limited human resources, it is possible to enrich the DB corpora and, ultimately, to positively impact the performance and future of Automatic Lip-Reading.
- Detection and Mosaicing through Deep Learning Models for Low-Quality Retinal Images
  Publication . Correia, Tales Veríssimo Souza; Coelho, Paulo Jorge Simões; Cunha, António Manuel Trigueiros da Silva
  Glaucoma is a severe eye disease that is asymptomatic in its initial stages and, because it is degenerative, can lead to blindness. There is no available cure for it, and it is the second most common cause of blindness in the world. Most people affected by it only discover the disease when it is already too late. Regular visits to the ophthalmologist, with a precise diagnosis performed using professional equipment, are the best way to prevent or contain it. For some individuals or populations, however, this can be difficult to accomplish because of several restrictions, such as low income, geographical adversities, and travel restrictions (distance, lack of means of transportation, etc.). Logistically, because of its size, relocating the professional equipment can also be expensive, which makes it unviable to bring it to remote areas. On the market, low-cost products like the D-Eye lens offer an alternative to meet this need. The D-Eye lens can be attached to a smartphone to capture fundus images, but its major drawback is lower image quality compared with professional equipment. This work presents and evaluates methods for eye reading with D-Eye recordings. This involves exposing the retina in two steps: object detection and summarization via object mosaicing. Deep learning methods, such as the YOLO family of architectures, were used as the object detector for retina registration. The summarization methods presented and evaluated in this work mosaiced the best retina images together to produce a more detailed resulting image.
  After the best workflow was selected from these methods, a final inference was performed and visually evaluated. The results were not rich enough to serve as a pre-screening medical assessment, indicating that improvements in the current algorithm and technology are needed to obtain better imaging.