Unidade de Investigação - INESCC-DL – Instituto de Engenharia de Sistemas e Computadores de Coimbra [delegação Politécnico de Leiria]
- DOTS: Drift Oriented Tool System
  Publication. Antunes, Mário; Costa, Joana; Silva, Catarina; Ribeiro, Bernardete
  Drift is a given in most machine learning applications. The idea that models must accommodate change, and thus be dynamic, is ubiquitous. Current challenges include temporal data streams, drift and non-stationary scenarios, often with text data, whether in social networks or in business systems. There are multiple types of drift pattern: concepts may appear and disappear suddenly, recurrently, gradually or incrementally. Researchers strive to propose and test algorithms and techniques to deal with drift in text classification, but adequate benchmarks are difficult to find in such dynamic environments. In this paper we present DOTS, Drift Oriented Tool System, a framework for defining and generating text-based datasets in which drift characteristics can be thoroughly defined, implemented and tested. The usefulness of DOTS is demonstrated with a Twitter stream case study, in which DOTS is used to define datasets and to test the effectiveness of different document representations. Results show the potential of DOTS in machine learning research.
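
  The drift patterns named in the abstract can be pictured as per-window frequency profiles for a single concept. The sketch below is only an illustration under assumed definitions; it is not the DOTS interface, and the function names (`sudden`, `incremental`, `recurrent`) and their parameters are hypothetical. It covers three of the listed patterns, each returning how many documents of a concept would be placed in each time window.

```python
from typing import List


def sudden(n_windows: int, start: int, level: int) -> List[int]:
    """Assumed 'sudden' drift: the concept is absent, then appears
    at full frequency from window `start` onward."""
    return [level if t >= start else 0 for t in range(n_windows)]


def incremental(n_windows: int, start: int, end: int, level: int) -> List[int]:
    """Assumed 'incremental' drift: frequency rises linearly from 0 at
    window `start` to `level` at window `end` (integer division)."""
    profile = []
    for t in range(n_windows):
        if t < start:
            profile.append(0)
        elif t >= end:
            profile.append(level)
        else:
            profile.append(level * (t - start) // (end - start))
    return profile


def recurrent(n_windows: int, period: int, duration: int, level: int) -> List[int]:
    """Assumed 'recurrent' drift: the concept is present for `duration`
    windows at the start of every `period` windows."""
    return [level if t % period < duration else 0 for t in range(n_windows)]


if __name__ == "__main__":
    print(sudden(5, start=2, level=3))
    print(incremental(6, start=1, end=5, level=8))
    print(recurrent(6, period=3, duration=1, level=2))
```

  A dataset generator could then sample that many documents of the concept for each window; how DOTS itself specifies these profiles is not stated in the abstract.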
