Choice of Best Samples for Building Ensembles in Dynamic Environments

Costa, Joana; Silva, Catarina; Antunes, Mário; Ribeiro, Bernardete

http://hdl.handle.net/10400.8/13700

Abstract(s)

Machine learning approaches often focus on optimizing the algorithm rather than assuring that the source data is as rich as possible. However, when it is possible to enhance the input examples to construct models, one should consider it thoroughly. In this work, we propose a technique to define the best set of training examples using dynamic ensembles in text classification scenarios. In dynamic environments, where new data is constantly appearing, old data is usually disregarded, but sometimes some of those disregarded examples may carry substantial information. We propose a method that determines the most relevant examples by analysing their behaviour when defining separating planes or thresholds between classes. Those examples, deemed better than others, are kept for a longer time-window than the rest. Results on a Twitter scenario show that keeping those examples enhances the final classification performance.

Keywords

Dynamic Environments; Ensembles; Drift; Text Classification; Social Networks

Citation

Costa, J., Silva, C., Antunes, M., Ribeiro, B. (2016). Choice of Best Samples for Building Ensembles in Dynamic Environments. In: Jayne, C., Iliadis, L. (eds) Engineering Applications of Neural Networks. EANN 2016. Communications in Computer and Information Science, vol 629. Springer, Cham. https://doi.org/10.1007/978-3-319-44188-7_3