Publication

Distributed Text Classification With an Ensemble Kernel-Based Learning Approach

datacite.subject.fos: Natural Sciences::Computer and Information Sciences
datacite.subject.fos: Engineering and Technology::Electrical, Electronic and Information Engineering
dc.contributor.author: Silva, Catarina
dc.contributor.author: Lotric, Uros
dc.contributor.author: Ribeiro, Bernardete
dc.contributor.author: Dobnikar, Andrej
dc.date.accessioned: 2025-11-13T18:58:48Z
dc.date.available: 2025-11-13T18:58:48Z
dc.date.issued: 2010
dc.description: Source: https://www.researchgate.net/publication/224108209_Distributed_Text_Classification_With_an_Ensemble_Kernel-Based_Learning_Approach
dc.description.abstract: Constructing a single text classifier that excels in any given application is a rather inviable goal. As a result, ensemble systems are becoming an important resource, since they permit the use of simpler classifiers and the integration of different knowledge in the learning process. However, many text-classification ensemble approaches have an extremely high computational burden, which poses limitations in applications in real environments. Moreover, state-of-the-art kernel-based classifiers, such as support vector machines and relevance vector machines, demand large resources when applied to large databases. Therefore, we propose the use of a new systematic distributed ensemble framework to tackle these challenges, based on a generic deployment strategy in a cluster distributed environment. We employ a combination of both task and data decomposition of the text-classification system, based on partitioning, communication, agglomeration, and mapping to define and optimize a graph of dependent tasks. Additionally, the framework includes an ensemble system where we exploit diverse patterns of errors and gain from the synergies between the ensemble classifiers. The ensemble data partitioning strategy used is shown to improve the performance of baseline state-of-the-art kernel-based machines. The experimental results show that the performance of the proposed framework outperforms standard methods both in speed and classification. [eng]
dc.description.sponsorship: This work was supported by the Ministry of Higher Education, Science and Technology of Slovenia, and the Ministry of Science, Technology and Higher Education of Portugal (2005–2007) under the Slovenia-Portugal Bilateral Scientific Cooperation Project. This paper was recommended by Associate Editor J. A. Keane.
dc.identifier.citation: Silva, Catarina & Lotric, Uros & Dobnikar, Andrej. (2010). Distributed Text Classification With an Ensemble Kernel-Based Learning Approach. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on. 40. 287 - 297. DOI: https://doi.org/10.1109/TSMCC.2009.2038280.
dc.identifier.doi: 10.1109/tsmcc.2009.2038280
dc.identifier.eissn: 1558-2442
dc.identifier.issn: 1094-6977
dc.identifier.uri: http://hdl.handle.net/10400.8/14614
dc.language.iso: eng
dc.peerreviewed: yes
dc.publisher: IEEE Canada
dc.relation.hasversion: https://ieeexplore.ieee.org/document/5398989
dc.relation.ispartof: IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)
dc.rights.uri: N/A
dc.subject: Distributed learning
dc.subject: ensembles
dc.subject: kernel-based machines
dc.subject: text classification
dc.title: Distributed Text Classification With an Ensemble Kernel-Based Learning Approach [eng]
dc.type: journal article
dspace.entity.type: Publication
oaire.citation.endPage: 11
oaire.citation.startPage: 1
oaire.citation.title: IEEE Transactions on Systems, Man, and Cybernetics: Systems
oaire.version: http://purl.org/coar/version/c_ab4af688f83e57aa
person.familyName: Silva
person.givenName: Catarina
person.identifier.ciencia-id: 1B19-3DDC-BE75
person.identifier.orcid: 0000-0002-5656-0061
relation.isAuthorOfPublication: ee28e079-5ca7-4842-9094-372c40f75c38
relation.isAuthorOfPublication.latestForDiscovery: ee28e079-5ca7-4842-9094-372c40f75c38

Files

Original bundle
Name:
Distributed text classification with an ensemble kernel-based learning approach.pdf
Size:
1.56 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
1.32 KB
Format:
Item-specific license agreed upon to submission