Search Results

Now showing 1 - 10 of 23
  • An automatic generation of textual pattern rules for digital content filters proposal, using grammatical evolution genetic programming
    Publication . Basto-Fernandes, Vitor; Yevseyeva, Iryna; Frantz, Rafael Z.; Grilo, Carlos Fernando Almeida; Díaz, Noemí Pérez; Emmerich, Michael
  • Characterising Enterprise Application Integration Solutions as Discrete-Event Systems
    Publication . Sawicki, Sandro; Frantz, Rafael Z.; Basto-Fernandes, Vitor; Roos-Frantz, Fabricia; Yevseyeva, Iryna; Corchuelo, Rafael
    Enterprises commonly have a software ecosystem composed of applications that were built using different technologies, data models, and operating systems, and that most often were not designed to exchange data and share functionality. Enterprise Application Integration provides methodologies and tools to design and implement integration solutions. State-of-the-art integration technologies provide a domain-specific language that enables the design of conceptual models for integration solutions. Analysing integration solutions to predict their behaviour and find possible performance bottlenecks is an important activity that contributes to increasing the quality of the delivered solutions; however, software engineers follow a costly, risky, and time-consuming approach. Integration solutions can be understood as discrete-event systems. This chapter introduces a new simulation-based approach that takes advantage of well-established techniques and tools for discrete-event simulation, cutting down the cost, risk, and time needed to deliver better integration solutions.
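The discrete-event view described in this abstract can be illustrated with a minimal sketch. It assumes a single integration task that processes messages one at a time in FIFO order with a fixed service time. The function name `simulate_task` and the event encoding are hypothetical and are not taken from the chapter.

```python
import heapq
from collections import deque

# Event kinds; DEPART sorts before ARRIVE when both occur at the same time.
DEPART, ARRIVE = 0, 1

def simulate_task(arrival_times, service_time):
    """Simulate one FIFO task; return {message index: completion time}."""
    events = [(t, ARRIVE, i) for i, t in enumerate(arrival_times)]
    heapq.heapify(events)
    waiting = deque()   # messages queued while the task is busy
    busy = False
    done = {}
    while events:
        now, kind, msg = heapq.heappop(events)
        if kind == ARRIVE:
            if busy:
                waiting.append(msg)
            else:
                busy = True
                heapq.heappush(events, (now + service_time, DEPART, msg))
        else:
            done[msg] = now
            if waiting:
                nxt = waiting.popleft()
                heapq.heappush(events, (now + service_time, DEPART, nxt))
            else:
                busy = False
    return done

# Three messages arriving at t = 0, 1, 2 with a service time of 2.
print(simulate_task([0, 1, 2], 2))  # {0: 2, 1: 4, 2: 6}
```

Comparing completion times with arrival times shows where messages queue up, which is the kind of bottleneck such a simulation is meant to expose.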
  • A spam filtering multi-objective optimization study covering parsimony maximization and three-way classification
    Publication . Basto-Fernandes, Vitor; Yevseyeva, Iryna; Méndez, José R.; Zhao, Jiaqi; Fdez-Riverola, Florentino; Emmerich, Michael T.M.
    Classifier performance optimization in machine learning can be stated as a multi-objective optimization problem. In this context, recent works have shown the utility of simple evolutionary multi-objective algorithms (NSGA-II, SPEA2) for optimizing the global performance of different anti-spam filters. The present work extends existing contributions in the spam filtering domain by using three novel indicator-based (SMS-EMOA, CH-EMOA) and decomposition-based (MOEA/D) evolutionary multi-objective algorithms. The proposed approaches are used to optimize the performance of a heterogeneous ensemble of classifiers in two different but complementary scenarios: parsimony maximization and e-mail classification under low confidence level. Experimental results using a publicly available standard corpus allowed us to draw conclusions regarding both the utility of rule-based classification filters and the appropriateness of a three-way classification system in the spam filtering domain.
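A three-way classification system, as mentioned in this abstract, can be sketched as a filter that abstains when its confidence is low. The sketch below assumes a single numeric spam score and two thresholds. The names `three_way_classify`, `lower` and `upper` are illustrative assumptions and do not come from the paper.

```python
def three_way_classify(score, lower, upper):
    """Label a message as 'ham', 'spam', or 'uncertain' from its score.

    Scores at or below `lower` are ham, scores at or above `upper` are
    spam, and anything in between is left undecided.
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if score >= upper:
        return "spam"
    if score <= lower:
        return "ham"
    return "uncertain"

# Hypothetical scores and thresholds, for illustration only.
for s in (1.0, 3.5, 6.0):
    print(s, three_way_classify(s, lower=2.0, upper=5.0))
```

Widening the band between the two thresholds trades fewer misclassifications for more messages left undecided, which is why the thresholds are natural decision variables in a multi-objective setting.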
  • Multiobjective optimization of classifiers by means of 3D convex-hull-based evolutionary algorithms
    Publication . Zhao, Jiaqi; Basto-Fernandes, Vitor; Jiao, Licheng; Yevseyeva, Iryna; Maulana, Asep; Li, Rui; Bäck, Thomas; Tang, Ke; Emmerich, Michael T.M.
    The receiver operating characteristic (ROC) and detection error tradeoff (DET) curves are frequently used in the machine learning community to analyze the performance of binary classifiers. Recently, the convex-hull-based multiobjective genetic programming algorithm was proposed and successfully applied to maximize the convex hull area for binary classification problems by minimizing false positive rate and maximizing true positive rate at the same time using indicator-based evolutionary algorithms. The area under the ROC curve was used for the performance assessment and to guide the search. Here we extend this research and propose two major advancements. First, we formulate the algorithm in detection error tradeoff space, minimizing false positives and false negatives, with the advantage that the misclassification cost tradeoff can be assessed directly. Second, we add complexity as an objective function, which gives rise to a 3D objective space (as opposed to the previous 2D ROC space). A domain-specific performance indicator for 3D Pareto front approximations, the volume above DET surface, is introduced and used to guide the indicator-based evolutionary algorithm to find optimal approximation sets. We assess the performance of the new algorithm on designed theoretical problems with different geometries of Pareto fronts and DET surfaces, and on two application-oriented benchmarks: (1) designing spam filters with low numbers of false rejects, false accepts, and low computational cost using rule ensembles, and (2) finding sparse neural networks for binary classification of test data from the UCI machine learning benchmark. The results show a high performance of the new algorithm as compared to conventional methods for multicriteria optimization.
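The 3D objective space described in this abstract (false positive rate, false negative rate, complexity, all minimized) can be illustrated with a plain non-dominated filter. This is only a sketch of the Pareto-dominance step. It does not implement the convex-hull-based algorithm or the volume-above-DET-surface indicator from the paper, and the data points are made up.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical classifiers as (false positive rate, false negative rate, complexity).
classifiers = [
    (0.10, 0.30, 5),
    (0.20, 0.20, 3),
    (0.20, 0.30, 5),   # dominated by both classifiers above
    (0.05, 0.50, 10),
]
print(pareto_front(classifiers))
# [(0.1, 0.3, 5), (0.2, 0.2, 3), (0.05, 0.5, 10)]
```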
  • RuleSIM: a toolkit for simulating the operation and improving throughput of rule-based spam filters
    Publication . Ruano-Ordás, David; Fdez-Glez, Jorge; Fdez-Riverola, Florentino; Basto-Fernandes, Vitor; Méndez, José Ramón
    This paper introduces RuleSIM, a toolkit comprising different simulation tools specifically designed to aid researchers concerned with spam-filtering throughput. RuleSIM makes it easy to design, develop, simulate and compare new scheduling heuristics using different filters and sets of e-mails. Simulation results can be both analysed graphically, using different complementary views, and compared quantitatively through several measures. Moreover, the underlying RuleSIM API can be easily integrated with third-party Java optimization platforms to facilitate debugging and achieve better configurations for rule scheduling. RuleSIM is free software distributed under the terms of the GNU Lesser General Public License, and both source code and documentation are publicly available at https://github.com/rulesim/v2.0. Copyright © 2015 John Wiley & Sons, Ltd.
  • A survey of diversity-oriented optimization
    Publication . Basto-Fernandes, Vitor; Yevseyeva, Iryna; Emmerich, Michael
  • Information architecture for IS function : a case study
    Publication . Carriço, Nelson; Varajão, João; Basto-Fernandes, Vitor; Dominguez, Caroline
    Today’s complex, unstable and competitive society poses several difficulties for organisations. In this context, Information and Communications Technologies (ICT) and information itself have become resources of vital importance. The pressing need for Information Systems (IS) to meet several business requirements, in addition to the complexity involved in technology and business management, makes the IS Function one of the main areas of influence for the success of modern organisations. Through its capacity to represent activities, management objects and corresponding relations, the Information Architecture of the Information Systems Function (IAISF), a technique derived from the well-known Information Architecture but exclusively focused on the Information Systems Function (ISF), allows not only the conceptualisation and understanding of the ISF itself, but also of its interactions with other areas within organisations. This paper presents the main results of a case study related to the application of the IAISF technique in a computer service centre of a university.
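  • Desafios para a implantação de soluções de integração de aplicações empresariais em provedores de computação em nuvem [Challenges for deploying enterprise application integration solutions on cloud computing providers]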
    Publication . Frantz, Rafael; Sawicki, Sandro; Roos-Frantz, Fabricia; Corchuelo, Rafael; Basto-Fernandes, Vitor; Hernández, Inma
    In recent years, the field of study known as Enterprise Application Integration has played an important role by providing methodologies, techniques and tools that allow companies to develop integration solutions, aiming to reuse their applications and support the new demands that arise as their business processes evolve. Cloud Computing is part of a new reality in which both small and large companies have at their disposal a high-capacity, low-cost IT infrastructure on which they can deploy and run their integration solutions. The billing model adopted by Cloud Computing providers is based on the amount of computational resources consumed by an integration solution. These resources can be determined in essentially two ways: from the actual execution of an integration solution on an orchestration engine, or from the simulation of the conceptual model that describes the solution, without the solution having to be implemented beforehand. Furthermore, it is desirable that providers supply conceptual models that describe in detail the variability of services and the constraints among them. A review of the technical and scientific literature shows that there are no methodologies, techniques and tools for estimating the demand for computational resources consumed by integration solutions from their conceptual models. Moreover, Cloud Computing providers do not have, or do not make available, conceptual models of the services that can be contracted. These issues form the basis for establishing a process and developing decision-support tools for deploying enterprise application integration solutions on Cloud Computing providers.
  • Semantic Integration Framework for Resources Identification
    Publication . Pragosa, Miguel; Basto-Fernandes, Vitor; Oliveira, Luísa
    The global adoption of information technology systems throughout all activity domains has led to the existence of a huge number of information systems that are unable to interoperate, first of all because of the different ways in which they identify resources. Different transport and application level protocols for data exchange also raise some interoperability difficulties. Common interoperability scenarios rely on tightly controlled, specific communities of information technology islands. In this chapter we present and discuss the usage of lexical, syntactic and semantic technologies to address interoperability problems at the product identification level, in the context of food consumption analysis.
  • Two-stage Security Controls Selection
    Publication . Yevseyeva, Iryna; Basto-Fernandes, Vitor; van Moorsel, Aad; Janicke, Helge; Emmerich, Michael
    To protect a system from potential cyber security breaches and attacks, one needs to select efficient security controls, taking into account technical and institutional goals and constraints, such as available budget, enterprise activity, and the internal and external environment. Here we model the security controls selection problem as a two-stage decision-making process: first, managers and information security officers define the size of the security budget; second, the budget is distributed between various types of security controls. By viewing loss prevention with security controls as gains measured relative to a baseline (losses without applying security controls), we formulate the decision-making process as a classical portfolio selection problem. The model treats security budget allocation as a two-objective problem, balancing risk and return, given a budget constraint. The Sharpe ratio is used to identify an optimal point on the Pareto front at which to spend the budget. At the management level, the budget size is chosen by computing the trade-offs between Sharpe ratios and budget sizes. It is shown that the proposed two-stage decision-making model can be solved by quadratic programming techniques, as demonstrated on a test case scenario with realistic data.
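The role of the Sharpe ratio in this abstract can be illustrated with a small sketch. It assumes a handful of candidate budget allocations that are simply enumerated, whereas the paper solves the problem with quadratic programming. The function names, the zero baseline and all numbers below are hypothetical.

```python
import math

def portfolio_stats(weights, mean_gain, cov):
    """Expected gain and risk (standard deviation) of one allocation."""
    n = len(weights)
    expected = sum(w * m for w, m in zip(weights, mean_gain))
    variance = sum(weights[i] * cov[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    return expected, math.sqrt(variance)

def best_by_sharpe(candidates, mean_gain, cov, baseline=0.0):
    """Return (weights, sharpe) for the candidate with the highest ratio."""
    best = None
    for w in candidates:
        expected, risk = portfolio_stats(w, mean_gain, cov)
        if risk == 0:
            continue  # ratio undefined for a risk-free allocation
        sharpe = (expected - baseline) / risk
        if best is None or sharpe > best[1]:
            best = (w, sharpe)
    return best

# Two hypothetical security controls: mean gains and covariance of gains.
mean_gain = [0.10, 0.20]
cov = [[0.01, 0.00],
       [0.00, 0.09]]
# Candidate shares of the budget spent on each control.
candidates = [(1.0, 0.0), (0.8, 0.2), (0.5, 0.5), (0.0, 1.0)]

weights, sharpe = best_by_sharpe(candidates, mean_gain, cov)
print(weights, round(sharpe, 3))
```

Repeating this selection for several budget sizes would give the Sharpe-ratio-versus-budget trade-off that the abstract assigns to the management level.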