EasyBDI: Near Real-Time Data Analytics over Heterogeneous Data Sources

Name: EasyBDI Near real-time data analytics over heterogeneous data sources.pdf
Size: 929.7 KB
Format: Adobe PDF

Abstract

The large volume of currently available data creates several opportunities for science and industry, especially through the application of data analytics. However, it also raises challenges that make batch-based ETL processes unfeasible. Indeed, near real-time data analytics is a requirement in several domains as an alternative to traditional data warehouses. In recent years, big data platforms have been developed to enable query execution over distributed data sources. However, they do not deal with subject-oriented analysis, do not provide data distribution transparency, or do not assist with schema mapping and integration. In this demonstration, we present EasyBDI, a near real-time big data analytics prototype that enables users to run queries over heterogeneous data sources based on global logical abstractions created by the system, and that provides some common data warehouse concepts, such as facts and dimensions. We use two motivating scenarios, one based on three years of real data on photovoltaic energy production and consumption, and the other based on the SSB+ benchmark. We will also present implementation challenges, issues, solutions, and insights.

Description

Document access link: https://openproceedings.org/2021/conf/edbt/p190.pdf
Conference date - 23 March 2021 - 26 March 2021; Conference code - 171234

Keywords

Distribution transparency; data analytics; near real-time; data warehousing

Citation

Silva, B., Moreira, J. M., & Costa, R. L. D. C. (2021). EasyBDI: Near Real-Time Data Analytics over Heterogeneous Data Sources. In EDBT (pp. 702-705). DOI: https://doi.org/10.5441/002/edbt.2021.88.
