Publication
A Survey on Data-driven Performance Tuning for Big Data Analytics Platforms
| datacite.subject.fos | Social Sciences::Economics and Business | |
| datacite.subject.fos | Natural Sciences::Computer and Information Sciences | |
| datacite.subject.sdg | 08:Decent Work and Economic Growth | |
| datacite.subject.sdg | 09:Industry, Innovation and Infrastructure | |
| datacite.subject.sdg | 10:Reduced Inequalities | |
| dc.contributor.author | Costa, Rogério Luís de C. | |
| dc.contributor.author | Moreira, José | |
| dc.contributor.author | Pintor, Paulo | |
| dc.contributor.author | Santos, Veronica dos | |
| dc.contributor.author | Lifschitz, Sérgio | |
| dc.date.accessioned | 2026-02-20T16:48:34Z | |
| dc.date.available | 2026-02-20T16:48:34Z | |
| dc.date.issued | 2021-07-15 | |
| dc.description | Former faculty member | |
| dc.description.abstract | Many research works deal with big data platforms looking forward to data science and analytics. These are complex and usually distributed environments, composed of several systems and tools. As expected, there is a need for a closer look at performance issues. In this work, we review performance tuning strategies in the big data environment. We focus on data-driven tuning techniques, discussing the use of database inspired approaches. Concerning big data and NoSQL stores, performance tuning issues are quite different from the so-called conventional systems. Many existing solutions are mostly ad-hoc activities that do not fit for multiple situations. But there are some categories of data-driven solutions that can be taken as guidelines and incorporated into general-purpose auto-tuning modules for big data systems. We examine typical performance tuning actions, discussing available solutions to support some of the tuning process's primary activities. We also discuss recent implementations of data-driven performance tuning solutions for big data platforms. We propose an initial classification based on the domain state-of-the-art and present selected tuning actions for large-scale data processing systems. Finally, we organized existing works towards self-tuning big data systems based on this classification and presented general and system-specific tuning recommendations. We found that most of the literature pieces evaluate the use of tuning actions at the physical design perspective, and there is a lack of self-tuning machine-learning-based solutions for big data systems. | eng |
| dc.description.sponsorship | This work is partially funded by National Funds through the FCT (Foundation for Science and Technology) in the context of the projects UIDB/04524/2020 and UIDB/00127/2020, and by Fundo Europeu de Desenvolvimento Regional (FEDER), Programa Operacional Competitividade e Internacionalização in the context of the projects POCI-01-0145-FEDER-032636 and Produtech II SIF – POCI-01-0247-FEDER-024541. Some of the authors are partially supported by grants from CNPq and CAPES, Brazilian public funding agencies and research institutes. | |
| dc.identifier.citation | Rogério Luís de C. Costa, José Moreira, Paulo Pintor, Veronica dos Santos, Sérgio Lifschitz, A Survey on Data-driven Performance Tuning for Big Data Analytics Platforms, Big Data Research, Volume 25, 2021, 100206, ISSN 2214-5796, https://doi.org/10.1016/j.bdr.2021.100206. | |
| dc.identifier.doi | 10.1016/j.bdr.2021.100206 | |
| dc.identifier.eissn | 2214-580X | |
| dc.identifier.issn | 2214-5796 | |
| dc.identifier.uri | http://hdl.handle.net/10400.8/15691 | |
| dc.language.iso | eng | |
| dc.peerreviewed | yes | |
| dc.publisher | Elsevier | |
| dc.relation | Research Center in Informatics and Communications | |
| dc.relation.hasversion | https://www.sciencedirect.com/science/article/pii/S221457962100023X | |
| dc.relation.ispartof | Big Data Research | |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | Big data systems | |
| dc.subject | Big data platforms | |
| dc.subject | Performance tuning | |
| dc.subject | Database systems | |
| dc.title | A Survey on Data-driven Performance Tuning for Big Data Analytics Platforms | eng |
| dc.type | journal article | |
| dspace.entity.type | Publication | |
| oaire.awardTitle | Research Center in Informatics and Communications | |
| oaire.awardURI | info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F04524%2F2020/PT | |
| oaire.citation.endPage | 17 | |
| oaire.citation.startPage | 1 | |
| oaire.citation.title | Big Data Research | |
| oaire.citation.volume | 25 | |
| oaire.fundingStream | 6817 - DCRRNI ID | |
| oaire.version | http://purl.org/coar/version/c_970fb48d4fbd8a85 | |
| person.familyName | de Carvalho Costa | |
| person.givenName | Rogério Luís | |
| person.identifier.ciencia-id | 7717-9573-0C0F | |
| person.identifier.orcid | 0000-0003-2306-7585 | |
| person.identifier.rid | A-7940-2016 | |
| person.identifier.scopus-author-id | 7801604983 | |
| project.funder.identifier | http://doi.org/10.13039/501100001871 | |
| project.funder.name | Fundação para a Ciência e a Tecnologia | |
| relation.isAuthorOfPublication | 5654d934-3fa0-4afb-9b3b-f2736104924c | |
| relation.isAuthorOfPublication.latestForDiscovery | 5654d934-3fa0-4afb-9b3b-f2736104924c | |
| relation.isProjectOfPublication | 67435020-fe0d-4b46-be85-59ee3c6138c7 | |
| relation.isProjectOfPublication.latestForDiscovery | 67435020-fe0d-4b46-be85-59ee3c6138c7 |
Files
Main
- Name:
- A Survey on Data-driven Performance Tuning for Big Data Analytics Platforms.pdf
- Size:
- 965.67 KB
- Format:
- Adobe Portable Document Format
License
- Name:
- license.txt
- Size:
- 1.32 KB
- Format:
- Item-specific license agreed upon to submission
