Publication

Attention-driven tile splitting method for improved efficiency of omnidirectional versatile video coding

datacite.subject.fosNatural Sciences::Computer and Information Sciences
datacite.subject.sdg10:Reduced Inequalities
datacite.subject.sdg11:Sustainable Cities and Communities
dc.contributor.authorCarreira, J.
dc.contributor.authorFaria, Sergio M. M. de
dc.contributor.authorTavora, Luis M. N.
dc.contributor.authorNavarro, Antonio
dc.contributor.authorAssuncao, Pedro A.
dc.date.accessioned2026-04-20T18:05:56Z
dc.date.available2026-04-20T18:05:56Z
dc.date.issued2021-09
dc.descriptionEISBN - 978-1-6654-4115-5
dc.descriptionDate of Conference: 19-22 September 2021
dc.description.abstractA common approach used in omnidirectional video coding is based on splitting frames into tiles, allowing partial delivery of only the subset of tiles that is necessary to render the user’s current viewing region, defined as a specific viewport or Field-of-View (FoV). Since tiles can be independently encoded, such a mechanism provides a flexible solution for encoding planar representations with ultra-high definition (UHD), such as the Equirectangular Projection (ERP), using Versatile Video Coding (VVC). By selecting and transmitting only the coded data that is required to render the necessary FoV, rather than the full 360°, a great deal of bandwidth can be saved. While current solutions are based on splitting the omnidirectional video frames into tiles of equal size, this paper proposes a new approach based on adaptive tile size, driven by visual attention. Regions where the visual attention is higher are partitioned into smaller tiles to obtain higher bit rate granularity, allowing the most frequent FoVs to be decoded with minimum out-of-FoV pixels and reduced bandwidth. Optimal tile boundaries are found by solving a Lagrangian minimisation problem with a cost function that achieves the best tradeoff between the standard deviation and the average attention-weighted bit rate per tile. The experimental results show that average bit rate savings of 7.17% and 17.73% are obtained in comparison with conventional tiling methods for the commonly used FoVs of 90° × 90° and 45° × 45°, respectively.eng
dc.description.sponsorshipThis work was supported by Programa Operacional Regional do Centro, project ARoundVision CENTRO-01-0145-FEDER-030652, and by FCT/MCTES through national funds and, when applicable, co-funded by EU funds under the project UIDB/EEA/50008/2020, Portugal.
dc.identifier.citationJ. Carreira, S. M. M. de Faria, L. M. N. Tavora, A. Navarro and P. A. Assuncao, "Attention-driven tile splitting method for improved efficiency of omnidirectional versatile video coding," 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 2021, pp. 2149-2153, doi: https://doi.org/10.1109/ICIP42928.2021.9506358.
dc.identifier.doi10.1109/icip42928.2021.9506358
dc.identifier.eissn2381-8549
dc.identifier.isbn978-1-6654-3102-6
dc.identifier.isbn978-1-6654-4115-5
dc.identifier.issn1522-4880
dc.identifier.urihttp://hdl.handle.net/10400.8/16163
dc.language.isoeng
dc.peerreviewedyes
dc.publisherIEEE Canada
dc.relation.hasversionhttps://ieeexplore.ieee.org/document/9506358
dc.relation.ispartof2021 IEEE International Conference on Image Processing (ICIP)
dc.rights.uriN/A
dc.subjectOmnidirectional video coding
dc.subjectadaptive 360° streaming
dc.subjecttile-based coding
dc.titleAttention-driven tile splitting method for improved efficiency of omnidirectional versatile video codingeng
dc.typeconference paper
dspace.entity.typePublication
oaire.citation.conferenceDate2021-09
oaire.citation.conferencePlaceAnchorage, Alaska, USA
oaire.citation.endPage2153
oaire.citation.startPage2149
oaire.citation.titleProceedings - International Conference on Image Processing, ICIP
oaire.versionhttp://purl.org/coar/version/c_970fb48d4fbd8a85
person.familyNameMonteiro Carreira
person.familyNameFaria
person.familyNamede Oliveira Pegado de Noronha E Távora
person.familyNameAssunção
person.givenNameJoão Filipe
person.givenNameSergio
person.givenNameLuís Miguel
person.givenNamePedro
person.identifier.ciencia-id911C-4634-594A
person.identifier.ciencia-id8815-4101-28DD
person.identifier.ciencia-id121C-FADA-D750
person.identifier.ciencia-id6811-3984-C17B
person.identifier.gsidj3gTMAcAAAAJ
person.identifier.orcid0000-0002-5711-5875
person.identifier.orcid0000-0002-0993-9124
person.identifier.orcid0000-0002-8580-1979
person.identifier.orcid0000-0001-9539-8311
person.identifier.ridC-5245-2011
person.identifier.ridA-4827-2017
person.identifier.scopus-author-id14027853900
person.identifier.scopus-author-id6701838347
relation.isAuthorOfPublication762f8087-0f43-4df3-bc4d-db13f8ddd8c2
relation.isAuthorOfPublicationf69bd4d6-a6ef-4d20-8148-575478909661
relation.isAuthorOfPublication71940f24-f333-4ab6-abf6-00c7119a07c2
relation.isAuthorOfPublication25649bb9-f135-48e8-8d0f-3706b86701d3
relation.isAuthorOfPublication.latestForDiscovery762f8087-0f43-4df3-bc4d-db13f8ddd8c2

Files

Main
Showing 1 - 1 of 1
Name:
ATTENTION-DRIVEN TILE SPLITTING METHOD FOR IMPROVED EFFICIENCY OF OMNIDIRECTIONAL VERSATILE VIDEO CODING.pdf
Size:
1.79 MB
Format:
Adobe Portable Document Format
License
Showing 1 - 1 of 1
Name:
license.txt
Size:
1.32 KB
Format:
Item-specific license agreed upon to submission