A generic framework for optimal 2D/3D key-frame extraction driven by aggregated saliency maps

Ferreira, Lino; Cruz, Luis A. da Silva; Assunção, Pedro

Publication

2015-11 Scientific article

datacite.subject.fos	Engineering and Technology
datacite.subject.sdg	09: Industry, Innovation and Infrastructure
dc.contributor.author	Ferreira, Lino
dc.contributor.author	Cruz, Luis A. da Silva
dc.contributor.author	Assunção, Pedro
dc.date.accessioned	2025-06-04T13:59:12Z
dc.date.available	2025-06-04T13:59:12Z
dc.date.issued	2015-11
dc.description.abstract	This paper proposes a generic framework for extracting key-frames from 2D or 3D video sequences, relying on a new method to compute 3D visual saliency. The framework comprises the following novel aspects, which distinguish this work from previous ones: (i) the key-frame selection process is driven by an aggregated saliency map, computed from various feature maps, which in turn correspond to different visual attention models; (ii) a method for computing aggregated saliency maps in 3D video is proposed and validated using fixation density maps obtained from ground-truth eye-tracking data; (iii) 3D video content is processed within the same framework as 2D video, by including a depth feature map in the aggregated saliency. A dynamic programming optimisation algorithm is used to find the best set of K frames that minimises the dissimilarity error (i.e., maximises similarity) between the original video shots of size and those reconstructed from the key-frames. Using different performance metrics and publicly available databases, the simulation results demonstrate that the proposed framework outperforms similar state-of-the-art methods and achieves performance comparable to that of other, quite different approaches. Overall, the proposed framework is validated for a wide range of visual content and has the advantage of being independent of any specific visual saliency model or similarity metric.	eng
dc.description.sponsorship	This work was supported by R&D Unit UID/EEA/50008/2013, Project 3DVQM and PhD Grant SFRH/BD/37510/2007, co-funded by FEDER-PT2020, FCT/MEC, Portugal.
dc.identifier.citation	Ferreira, Lino & da Silva Cruz, Luis & Assunção, Pedro. (2015). A generic framework for optimal 2D/3D key-frame extraction driven by aggregated saliency maps. Signal Processing: Image Communication. 39. 10.1016/j.image.2015.09.005.
dc.identifier.doi	10.1016/j.image.2015.09.005
dc.identifier.issn	0923-5965
dc.identifier.uri	http://hdl.handle.net/10400.8/13107
dc.language.iso	eng
dc.peerreviewed	yes
dc.publisher	Elsevier
dc.relation	Instituto de Telecomunicações
dc.relation	SCALABLE VIDEO CODING WITH DYNAMIC REGIONS OF INTEREST
dc.relation.hasversion	https://www.sciencedirect.com/science/article/pii/S0923596515001459
dc.relation.ispartof	Signal Processing: Image Communication
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/
dc.subject	3D key-frames
dc.subject	Visual saliency map
dc.subject	3D video summary
dc.subject	Aggregated saliency map
dc.title	A generic framework for optimal 2D/3D key-frame extraction driven by aggregated saliency maps	eng
dc.type	journal article
dspace.entity.type	Publication
oaire.awardNumber	UIDP/50008/2020
oaire.awardNumber	SFRH/BD/37510/2007
oaire.awardTitle	Instituto de Telecomunicações
oaire.awardTitle	SCALABLE VIDEO CODING WITH DYNAMIC REGIONS OF INTEREST
oaire.awardURI	info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDP%2F50008%2F2020/PT
oaire.awardURI	http://hdl.handle.net/10400.8/13106
oaire.citation.endPage	110
oaire.citation.startPage	98
oaire.citation.title	Signal Processing: Image Communication
oaire.citation.volume	39, Part A
oaire.fundingStream	6817 - DCRRNI ID
oaire.fundingStream	FARH
oaire.version	http://purl.org/coar/version/c_970fb48d4fbd8a85
person.familyName	Ferreira
person.familyName	Assunção
person.givenName	Lino
person.givenName	Pedro
person.identifier.ciencia-id	061B-4DCB-78BE
person.identifier.ciencia-id	6811-3984-C17B
person.identifier.orcid	0000-0003-0648-6067
person.identifier.orcid	0000-0001-9539-8311
person.identifier.rid	AAA-4462-2020
person.identifier.rid	A-4827-2017
person.identifier.scopus-author-id	36881920000
person.identifier.scopus-author-id	6701838347
project.funder.identifier	http://doi.org/10.13039/501100001871
project.funder.name	Fundação para a Ciência e a Tecnologia
relation.isAuthorOfPublication	03a0439a-a3c2-4fb1-8239-10b98d64659a
relation.isAuthorOfPublication	25649bb9-f135-48e8-8d0f-3706b86701d3
relation.isAuthorOfPublication.latestForDiscovery	03a0439a-a3c2-4fb1-8239-10b98d64659a
relation.isProjectOfPublication	91a8e212-cbb0-462f-b533-5ed3552e8067
relation.isProjectOfPublication	c9f7d818-7311-4d3f-8f04-1c64d39a8040
relation.isProjectOfPublication.latestForDiscovery	91a8e212-cbb0-462f-b533-5ed3552e8067

Files

Main

Showing 1 - 1 of 1

Name: A generic framework for optimal 2D 3D.pdf
Size: 1.49 MB
Format: Adobe Portable Document Format

License

Showing 1 - 1 of 1

Name: license.txt
Size: 1.32 KB
Format: Item-specific license agreed upon to submission

Collections

ESTG - Articles in international journals