Browsing by Author "Cruz, Luis A. da Silva"
- 3D key-frame extraction method based on visual saliency
  Publication. Ferreira, Lino; Assunção, Pedro; Cruz, Luis A. da Silva
  This paper presents a method for key-frame extraction from 3D video using visual saliency to weight the 3D content according to a user attention model. Key-frames are found in temporal segments of arbitrary length (i.e., 3D scenes) using a dynamic programming algorithm which minimises the dissimilarity between the reconstructed and the original temporal segment. The dissimilarity measure is based on a combination of frame difference and visual relevance estimated through visual saliency maps. These maps result from attention modelling, taking into account spatial, temporal and depth features of the 3D video content. The results, evaluated using the Shot Reconstruction Degree and the Fidelity measure, show that the proposed method outperforms uniform sampling and attention-curve methods. This method may be useful for fast browsing of 3D video repositories.
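The shot-level optimisation can be illustrated with a small dynamic programme. The sketch below is a minimal stand-in, assuming an n x n matrix of pairwise frame dissimilarities that has already been weighted by the saliency maps; the function and variable names are illustrative, not the paper's.

```python
import numpy as np

def segment_cost(diss, i, j):
    """Cost of representing frames i..j (inclusive) by their best single
    key-frame: the candidate minimising total dissimilarity to the rest."""
    block = diss[i:j + 1, i:j + 1]
    return block.sum(axis=1).min()

def select_key_frames(diss, k):
    """Split a shot of n frames into k contiguous segments, one key-frame
    each, minimising total dissimilarity (an O(n^2 k) dynamic programme).
    `diss` is an n x n matrix of pairwise frame dissimilarities, assumed
    here to be pre-weighted by the visual saliency maps."""
    n = diss.shape[0]
    dp = np.full((k + 1, n), np.inf)   # dp[m][j]: best cost for frames 0..j, m segments
    cut = np.zeros((k + 1, n), dtype=int)
    for j in range(n):
        dp[1][j] = segment_cost(diss, 0, j)
    for m in range(2, k + 1):
        for j in range(m - 1, n):
            for i in range(m - 1, j + 1):      # segment m covers frames i..j
                c = dp[m - 1][i - 1] + segment_cost(diss, i, j)
                if c < dp[m][j]:
                    dp[m][j], cut[m][j] = c, i
    # Recover segment boundaries, then the best key-frame inside each segment.
    bounds, j = [], n - 1
    for m in range(k, 1, -1):
        i = cut[m][j]
        bounds.append((i, j))
        j = i - 1
    bounds.append((0, j))
    keys = []
    for i, j in reversed(bounds):
        block = diss[i:j + 1, i:j + 1]
        keys.append(i + int(block.sum(axis=1).argmin()))
    return keys
```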
- Classification-based early termination for coding tree structure decision in HEVC
  Publication. Correa, Guilherme; Assunção, Pedro; Cruz, Luis A. da Silva; Agostini, Luciano
  The High Efficiency Video Coding (HEVC) standard provides improved compression rates in comparison to its predecessors at the cost of large increases in computational complexity. An important share of such increases is due to the introduction of flexible Coding Tree structures, whose best configuration is decided through exhaustive tests in a Rate-Distortion Optimization (RDO) scheme. In this work, an early termination method for the decision of such structures was designed using classification trees obtained through Data Mining techniques. The classification trees were trained using intermediate encoding results from a training set of video sequences and implemented in the encoder to skip the full RDO-based decision. An average reduction of 37% in the HEVC encoder computational complexity was achieved when using the designed classification trees, with a negligible cost of only 0.28% in terms of Bjontegaard Delta-rate increase.
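A minimal sketch of how such a classifier can be trained offline and consulted inside the encoder, assuming scikit-learn and a hypothetical set of intermediate features (the paper's actual attributes and training protocol differ):

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical intermediate encoding features collected while a CU is
# tested (the paper derives its own attributes from encoder variables).
FEATURES = ["rd_cost", "residual_variance", "left_cu_depth", "above_cu_depth"]

def train_early_termination(samples, labels):
    """Offline step: fit a shallow classification tree on intermediate
    results from a training set of sequences. `labels` holds 1 when the
    exhaustive RDO search ultimately split the coding unit, 0 otherwise."""
    tree = DecisionTreeClassifier(max_depth=5)
    tree.fit(samples, labels)
    return tree

def should_skip_split(tree, feature_vector):
    """In-loop step: if the tree predicts 'no split', the coding-tree
    recursion stops here and the full RDO test of sub-CUs is skipped."""
    return tree.predict([feature_vector])[0] == 0
```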
- Computational Resource Management for Video Coding in Mobile Environments
  Publication. Correa, Guilherme; Assunção, Pedro; Agostini, Luciano; Cruz, Luis A. da Silva
  The increase of computational resources in mobile devices and the availability of reliable communication infrastructures provide support for acquisition, display, coding/decoding and transmission of high-resolution video in a broad set of equipment such as tablets and smartphones. Nevertheless, real-time video encoding and decoding is still a challenge in such computing environments, especially considering the amount of computational resources required by state-of-the-art video coding standards. Moreover, battery technologies have not evolved as much as desired, which makes power consumption minimization an important issue for the mobile device industry and its users. Therefore, in current mobile systems, the available computational resources along with battery life impose significant limitations on mobile real-time multimedia communications. This chapter presents an overview of the state-of-the-art research on management of computational resources for video encoding systems in mobile communications equipment. A review of the computational complexity analysis of both the H.264/AVC and HEVC video coding standards is presented, followed by a description of current methods for modelling, reducing and controlling the expenditure of computational resources in these video codecs. Finally, future trends in computational complexity management for video codecs implemented on power-constrained devices are outlined.
- Disparity compensation of light fields for improved efficiency in 4D transform-based encoders
  Publication. Santos, Joao M.; Thomaz, Lucas A.; Assuncao, Pedro A. A.; Cruz, Luis A. da Silva; Tavora, Luis M. N.; Faria, Sergio M. M. de
  Efficient light field encoders take advantage of the inherent 4D data structures to achieve high compression performance. This is accomplished by exploiting the redundancy of co-located pixels in different sub-aperture images (SAIs) through prediction and/or transform schemes to find a more compact representation of the signal. However, in image regions with higher disparity between SAIs, the performance of such schemes tends to decrease, thus reducing the compression efficiency. This paper introduces a reversible pre-processing algorithm for disparity compensation that operates on the SAI domain of light field data. The proposed method contributes to improving the transform efficiency of the encoder, since the disparity-compensated data presents higher correlation between co-located image blocks. The experimental results show significant improvements in the compression performance of 4D light fields, achieving Bjontegaard delta rate gains of about 44% on average for the MuLE codec using the 4D discrete cosine transform, when encoding High Density Camera Array (HDCA) light field images.
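As a rough illustration of a reversible SAI-domain compensation, the sketch below shifts each sub-aperture image towards the centre view by an amount proportional to its angular offset; the single global integer disparity `d` is an assumed simplification of the paper's scheme.

```python
import numpy as np

def disparity_compensate(lf, d, inverse=False):
    """Reversible pre-processing of a light field lf[u, v, y, x]: every
    sub-aperture image is cyclically shifted by d pixels per unit of
    angular offset from the centre view, so co-located blocks in
    neighbouring SAIs become better aligned before the 4D transform.
    A single global disparity `d` is an illustrative simplification."""
    nu, nv = lf.shape[:2]
    cu, cv = nu // 2, nv // 2
    sign = -1 if inverse else 1
    out = np.empty_like(lf)
    for u in range(nu):
        for v in range(nv):
            dy = sign * d * (u - cu)
            dx = sign * d * (v - cv)
            out[u, v] = np.roll(lf[u, v], shift=(dy, dx), axis=(0, 1))
    return out
```

Because the shift is cyclic and integer-valued, calling the function again with `inverse=True` restores the original light field bit-exactly, which is what makes the pre-processing step lossless.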
- Fast coding tree structure decision for HEVC based on classification trees
  Publication. Correa, Guilherme; Assunção, Pedro; Agostini, Luciano; Cruz, Luis A. da Silva
  The High Efficiency Video Coding (HEVC) standard provides improved compression rates in comparison to its predecessors at the cost of large increases in computational complexity. An important share of such increases is due to the introduction of flexible Coding Tree structures, whose best configuration is decided through exhaustive tests in a rate-distortion optimization (RDO) scheme. In this work, an early termination method for the decision of such structures was designed using classification trees obtained through Data Mining techniques. The classification trees were trained using intermediate encoding results from a set of video sequences and implemented in the encoder to avoid the full RDO-based decision. An average reduction of 37% in the HEVC encoder computational complexity was achieved when using the designed classification trees, with a negligible cost of only 0.28% in terms of Bjontegaard Delta-rate increase.
- Fast HEVC Encoding Decisions Using Data Mining
  Publication. Correa, Guilherme; Assunção, Pedro; Agostini, Luciano Volcan; Cruz, Luis A. da Silva
  The High Efficiency Video Coding standard provides an improved compression ratio in comparison with its predecessors at the cost of large increases in the encoding computational complexity. An important share of this increase is due to the new flexible partitioning structures, namely the coding trees, the prediction units, and the residual quadtrees, with the best configurations decided through an exhaustive rate-distortion optimization (RDO) process. In this paper, we propose a set of procedures for deciding whether the partition structure optimization algorithm should be terminated early or run to the end of an exhaustive search for the best configuration. The proposed schemes are based on decision trees obtained through data mining techniques. By extracting intermediate data, such as encoding variables, from a training set of video sequences, three sets of decision trees are built and implemented to avoid running the RDO algorithm to its full extent. When separately implemented, these schemes achieve average computational complexity reductions (CCRs) of up to 50% at a negligible cost of 0.56% in terms of Bjontegaard Delta (BD) rate increase. When the schemes are jointly implemented, an average CCR of up to 65% is achieved, with a small BD-rate increase of 1.36%. Extensive experiments and comparisons with similar works demonstrate that the proposed early termination schemes achieve the best rate-distortion-complexity trade-offs among all the compared works.
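One way the three separately trained tree sets might be combined in a joint scheme is sketched below; the structure names, feature dictionary and tree objects are illustrative assumptions, not the paper's implementation.

```python
def joint_early_termination(ct_tree, pu_tree, rqt_tree, feats):
    """Illustrative joint scheme: each partitioning structure (coding
    tree, prediction unit, residual quadtree) has its own tree, and the
    exhaustive RDO search for a structure only runs when its tree
    predicts that further splitting may pay off. `feats` maps each
    structure to the feature vector extracted for it mid-encoding."""
    decisions = {
        "split_ct":  ct_tree.predict([feats["ct"]])[0] == 1,
        "split_pu":  pu_tree.predict([feats["pu"]])[0] == 1,
        "split_rqt": rqt_tree.predict([feats["rqt"]])[0] == 1,
    }
    return decisions  # a False entry skips the corresponding RDO branch
```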
- Four-step algorithm for early termination in HEVC inter-frame prediction based on decision trees
  Publication. Correa, Guilherme; Assunção, Pedro; Agostini, Luciano; Cruz, Luis A. da Silva
  The flexible encoding structures of High Efficiency Video Coding (HEVC) are mainly responsible for the standard's improvements in compression efficiency in comparison to its predecessors. However, the flexibility provided by these structures is accompanied by high levels of computational complexity, since more options are considered in a Rate-Distortion (R-D) optimization scheme. In this paper, we propose a four-step early-termination method, which decides whether the inter mode decision should be halted without testing all possibilities. The method employs a set of decision trees, which are trained offline once, using information from unconstrained HEVC encoding runs. The resulting trees present a mode decision accuracy ranging from 97.6% to 99.4% with negligible computational overhead. The method is capable of achieving an average computational complexity decrease of 49% at the cost of a very small Bjontegaard Delta (BD)-rate increase (0.58%).
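The sequential flavour of a stepwise halt can be sketched as follows; the mode names, their order, and the helper functions are hypothetical placeholders for the paper's four steps.

```python
def inter_mode_decision(cu, trees, extract_features, test_mode):
    """Illustrative stepwise halting: after each inter mode is tested,
    a decision tree (trained offline on unconstrained encoder runs)
    decides whether the remaining, costlier modes can be skipped.
    `test_mode` runs the RDO test for one mode and returns the best
    result so far; `extract_features` builds that step's attribute
    vector. Mode names below are assumed, not the paper's exact list."""
    best = None
    for step, mode in enumerate(["merge_skip", "2Nx2N", "SMP", "AMP"]):
        best = test_mode(cu, mode, best)
        if trees[step].predict([extract_features(cu, best)])[0] == 1:
            break  # early termination: skip the remaining modes
    return best
```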
- A generic framework for optimal 2D/3D key-frame extraction driven by aggregated saliency maps
  Publication. Ferreira, Lino; Cruz, Luis A. da Silva; Assunção, Pedro
  This paper proposes a generic framework for the extraction of key-frames from 2D or 3D video sequences, relying on a new method to compute 3D visual saliency. The framework comprises the following novel aspects that distinguish this work from previous ones: (i) the key-frame selection process is driven by an aggregated saliency map, computed from various feature maps, which in turn correspond to different visual attention models; (ii) a method for computing aggregated saliency maps in 3D video is proposed and validated using fixation density maps, obtained from ground-truth eye-tracking data; (iii) 3D video content is processed within the same framework as 2D video, by including a depth feature map in the aggregated saliency. A dynamic programming optimisation algorithm is used to find the best set of K frames that minimises the dissimilarity error (i.e., maximises similarity) between the original video shots and those reconstructed from the key-frames. Using different performance metrics and publicly available databases, the simulation results demonstrate that the proposed framework outperforms similar state-of-the-art methods and achieves performance comparable to other, quite different approaches. Overall, the proposed framework is validated for a wide range of visual content and has the advantage of being independent of any specific visual saliency model or similarity metric.
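A minimal sketch of the aggregation step: per-model feature maps (spatial, temporal, depth) are normalised and fused into one aggregated saliency map. The equal-weight linear fusion is an assumption standing in for the paper's aggregation rule, which is validated against fixation density maps.

```python
import numpy as np

def aggregate_saliency(feature_maps, weights=None):
    """Fuse per-model feature maps (e.g., spatial, temporal, depth) into
    one aggregated saliency map. Each map is min-max normalised so no
    single attention model dominates; equal-weight linear fusion is an
    assumed placeholder for the paper's aggregation rule."""
    maps = []
    for fm in feature_maps:
        fm = fm.astype(np.float64)
        rng = fm.max() - fm.min()
        maps.append((fm - fm.min()) / rng if rng > 0 else np.zeros_like(fm))
    if weights is None:
        weights = np.ones(len(maps)) / len(maps)
    agg = sum(w * m for w, m in zip(weights, maps))
    return agg / agg.max() if agg.max() > 0 else agg
```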
- Lossless coding of light field images based on minimum-rate predictors
  Publication. Santos, João M.; Assunção, Pedro; Cruz, Luis A. da Silva; de Oliveira Pegado de Noronha E Távora, Luís Miguel; Fonseca-Pinto, Rui; Faria, Sergio
  Recent developments in light field acquisition and computational photography are driving new research efforts on light field encoding methods capable of exploiting the specific features of this type of visual data. This paper presents a research study of lossless light field image compression, using Minimum-Rate Predictors (MRP) and mainstream image and video encoders. The research is focused on three light field representation formats: lenslet images, stacks of sub-aperture images and epipolar images. The main contributions of this work are the ‘Spiral-blackend’ serialization method and the use of MRP for the lossless compression of light fields with joint encoding of RGB data. The results show that the lenslet format yields lower compression efficiency than the other formats. Furthermore, it is demonstrated that the MRP algorithm consistently outperforms HEVC-RExt, JPEG2000, JPEG-LS and CALIC when light fields are represented as either a stack of sub-aperture images or epipolar images.
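The three representation formats differ only in how the same 4D samples are arranged. The sketch below shows the idealised relationship, assuming a perfectly rectified lenslet image with nu x nv pixels behind each microlens (real captures need calibration first); names are illustrative.

```python
import numpy as np

def lenslet_to_formats(lenslet, nu, nv):
    """Rearrange an ideal lenslet image (nu x nv pixels behind each
    microlens, no vignetting or rotation) into the two other formats
    studied in the paper: a stack of sub-aperture images lf[u, v, y, x]
    and horizontal epipolar images, where epi[v, y] is an (nu, nx)
    slice through the angular u and spatial x dimensions."""
    H, W = lenslet.shape
    ny, nx = H // nu, W // nv
    # lenslet[y*nu + u, x*nv + v]  ->  lf[u, v, y, x]
    lf = lenslet.reshape(ny, nu, nx, nv).transpose(1, 3, 0, 2)
    epi = lf.transpose(1, 2, 0, 3)  # axes (v, y, u, x)
    return lf, epi
```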
- Lossless Compression of Medical Images Using 3-D Predictors
  Publication. Lucas, Luis F. R.; Rodrigues, Nuno M. M.; Cruz, Luis A. da Silva; Faria, Sergio M. M. de
  This paper describes a highly efficient method for lossless compression of volumetric sets of medical images, such as CTs or MRIs. The proposed method, referred to as 3-D-MRP, is based on the principle of minimum rate predictors (MRPs), which is one of the state-of-the-art lossless compression technologies presented in the data compression literature. The main features of the proposed method include the use of 3-D predictors, 3-D block octree partitioning and classification, volume-based optimization, and support for 16-bit-depth images. Experimental results demonstrate the efficiency of the 3-D-MRP algorithm for the compression of volumetric sets of medical images, achieving gains above 15% and 12% for 8- and 16-bit-depth contents, respectively, when compared with JPEG-LS, JPEG2000, CALIC and HEVC, as well as other proposals based on the MRP algorithm.
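A toy causal 3-D prediction in the spirit of the method, not the paper's predictor: each voxel is predicted from its left and top neighbours in the current slice plus the co-located sample in the previous slice, and only the residual would be entropy-coded. The fixed equal weights are an assumption; MRP optimises adaptive predictors per block class.

```python
import numpy as np

def causal_3d_residuals(volume):
    """Toy causal 3-D prediction for a volume vol[z, y, x] of slices:
    each voxel is predicted as the average of its left and top
    neighbours in the current slice and the co-located sample in the
    previous slice (missing neighbours treated as zero). Because every
    reference precedes the current voxel in scan order, a decoder can
    rebuild the volume exactly from the residuals, making the scheme
    lossless. Fixed equal weights are illustrative only."""
    vol = volume.astype(np.int64)
    left = np.roll(vol, 1, axis=2)
    left[:, :, 0] = 0
    top = np.roll(vol, 1, axis=1)
    top[:, 0, :] = 0
    prev = np.roll(vol, 1, axis=0)
    prev[0] = 0
    pred = (left + top + prev) // 3
    return vol - pred  # residuals to be entropy-coded
```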