Browsing by Author "Rodrigues, Nuno M. M."
Now showing 1–10 of 21
- Adaptive bridge model for compressed domain point cloud classification
  Seleem, Abdelrahman; Guarda, André F. R.; Rodrigues, Nuno M. M.; Pereira, Fernando
  The recent adoption of deep learning-based models for the processing and coding of multimedia signals has brought noticeable gains in performance, establishing deep learning-based solutions as the uncontested state of the art both for computer vision tasks, targeting machine consumption, and, more recently, for coding applications, targeting human visualization. Traditionally, applications requiring both coding and computer vision processing must first decode the bitstream and then apply the computer vision methods to the decompressed multimedia signals. However, the adoption of deep learning-based solutions enables compressed domain computer vision processing, with gains in performance and computational complexity over the decompressed domain approach. For point clouds (PCs), these gains have been demonstrated in the single available compressed domain computer vision processing solution, named Compressed Domain PC Classifier, which processes JPEG Pleno PC coding (PCC) compressed streams using a PC classifier largely compatible with the state-of-the-art spatial domain PointGrid classifier. However, the available Compressed Domain PC Classifier has a strong limitation: it imposes a single, specific input size associated with specific JPEG Pleno PCC configurations. This limits the compression performance, as these configurations are not ideal for all PCs, whose characteristics, notably density, differ. To overcome these limitations, this paper proposes the first Adaptive Compressed Domain PC Classifier, which includes a novel adaptive bridge model able to process JPEG Pleno PCC bitstreams encoded with different coding configurations, thus maximizing compression efficiency. Experimental results show that the novel Adaptive Compressed Domain PC Classifier allows JPEG PCC to achieve better compression performance by not imposing a single, specific coding configuration on all PCs, regardless of their different characteristics. Moreover, the added adaptability achieves slightly better PC classification performance than the previous Compressed Domain PC Classifier, and substantially better PC classification performance (with fewer weights) than the PointGrid PC classifier working in the decompressed domain.
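The role of the bridge model above is to adapt compressed-domain latents of varying sizes, produced by different JPEG Pleno PCC configurations, to the fixed input size the classifier expects. A minimal numpy sketch of that shape-adaptation role, using simple adaptive average pooling as a stand-in for the learned bridge (the actual bridge model is trained; the function name and pooling choice here are illustrative assumptions, not the paper's method):

```python
import numpy as np

def adaptive_avg_pool3d(latent, out_shape):
    """Average-pool a 3D latent grid of arbitrary size down to a fixed
    output shape. Assumes out_shape is no larger than latent.shape
    along each axis."""
    # Partition each axis into out_shape[d] nearly equal index groups.
    bins = [np.array_split(np.arange(s), o)
            for s, o in zip(latent.shape, out_shape)]
    out = np.empty(out_shape)
    for a, ia in enumerate(bins[0]):
        for b, ib in enumerate(bins[1]):
            for c, ic in enumerate(bins[2]):
                out[a, b, c] = latent[np.ix_(ia, ib, ic)].mean()
    return out

# Latents from two different coding configurations, one fixed classifier input:
fixed_a = adaptive_avg_pool3d(np.random.rand(16, 16, 16), (8, 8, 8))
fixed_b = adaptive_avg_pool3d(np.random.rand(24, 24, 24), (8, 8, 8))
```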
- Adaptive Deep Learning-Based Point Cloud Geometry Coding
  Guarda, André F. R.; Rodrigues, Nuno M. M.; Pereira, Fernando
  Point clouds are a very rich 3D visual representation model, which has become increasingly appealing for multimedia applications with immersion, interaction and realism requirements. Due to different acquisition and creation conditions, as well as target applications, point clouds’ characteristics may be very diverse, notably in their density. While geographical information systems or autonomous driving applications may use rather sparse point clouds, cultural heritage or virtual reality applications typically use denser point clouds to more accurately represent objects and people. Naturally, to offer immersion and realism, point clouds need a rather large number of points, thus calling for the development of efficient coding solutions. The use of deep learning models for coding purposes has recently gained relevance, with the latest developments in image coding achieving state-of-the-art performance, making the adoption of this technology natural for point cloud coding as well. This paper presents a novel deep learning-based solution for point cloud geometry coding which is able to efficiently adapt to the content’s characteristics. The proposed coding solution divides the point cloud into 3D blocks and selects the most suitable available deep learning coding model to code each block, thus maximizing the compression performance. In comparison with the state-of-the-art MPEG G-PCC Trisoup standard, the proposed coding solution offers average quality gains of up to 4.9 dB and 5.7 dB for PSNR D1 and PSNR D2, respectively.
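A minimal sketch of the per-block model selection step described above, assuming a pool of pretrained codec models exposing hypothetical `encode()`/`decode()` methods; the Lagrangian rate-distortion criterion is a standard illustrative choice, not necessarily the paper's exact selection rule:

```python
import numpy as np

def select_model(block, models, lmbda=0.01):
    """Pick, for one 3D block, the codec model with the lowest
    Lagrangian cost J = D + lambda * R. `models` holds objects with
    hypothetical encode()/decode() methods returning bytes / arrays."""
    best = None
    for m in models:
        bitstream = m.encode(block)           # hypothetical API
        recon = m.decode(bitstream)
        d = np.mean((block - recon) ** 2)     # MSE as distortion proxy
        j = d + lmbda * len(bitstream) * 8    # rate in bits (bytes * 8)
        if best is None or j < best[0]:
            best = (j, m, bitstream)
    _, model, bitstream = best
    return model, bitstream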
- Compression of medical images using MRP with bi-directional prediction and histogram packing
  Santos, João M.; Guarda, André F. R.; Cruz, Luís A. da Silva; Rodrigues, Nuno M. M.; Faria, Sérgio M. M. de
  Medical imaging technology has become essential to the improvement of medical practice. This has led to advances in the technology itself, namely in image sampling resolutions, pixel bit-depth and inter-slice resolution. Additionally, the common use of medical images, the life expectancy of patients and legal restrictions have led to increasing storage costs. Efficient compression of medical image data is therefore in high demand, for both archiving and transmission. In this work we propose to improve the compression efficiency of the Minimum Rate Predictors (MRP) lossless encoder by adding bi-directional prediction support and a histogram packing technique. The results show that the proposed method achieves higher compression efficiency than the state-of-the-art HEVC encoder: compression efficiency is improved by 20% on average when compared to HEVC, and by 46.1% when compared with the original MRP algorithm.
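Histogram packing exploits the fact that high bit-depth medical images typically use only a sparse subset of their nominal intensity range: remapping the values actually present onto consecutive integers shrinks the residual alphabet before prediction, and only the mapping table needs to be sent as side information. A minimal sketch of the general technique (not the paper's specific integration into MRP):

```python
import numpy as np

def histogram_pack(img):
    """Remap the sparse set of occurring sample values onto 0..K-1.
    Returns the packed image plus the value table needed to invert it."""
    values = np.unique(img)                  # sorted values actually present
    packed = np.searchsorted(values, img)    # index of each sample in the table
    return packed, values                    # `values` is the side information

def histogram_unpack(packed, values):
    """Decoder-side inverse: look each index back up in the value table."""
    return values[packed]

# 16-bit image using only a handful of levels packs into a tiny range:
img = np.array([[0, 4096, 4096], [12288, 0, 65535]], dtype=np.uint16)
packed, table = histogram_pack(img)          # packed values are 0..3
assert np.array_equal(histogram_unpack(packed, table), img)
```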
- Compression of touchless multiview fingerprints
  Francisco, Nelson C.; Zaghetto, Alexandre; Macchiavello, Bruno; Silva, Eduardo A. B. da; Lima-Marques, Mamede; Rodrigues, Nuno M. M.; Faria, Sérgio
  Recently, touchless multiview fingerprinting technology has been proposed as an alternative to overcome the intrinsic problems of traditional contact-based systems. Nevertheless, compression of this kind of signal has not been fully evaluated and standardized. This paper investigates the comparative performance of several encoders for this data, namely WSQ, JPEG2000, H.264/AVC and MMP. Experimental results show that the WSQ encoder, the current compression standard for contact-based fingerprints, is objectively outperformed by all the others. In particular, MMP, which achieved the best results, outperforms WSQ by up to 4 dB.
- Constant Size Point Cloud Clustering: a Compact, Non-Overlapping Solution
  Guarda, André F. R.; Rodrigues, Nuno M. M.; Pereira, Fernando
  Point clouds have recently become a popular 3D representation model for many application domains, notably virtual and augmented reality. Since point cloud data is often very large, processing a point cloud may require that it be segmented into smaller clusters. For example, the input to deep learning-based methods like auto-encoders should be constant size point cloud clusters, which are ideally compact and non-overlapping. However, given the unorganized nature of point clouds, defining the specific data segments to code is not always trivial. This paper proposes a point cloud clustering algorithm which targets five main goals: i) clusters with a constant number of points; ii) compact clusters, i.e. with low dispersion; iii) non-overlapping clusters, i.e. not intersecting each other; iv) ability to scale with the number of points; and v) low complexity. After appropriate initialization, the proposed algorithm transfers points between neighboring clusters as a propagation wave, filling or emptying clusters until they all reach the same size. The proposed algorithm is unique in that no other point cloud clustering method in the literature offers the same clustering features for large point clouds at such low complexity.
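A toy sketch of the point-transfer idea described above: over-full clusters hand their points that lie closest to an under-full cluster's centroid over to that cluster, until all clusters hold the same number of points. This moves one point per iteration between the extreme clusters only; the paper's algorithm instead propagates transfers between neighboring clusters as a wave and is far more efficient, so treat this purely as an illustration:

```python
import numpy as np

def balance_clusters(points, labels, k, max_iters=100_000):
    """Rebalance an initial labelling so every cluster ends with
    N // k points (assumes N divisible by k and non-empty clusters)."""
    target = points.shape[0] // k
    for _ in range(max_iters):
        sizes = np.bincount(labels, minlength=k)
        if sizes.max() <= target:            # all clusters at target size
            break
        donor = int(np.argmax(sizes))        # most over-full cluster
        recv = int(np.argmin(sizes))         # most under-full cluster
        centroid = points[labels == recv].mean(axis=0)
        idx = np.flatnonzero(labels == donor)
        # Transfer the donor point nearest to the receiver's centroid,
        # keeping clusters as compact as possible.
        d = np.linalg.norm(points[idx] - centroid, axis=1)
        labels[idx[np.argmin(d)]] = recv
    return labels
```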
- Deep Learning-based Point Cloud Geometry Coding with Resolution Scalability
  Guarda, André F. R.; Rodrigues, Nuno M. M.; Pereira, Fernando
  Point clouds are a 3D visual representation format that has recently become fundamentally important for immersive and interactive multimedia applications. Considering the high number of points in practically relevant point clouds, and their increasing market demand, efficient point cloud coding has become a vital research topic. In addition, scalability is an important feature for point cloud coding, especially for real-time applications, where fast and rate-efficient access to a decoded point cloud is important; this issue, however, is still rather unexplored in the literature. In this context, this paper proposes a novel deep learning-based point cloud geometry coding solution with resolution scalability via interlaced sub-sampling. As additional layers are decoded, the number of points in the reconstructed point cloud increases, as does the overall quality. Experimental results show that the proposed scalable point cloud geometry coding solution outperforms the recent MPEG Geometry-based Point Cloud Compression standard, which is much less scalable.
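One plausible reading of interlaced sub-sampling, sketched below as an assumption rather than the paper's exact scheme: integer voxel coordinates are split into interleaved subsets by a modulo rule, so decoding layers 0..l reconstructs a progressively denser point cloud:

```python
import numpy as np

def interlace_layers(coords, num_layers=4):
    """Split an (N, 3) array of integer voxel coordinates into
    `num_layers` interlaced subsets (illustrative modulo assignment)."""
    key = (coords[:, 0] + coords[:, 1] + coords[:, 2]) % num_layers
    return [coords[key == l] for l in range(num_layers)]

# Decoding the first l layers yields a coarser, but complete, cloud:
coords = np.random.randint(0, 64, size=(1000, 3))
layers = interlace_layers(coords)
partial = np.concatenate(layers[:2])   # roughly half-density reconstruction
```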
- Deep Learning-Based Point Cloud Geometry Coding: RD Control Through Implicit and Explicit Quantization
  Guarda, André F. R.; Rodrigues, Nuno M. M.; Pereira, Fernando
  Deep learning is becoming increasingly relevant for multiple multimedia processing tasks, and it has lately raised much interest in the coding arena, notably for images and point clouds. While offering near state-of-the-art compression performance, current deep learning-based point cloud coding solutions have a shortcoming: they require training and storing multiple models in order to obtain different rate-distortion trade-offs. This paper proposes a solution that effectively reduces the number of deep learning models that need to be trained and stored, by applying explicit quantization to the latent representation, which can be controlled at coding time to generate varying rate-distortion trade-offs. The proposed implicit-explicit quantization combination achieves compression performance equivalent to or better than the alternative, while significantly reducing the model storage memory requirements.
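The explicit quantization idea is that a single trained model can serve several rate-distortion points simply by varying the quantization step applied to its latents at coding time. A minimal sketch of uniform latent quantization (the general mechanism, not the paper's full implicit-explicit combination):

```python
import numpy as np

def quantize_latents(y, step):
    """Explicit uniform quantization of a latent tensor; a larger step
    lowers the rate at the cost of higher distortion, so one model can
    cover several rate-distortion points."""
    return np.rint(y / step).astype(np.int32)

def dequantize_latents(q, step):
    """Decoder-side inverse: scale the integer indices back to floats."""
    return q.astype(np.float32) * step

latents = np.random.randn(64, 8, 8, 8).astype(np.float32)
for step in (0.25, 0.5, 1.0):                     # chosen at coding time
    y_hat = dequantize_latents(quantize_latents(latents, step), step)
```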
- Efficient depth map coding using linear residue approximation and a flexible prediction framework
  Lucas, Luís F. R.; Rodrigues, Nuno M. M.; Pagliari, Carla L.; Silva, Eduardo A. B. da; Faria, Sérgio M. M. de
  The motivation to develop more efficient 3D and multiview data representation algorithms stems from the recent market growth of 3D video equipment and associated services. One of the most investigated formats is video+depth, which uses depth image based rendering (DIBR) to combine texture and depth information in order to create an arbitrary number of views at the decoder. Such an approach requires that depth information be accurately encoded. However, the methods usually employed to encode texture do not seem suitable for depth map coding. In this paper we propose a novel depth map coding algorithm based on the assumption that depth images are piecewise-linear smooth signals. The algorithm is designed to encode sharp edges using a flexible dyadic block segmentation and hierarchical intra-prediction framework. The residual signal from this operation is aggregated into blocks which are approximated using linear modeling functions. Furthermore, the proposed algorithm uses a dictionary that increases the coding efficiency for previously used approximations. Experimental results for depth map coding show that views synthesized using the depth maps encoded by the proposed algorithm present higher PSNR than their counterparts, demonstrating the method’s efficiency.
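Since depth maps are assumed piecewise-linear, a residual block can be approximated by a plane, leaving only three coefficients to encode. A minimal least-squares sketch of such a linear modeling function (a generic stand-in for the paper's approximation stage):

```python
import numpy as np

def fit_linear_residue(block):
    """Least-squares fit of a planar model r(x, y) ~ a*x + b*y + c to a
    residual block; only the coefficients (a, b, c) would be encoded."""
    h, w = block.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Design matrix: one row [x, y, 1] per pixel.
    A = np.column_stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    coef, *_ = np.linalg.lstsq(A, block.ravel(), rcond=None)
    approx = (A @ coef).reshape(h, w)
    return coef, approx
```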
- A generic post-deblocking filter for block based image compression algorithms
  Francisco, Nelson C.; Rodrigues, Nuno M. M.; Silva, Eduardo A. B. da; Faria, Sérgio M. M. de
  In this paper we propose a new post-processing deblocking technique that is independent of the compression method used to encode the image. The development of this filter was motivated by the use of the Multidimensional Multiscale Parser (MMP) algorithm, a generic lossy and lossless compression method. Since MMP employs an adaptive block size, it presents some impairments when used with the deblocking techniques described in the literature. This led us to develop a new, more generic deblocking method, based on total variation and adaptive bilateral filtering. The proposed method was evaluated not only on still images but also on video sequences, encoded using pattern-matching and transform-based compression methods. In all cases, both the objective and subjective quality of the reconstructed images were improved, showing the versatility of the proposed technique.
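For reference, a plain (non-adaptive) bilateral filter, the building block behind the adaptive variant mentioned above: each pixel becomes a weighted mean of its neighborhood, with weights decaying in both spatial distance and intensity difference, so blocking edges are smoothed while genuine edges survive. The parameter values below are illustrative:

```python
import numpy as np

def bilateral_filter(img, radius=2, sigma_s=1.5, sigma_r=10.0):
    """Naive O(N * window) bilateral filter on a 2D grayscale image."""
    h, w = img.shape
    out = np.zeros((h, w))
    pad = np.pad(img.astype(np.float64), radius, mode='edge')
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    # Spatial kernel is fixed; the range kernel depends on each pixel.
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma_s ** 2))
    for i in range(h):
        for j in range(w):
            win = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            rng = np.exp(-((win - img[i, j]) ** 2) / (2 * sigma_r ** 2))
            wgt = spatial * rng
            out[i, j] = np.sum(wgt * win) / np.sum(wgt)
    return out
```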
- Improving multiscale recurrent pattern image coding with least-squares prediction mode
  Graziosi, Danillo B.; Rodrigues, Nuno M. M.; Silva, Eduardo A. B. da; Faria, Sérgio M. M. de; Carvalho, Murilo B. de
  The Multidimensional Multiscale Parser (MMP) image coding algorithm, when combined with flexible partitioning and predictive coding techniques (MMP-FP), provides state-of-the-art performance. In this paper we investigate the use of adaptive least-squares prediction in MMP. The linear prediction coefficients implicitly embed the local texture characteristics, and are computed from a block’s causal neighborhood (composed of already reconstructed data). Thus, the intra prediction mode is adaptively adjusted according to the local context, and no extra overhead is needed to signal the coefficients. We add this new context-adaptive linear prediction mode to the other MMP prediction modes, which are based on those used in H.264/AVC; the best mode is chosen through rate-distortion optimization. Simulation results show that least-squares prediction is able to significantly increase MMP-FP’s rate-distortion performance for smooth images, leading to better results than those of state-of-the-art transform-based methods. Moreover, with the addition of least-squares prediction, MMP-FP presents no performance loss when used to encode non-smooth images, such as text and graphics.
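A minimal sketch of context-adaptive least-squares prediction as described above: the coefficients of a small causal predictor are fitted on already-reconstructed pixels inside a causal training window, so the decoder derives identical coefficients and no side information is needed. This is a generic pixel-wise illustration, not MMP-FP's block-level mode; window size and the 3-tap context are illustrative assumptions:

```python
import numpy as np

def ls_predict_pixel(rec, i, j, win=6):
    """Predict rec[i, j] from its left, top and top-left neighbours with
    least-squares coefficients trained on the causal window. Assumes
    i, j >= 1 and enough reconstructed samples in the window."""
    ctx, tgt = [], []
    for y in range(max(1, i - win), i + 1):
        for x in range(max(1, j - win), j + 1):
            if y == i and x >= j:          # keep the training set strictly causal
                break
            ctx.append([rec[y, x - 1], rec[y - 1, x], rec[y - 1, x - 1]])
            tgt.append(rec[y, x])
    A = np.asarray(ctx, dtype=float)
    b = np.asarray(tgt, dtype=float)
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)   # fit the 3-tap predictor
    neigh = np.array([rec[i, j - 1], rec[i - 1, j], rec[i - 1, j - 1]],
                     dtype=float)
    return float(neigh @ coef)
```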
