Title: Computer Vision Model Compression Techniques for Embedded Systems: A Survey
Authors: Alexandre Lopes, Fernando Pereira dos Santos, Diulhio de Oliveira, Mauricio Schiezaro, Helio Pedrini
Computers & Graphics, vol. 123, article 104015, published 2024-07-19. DOI: 10.1016/j.cag.2024.104015.
PDF: https://www.sciencedirect.com/science/article/pii/S009784932400150X/pdfft?md5=4a61da15472973e3b8b39fed45db404f&pid=1-s2.0-S009784932400150X-main.pdf
Abstract: Deep neural networks have consistently represented the state of the art in most computer vision problems. In these scenarios, larger and more complex models have demonstrated superior performance to smaller architectures, especially when trained with plenty of representative data. With the recent adoption of Vision Transformer (ViT)-based architectures and advanced Convolutional Neural Networks (CNNs), the total number of parameters of leading backbone architectures has increased from 62M in 2012 with AlexNet to 7B in 2024 with AIM-7B. Consequently, deploying such deep architectures faces challenges in environments with processing and runtime constraints, particularly in embedded systems. This paper covers the main model compression techniques applied to computer vision tasks, enabling modern models to be used in embedded systems. We present the characteristics of the compression subareas, compare different approaches, and discuss how to choose the best technique and the variations to expect when analyzing it on various embedded devices. We also share code to assist researchers and new practitioners in overcoming initial implementation challenges for each subarea, and we present trends for model compression.

Title: SHREC 2024: Recognition of dynamic hand motions molding clay
Authors: Ben Veldhuijzen, Remco C. Veltkamp, Omar Ikne, Benjamin Allaert, Hazem Wannous, Marco Emporio, Andrea Giachetti, Joseph J. LaViola Jr., Ruiwen He, Halim Benhabiles, Adnane Cabani, Anthony Fleury, Karim Hammoudi, Konstantinos Gavalas, Christoforos Vlachos, Athanasios Papanikolaou, Ioannis Romanelis, Vlassis Fotis, Gerasimos Arvanitis, Konstantinos Moustakas, Christoph von Tycowicz
Computers & Graphics, vol. 123, article 104012, published 2024-07-14. DOI: 10.1016/j.cag.2024.104012.
PDF: https://www.sciencedirect.com/science/article/pii/S009784932400147X/pdfft?md5=d75274a315e451ba3701d800635d5155&pid=1-s2.0-S009784932400147X-main.pdf
Abstract: Gesture recognition enables novel interactions across different techniques and applications, such as Mixed Reality and Virtual Reality environments. Despite all the recent advancements in gesture recognition from skeletal data, it is still unclear how well state-of-the-art techniques perform in a scenario involving precise motions with two hands. This paper presents the results of the SHREC 2024 contest, organized to evaluate methods for recognizing highly similar hand motions using the skeletal spatial coordinate data of both hands. The task is the recognition of 7 motion classes from their frame-by-frame spatial coordinates. The skeletal data were captured using a Vicon system and pre-processed into a common coordinate system using Blender and Vicon Shogun Post. We created a small, novel dataset with a high variety of durations in frames. This paper reports the results of the contest, describing the techniques developed by the 5 participating research groups for this challenging task and comparing them to our baseline method.
{"title":"Visual narratives to edutain against misleading visualizations in healthcare","authors":"Anna Shilo, Renata G. Raidou","doi":"10.1016/j.cag.2024.104011","DOIUrl":"10.1016/j.cag.2024.104011","url":null,"abstract":"<div><p>We propose an interactive game based on visual narratives to <em>edutain</em>, i.e., to educate while entertaining, broad audiences against misleading visualizations in healthcare. Uncertainty at various stages of the visualization pipeline may give rise to misleading visual representations. These comprise misleading elements that may negatively impact the audiences by contributing to misinformed decisions, delayed treatments, and a lack of trust in medical information. We investigate whether visual narratives within the setting of an educational game support recognizing and addressing misleading elements in healthcare-related visualizations. Our methodological approach focuses on three key aspects: <em>(i)</em> identifying uncertainty types in the visualization pipeline which could serve as the origin of misleading elements, <em>(ii)</em> designing fictional visual narratives that comprise several misleading elements linking to these uncertainties, and <em>(iii)</em> proposing an interactive game that aids the communication of these misleading visualization elements to broad audiences. The game features eight fictional visual narratives built around misleading visualizations, each with specific assumptions linked to uncertainties. Players assess the correctness of these assumptions to earn points and rewards. In case of incorrect assessments, interactive explanations are provided to enhance understanding For an initial assessment of our game, we conducted a user study with 21 participants. Our study indicates that when participants incorrectly assess assumptions, they also spend more time elaborating on the reasons for their mistakes, indicating a willingness to learn more. The study also provided positive indications on game aspects such as memorability, reinforcement, and engagement, while it gave us pointers for future improvement.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104011"},"PeriodicalIF":2.5,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324001468/pdfft?md5=57b27d37f4d2906c08649b9ce6e5e5e3&pid=1-s2.0-S0097849324001468-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141690022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From coin to 3D face sculpture portraits in the round of Roman emperors","authors":"Umberto Castellani , Riccardo Bartolomioli , Giacomo Marchioro , Dario Calomino","doi":"10.1016/j.cag.2024.103999","DOIUrl":"10.1016/j.cag.2024.103999","url":null,"abstract":"<div><p>Representing historical figures on visual media has always been a crucial aspect of political communication in the ancient world, as it is in modern society. A great example comes from ancient Rome, when the emperor’s portraits were serially replicated on visual media to disseminate his image across the countries ruled by the Romans and to assert the power and authority that he embodied by making him universally recognizable. In particular, one of the most common media through which ancient Romans spread the imperial image was coinage, which showed a bi-dimensional projection of his portrait on the very low relief produced by the impression of the coin-die. In this work, we propose a new method that uses a multi-modal 2D and 3D approach to reconstruct the full portrait in the round of Roman emperors from their images adopted on ancient coins. A well-defined pipeline is introduced from the digitization of coins using 3D scanning techniques to the estimation of the 3D model of the portrait represented by a polygonal mesh. A morphable model trained on real 3D faces is exploited to infer the morphological (i.e., geometric) characteristics of the Roman emperor from the contours extracted from a coin portrait using a model fitting procedure. We present examples of face reconstruction of different emperors from coins produced in Rome as well as in the imperial provinces, which sometimes showed local variations of the official portraits centrally designed.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 103999"},"PeriodicalIF":2.5,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324001341/pdfft?md5=8a2768f574543214216168dfcdcc1d4c&pid=1-s2.0-S0097849324001341-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141703436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: From past to present: A tertiary investigation of twenty-four years of image inpainting
Authors: Iany Macedo Barcelos, Taís Bruno Rabelo, Flavia Bernardini, Rodrigo Salvador Monteiro, Leandro Augusto Frata Fernandes
Computers & Graphics, vol. 123, article 104010, published 2024-07-14. DOI: 10.1016/j.cag.2024.104010.
Abstract: Inpainting techniques, rooted in ancient art restoration practices, have become essential tools for digital image editing in modern contexts. Despite their widespread applications across diverse domains, the rapid advance of inpainting methodologies has highlighted the need for comprehensive reviews to document progress and identify areas for deeper investigation. Although there are many works in the literature describing the state of the art regarding inpainting methods, algorithms, and technologies, many of them lack methodological rigor, which compromises the reliability and validity of their conclusions. In light of the wide literature about inpainting, this tertiary review aims to systematically identify its main techniques, recurring challenges, and applications through the perspective of secondary studies, providing a helpful background for new researchers. Our findings are based on an analysis of 45 reviews, where one of the major issues observed was the lack of standardization in the classification of methods; to address this, we provide a concise and clear classification. Furthermore, we present a summary of the most commonly used metrics and a discussion of the main shortcomings and applications, which extend beyond digital image restoration to include medical imaging, three-dimensional restoration, cultural heritage preservation, and more. While inpainting poses challenges, this review aims to inspire further exploration and advancement in the field by providing a comprehensive overview of inpainting research.

Title: Computational design of custom-fit PAP masks
Authors: Yukun Lu, Yuhang Wang, Peng Song, Hang Siang Wong, Yingjuan Mok, Ligang Liu
Computers & Graphics, vol. 122, article 103998, published 2024-07-09. DOI: 10.1016/j.cag.2024.103998.
Abstract: Positive airway pressure (PAP) therapy refers to sleep disordered breathing treatment that uses a stream of compressed air to support the airway during sleep. Even though the use of PAP therapy has been shown to be effective in improving the symptoms and quality of life, many patients are intolerant of the treatment due to poor mask fit. In this paper, our goal is to develop a computational approach for designing custom-fit PAP masks such that they can achieve better mask fit performance in terms of mask leakage and comfort. Our key observation is that a custom-fit PAP mask should fit a patient’s face in its deformed state instead of in its rest state since the PAP mask cushion undergoes notable deformation before reaching an equilibrium state during PAP therapy. To this end, we compute the equilibrium state of a mask cushion using the finite element method, and quantitatively measure the leakage and comfort of the mask cushion in this state. We further optimize the mask cushion geometry to minimize the two measures while ensuring that the cushion can be easily fabricated with molding. We demonstrate the effectiveness of our computational approach on a variety of face models and different types of PAP masks. Experimental results on real subjects show that our designed custom-fit PAP masks are able to achieve better mask fit performance than a generic PAP mask and custom-fit PAP masks designed by a state-of-the-art approach.
{"title":"SketchCleanGAN: A generative network to enhance and correct query sketches for improving 3D CAD model retrieval systems","authors":"Kamalesh Kumar Kosalaraman, Prasad Pralhad Kendre, Raghwani Dhaval Manilal, Ramanathan Muthuganapathy","doi":"10.1016/j.cag.2024.104000","DOIUrl":"10.1016/j.cag.2024.104000","url":null,"abstract":"<div><p>Given an input query, a search and retrieval system fetches relevant information from a dataset. In the Engineering domain, such a system is beneficial for tasks such as design reuse. A two-dimensional (2D) sketch is more conducive for an end user to give as a query than a three-dimensional (3D) object. Such query sketches, nevertheless, will inevitably contain defects like incomplete lines, mesh lines, overdrawn areas, missing areas, etc. Since a retrieval system’s results are only as good as the query, it is necessary to improve the query sketches.</p><p>In this paper, the problem of transforming a defective CAD sketch into a defect-free sketch is addressed using Generative Adversarial Networks (GANs), which, to the best of our knowledge, has not been investigated before. We first create a dataset of 534 hand-drawn sketches by tracing the boundaries of images of CAD models. We then pair the corrected sketches with their corresponding defective sketches and use them for training a C-WGAN (Conditional Wasserstein Generative Adversarial Network), called SketchCleanGAN. We model the transformation from defective to defect-free sketch as a factorization of the defective input sketch and then translate it to the space of defect-free sketch. We propose a three-branch strategy to this problem. Ablation studies and comparisons with other state-of-the-art techniques demonstrate the efficacy of the proposed technique. Additionally, we also contribute to a dataset of around 58000 improved sketches using the proposed framework.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104000"},"PeriodicalIF":2.5,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141694929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quadratic-attraction subdivision with contraction-ratio λ=12","authors":"Kȩstutis Karčiauskas , Jörg Peters","doi":"10.1016/j.cag.2024.104001","DOIUrl":"10.1016/j.cag.2024.104001","url":null,"abstract":"<div><p>Classic generalized subdivision, such as Catmull–Clark subdivision, as well as recent subdivision algorithms for high-quality surfaces, rely on slower convergence towards extraordinary points for mesh nodes surrounded by <span><math><mrow><mi>n</mi><mo>></mo><mn>4</mn></mrow></math></span> quadrilaterals. Slow convergence corresponds to a contraction-ratio of <span><math><mrow><mi>λ</mi><mo>></mo><mn>0</mn><mo>.</mo><mn>5</mn></mrow></math></span>. To improve shape, prevent parameterization discordant with surface growth, or to improve convergence in isogeometric analysis near extraordinary points, a number of algorithms explicitly adjust <span><math><mi>λ</mi></math></span> by altering refinement rules. However, such tuning of <span><math><mi>λ</mi></math></span> has so far led to poorer surface quality, visible as uneven distribution or oscillation of highlight lines. The recent Quadratic-Attraction Subdivision (QAS) generates high-quality, bounded curvature surfaces based on a careful choice of quadratic expansion at the central point and, just like Catmull–Clark subdivision, creates the control points of the next subdivision ring by matrix multiplication. But QAS shares the contraction-ratio <span><math><mrow><msub><mrow><mi>λ</mi></mrow><mrow><mi>C</mi><mi>C</mi></mrow></msub><mo>></mo><mn>1</mn><mo>/</mo><mn>2</mn></mrow></math></span> of Catmull–Clark subdivision when <span><math><mrow><mi>n</mi><mo>></mo><mn>4</mn></mrow></math></span>. For <span><math><mrow><mi>n</mi><mo>=</mo><mn>5</mn><mo>,</mo><mo>…</mo><mo>,</mo><mn>10</mn></mrow></math></span>, QAS<span><math><msub><mrow></mrow><mrow><mo>+</mo></mrow></msub></math></span> improves the convergence to the uniform <span><math><mrow><mi>λ</mi><mo>=</mo><mfrac><mrow><mn>1</mn></mrow><mrow><mn>2</mn></mrow></mfrac></mrow></math></span> of binary domain refinement and without sacrificing surface quality compared to QAS.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104001"},"PeriodicalIF":2.5,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141690207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: Neural inpainting of folded fabrics with interactive editing
Authors: Guillaume Gisbert, Raphaëlle Chaine, David Coeurjolly
Computers & Graphics, vol. 122, article 103997, published 2024-07-08. DOI: 10.1016/j.cag.2024.103997.
Abstract: We propose a deep learning approach for inpainting holes in digital models of fabric surfaces. Leveraging the developable nature of fabric surfaces, we flatten the area surrounding the holes with minor distortion and regularly sample it to obtain a discrete 2D map of the 3D embedding, with an indicator mask outlining hole locations. This enables the use of a standard 2D convolutional neural network to inpaint holes given the 3D positioning of the surface. The proposed neural architecture includes an attention mechanism to capture long-range relationships on the surface. Finally, we provide ScarfFolds, a database of folded fabric patches with varying complexity, which is used to train our convolutional network in a supervised manner. We successfully tested our approach on various examples and illustrated that previous 3D deep learning approaches suffer from several issues when applied to fabrics. Also, our method allows users to interact with the construction of the inpainted surface. The editing is interactive and supports many tools like vertex grabbing, drape twisting, or pinching.
{"title":"SOA: Seed point offset attention for indoor 3D object detection in point clouds","authors":"Jun Shu , Shiqi Yu , Xinyi Shu , Jiewen Hu","doi":"10.1016/j.cag.2024.103992","DOIUrl":"10.1016/j.cag.2024.103992","url":null,"abstract":"<div><p>Three-dimensional object detection plays a pivotal role in scene understanding and holds significant importance in various indoor perception applications. Traditional methods based on Hough voting are susceptible to interference from background points or neighboring objects when casting votes for the target’s center from each seed point. Moreover, fixed-size set abstraction modules may result in the loss of structural information for large objects. To address these challenges, this paper proposes a three-dimensional object detection model based on seed point offset attention. The objective of this model is to enhance the model’s resilience to voting noise interference and alleviate feature loss for large-scale objects. Specifically, a seed point offset tensor is first defined, and then the offset tensor self-attention network is employed to learn the weights between votes, thereby establishing a correlation between the voting semantic features and the object structural information. Furthermore, an object surface perception module is introduced, which incorporates detailed features of local object surfaces into global feature representations through vote backtracking and surface mapping. Experimental results indicate that the model achieved excellent performance on the ScanNet-V2 ([email protected], 60.3%) and SUN RGB-D ([email protected], 64.0%) datasets, respectively improving by 2.6% ([email protected]) and 5.4% ([email protected]) compared to VoteNet.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 103992"},"PeriodicalIF":2.5,"publicationDate":"2024-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141708916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}