{"title":"FQM-GC: Full-reference Quality Metric for Colored Point Cloud Based on Graph Signal Features and Color Features","authors":"Ke-Xin Zhang, G. Jiang, Mei Yu","doi":"10.1145/3469877.3490578","DOIUrl":"https://doi.org/10.1145/3469877.3490578","url":null,"abstract":"Colored Point Cloud (CPC) is often distorted in the processes of its acquisition, processing, and compression, so reliable quality assessment metrics are required to estimate the perception of distortion of CPC. We propose a Full-reference Quality Metric for colored point cloud based on Graph signal features and Color features (FQM-GC). For geometric distortion, the normal and coordinate information of the sub-clouds divided via geometric segmentation is used to construct their underlying graphs, then, the geometric structure features are extracted. For color distortion, the corresponding color statistical features are extracted from regions divided with color attribution. Meanwhile, the color features of different regions are weighted to simulate the visual masking effect. Finally, all the extracted features are formed into a feature vector to estimate the quality of CPCs. Experimental results on three databases (CPCD2.0, IRPC and SJTU-PCQA) show that the proposed metric FQM-GC is more consistent with human visual perception.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123766833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatically Generate Rigged Character from Single Image","authors":"Zhanpeng Huang, Rui Han, Jianwen Huang, Hao Yin, Zipeng Qin, Zibin Wang","doi":"10.1145/3469877.3490565","DOIUrl":"https://doi.org/10.1145/3469877.3490565","url":null,"abstract":"Animation plays an important role in virtual reality and augmented reality applications. However, it requires great effort for non-professional users to create animation assets. In this paper, we propose a systematic pipeline to generate ready-to-use characters from images for real-time animation without user intervention. Rather than per-pixel mapping or synthesis in image space using optical flow or generative models, we employ an approximate geometric embodiment to undertake 3D animation without large distortion. The geometry structure is generated from a type-agnostic character. A skeleton adaption is then adopted to guarantee semantic motion transfer to the geometry proxy. The generated character is compatible with standard 3D graphics engines and ready to use for real-time applications. Experiments show that our method works on various images (e.g. sketches, cartoons, and photos) of most object categories (e.g. humans, animals, and non-creatures). We develop an AR demo to show its potential usage for fast prototyping.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124754355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Bus Crowdedness Classification System","authors":"Lingcan Meng, Xiushan Nie, Zhifang Tan","doi":"10.1145/3469877.3493587","DOIUrl":"https://doi.org/10.1145/3469877.3493587","url":null,"abstract":"We propose an efficient bus crowdedness classification system that can be used in daily life. In particular, we analyze and study data collected from real buses, aiming to deal with the difficulty of bus congestion classification. Besides, we combine deep learning and computer vision technology to extract images or videos from the internal surveillance cameras of the bus. The crowd information is finally integrated with algorithms into a complete classification system. As a consequence, when the user enters the system and submits the image or video to be detected, the system will display the classification results in turn. The classification results include passenger density distribution, number of passengers, date, and algorithm running time. In addition, the user can use the mouse to delineate an area in the passenger density distribution map and count any image area.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127444519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fine-Grained River Ice Semantic Segmentation based on Attentive Features and Enhancing Feature Fusion","authors":"Rui Wang, Chengyu Zheng, Yanru Jiang, Zhao-Hui Wang, Min Ye, Chenglong Wang, Ning Song, Jie Nie","doi":"10.1145/3469877.3497698","DOIUrl":"https://doi.org/10.1145/3469877.3497698","url":null,"abstract":"The semantic segmentation of frazil ice and anchor ice is of great significance for river management, ship navigation, and ice hazard forecasting in cold regions. Especially, distinguishing frazil ice from sediment-carrying anchor ice can increase the estimation accuracy of the sediment transportation capacity of the river. Although river ice semantic segmentation methods based on deep learning have achieved great prediction accuracy, there is still the problem of insufficient feature extraction. To address this problem, we propose a Fine-Grained River Ice Semantic Segmentation (FGRIS) method based on attentive features and enhancing feature fusion. First, we propose a Dual-Attention Mechanism (DAM) method, which uses a combination of channel attention features and position attention features to extract more comprehensive semantic features. Then, we propose a novel Branch Feature Fusion (BFF) module to bridge the semantic feature gap between high-level semantic features and low-level semantic features, which is robust to different scales. Experiments conducted on the Alberta River Ice Segmentation Dataset demonstrate the superiority of the proposed method.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128081916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impression of a Job Interview training agent that gives rationalized feedback: Should Virtual Agent Give Advice with Rationale?","authors":"Nao Takeuchi, Tomoko Koda","doi":"10.1145/3469877.3493598","DOIUrl":"https://doi.org/10.1145/3469877.3493598","url":null,"abstract":"The COVID-19 pandemic has had a significant socio-economic impact on the world. Specifically, social distancing has impacted many activities that were previously conducted face-to-face. One of these was the training that students receive for job interviews. Thus, we developed a job interview training system that will give students the ability to continue receiving this type of training. Our system recognizes the nonverbal behaviors of an interviewee, namely gaze, facial expression, and posture, and compares the recognition results with those of models of exemplary nonverbal behaviors of an interviewee. A virtual agent acting as an advisor gives feedback on the interviewee's behaviors that need improvement. In order to verify the effectiveness of the two kinds of feedback, namely, rationalized feedback (with quantitative recognition results) vs. non-rationalized feedback, we compared interviewees’ impressions. The results of the evaluation experiment indicated that the virtual agent with rationalized feedback was rated as more reliable but less friendly than the one with non-rationalized feedback.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126690738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spherical Image Compression Using Spherical Wavelet Transform","authors":"Huan Wang, Yunhui Shi, Jin Wang, Gang Wu, N. Ling, Baocai Yin","doi":"10.1145/3469877.3490577","DOIUrl":"https://doi.org/10.1145/3469877.3490577","url":null,"abstract":"The Spherical Measure Based Spherical Image Representation (SMSIR) has nearly uniformly distributed pixels in the spherical domain with effective index schemes. Based on SMSIR, the spherical wavelet transform can be efficiently designed, which can capture the spherical geometry feature in a compact manner and provides a powerful tool for spherical image compression. In this paper, we propose an efficient compression scheme for SMSIR images named Spherical Set Partitioning in Hierarchical Trees (S-SPIHT) using the spherical wavelet transform, which exploits the inherent similarities across the subbands in the spherical wavelet decomposition of an SMSIR image. The proposed S-SPIHT can progressively transform spherical wavelet coefficients into a bit-stream, and generate an embedded compressed bit-stream that can be efficiently decoded at several spherical image quality levels. The most crucial part of our proposed S-SPIHT is the redesign of scanning the wavelet coefficients corresponding to different index schemes. We design three scanning methods, namely ordered root tree index scanning (ORTIS), dyadic index progressive scanning (DIPS) and dyadic index cross scanning (DICS), to efficiently reorganize the wavelet coefficients. These methods can effectively exploit the self-similarity between sub-bands and the fact that the high-frequency sub-bands mostly contain insignificant coefficients. Experimental results on widely-used datasets demonstrate that our proposed S-SPIHT outperforms the straightforward SPIHT for SMSIR images in terms of PSNR, S-PSNR and SSIM.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130236949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attention-based Dual-Branches Localization Network for Weakly Supervised Object Localization","authors":"Wenjun Hui, Chuangchuang Tan, Guanghua Gu","doi":"10.1145/3469877.3490568","DOIUrl":"https://doi.org/10.1145/3469877.3490568","url":null,"abstract":"Weakly supervised object localization exploits the last convolutional feature maps of a classification model and the weights of the Fully-Connected (FC) layer to achieve localization. However, high-level feature maps for localization lack edge features. Additionally, the weights are specific to the classification task, causing only discriminative regions to be discovered. In order to fuse edge features and adjust the attention distribution for feature map channels, we propose an efficient method called Attention-based Dual-Branches Localization (ADBL) Network, in which a dual-branches structure and an attention mechanism are adopted to mine edge features and non-discriminative features for locating more target areas. Specifically, the dual-branches structure cascades low-level feature maps to mine target object edge regions. Additionally, during the inference stage, the attention mechanism assigns appropriate attention to different features to preserve non-discriminative areas. Extensive experiments on both ILSVRC and CUB-200-2011 datasets show that the ADBL method achieves substantial performance improvements.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131001813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SangeetXML: An XML Format for Score Retrieval for Indic Music","authors":"Chandan Misra","doi":"10.1145/3469877.3493697","DOIUrl":"https://doi.org/10.1145/3469877.3493697","url":null,"abstract":"Efficient retrieval of score information from a large set of XML-encoded scores and lyrics in an XML database requires such music data to be stored in a well-structured and systematic manner. Current search engines for Indic music (Tagore songs in the present context) retrieve only metadata and lack score and lyric retrieval schemes. Being vastly different from its western counterpart, an Indic music piece is required to be encoded in a different way than the XML format used for western music like MusicXML. Such encoding requires a proper understanding of the structure of the music sheet and its careful implementation in XML. In this paper, we propose the development of an XML-based format, SangeetXML, for exchanging and retrieving Indic music information from a theoretical 2D matrix model Swaralipi. We implement SangeetXML by formatting a sample of Rabindra Sangeet (read Tagore Songs in English) compositions and highlight the feasibility of an easy and quick retrieval system based on SangeetXML through XQuery, the de-facto standard for querying XML-encoded data.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132379071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Discriminative Visual Search via Semantically Cycle-consistent Hashing Networks","authors":"Zheng Zhang, Jianning Wang, Guangming Lu","doi":"10.1145/3469877.3490583","DOIUrl":"https://doi.org/10.1145/3469877.3490583","url":null,"abstract":"Deep hashing has shown great potential in large-scale visual similarity search due to preferable storage and computation efficiency. Typically, deep hashing encodes visual features into compact binary codes by preserving representative semantic visual features. Works in this area mainly focus on building the relationship between the visual and objective hash space, while they seldom study the triadic cross-domain semantic knowledge transfer among visual, semantic and hashing spaces, leading to a serious semantic ignorance problem during space transformation. In this paper, we propose a novel deep tripartite semantically interactive hashing framework, dubbed Semantically Cycle-consistent Hashing Networks (SCHN), for discriminative hash code learning. Particularly, we construct a flexible semantic space and a transitive latent space, in conjunction with the visual space, to jointly deduce the privileged discriminative hash space. Specifically, a semantic space is conceived to strengthen the flexibility and completeness of categories in feature inference. Moreover, a transitive latent space is formulated to explore the shared semantic interactivity embedded in visual and semantic features. Our SCHN, for the first time, establishes the cyclic principle of deep semantic-preserving hashing by adaptive semantic parsing across different spaces in visual similarity search. In addition, the entire learning framework is jointly optimized in an end-to-end manner. Extensive experiments performed on diverse large-scale datasets evidence the superiority of our method against other state-of-the-art deep hashing algorithms.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121589772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Reinforcement Learning-Based Reward Mechanism for Molecule Generation that Introduces Activity Information","authors":"Hao Liu, Jinmeng Yan, Yuandong Zhou","doi":"10.1145/3469877.3497700","DOIUrl":"https://doi.org/10.1145/3469877.3497700","url":null,"abstract":"In this paper, we propose an activity prediction method for molecule generation based on the framework of reinforcement learning. The method is used as a scoring module for the molecule generation process. By introducing information about known active molecules for a specific set of target conformations, it overcomes the traditional molecular optimization strategy where the method only uses computable properties. Eventually, our prediction method improves the quality of the generated molecules. The prediction method utilizes fusion features that consist of traditional countable properties of molecules such as atomic number and the binding property of the molecule to the target. Furthermore, this paper designs an ultra-large-scale molecular docking parallel computing method, which greatly improves the performance of the molecular docking [1] scoring process. The computing method makes high-quality docking computation for predicting molecular activity possible. The final experimental result shows that the molecule generation model using the prediction method can produce nearly twenty percent active molecules, which shows that the method proposed in this paper can effectively improve the performance of molecule generation.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115944114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}