ACM Multimedia Asia: Latest Publications

Head-Motion-Aware Viewport Margins for Improving User Experience in Immersive Video
ACM Multimedia Asia Pub Date: 2021-12-01 DOI: 10.1145/3469877.3490573
Mehmet N. Akcay, Burak Kara, Saba Ahsan, A. Begen, I. Curcio, Emre B. Aksu
{"title":"Head-Motion-Aware Viewport Margins for Improving User Experience in Immersive Video","authors":"Mehmet N. Akcay, Burak Kara, Saba Ahsan, A. Begen, I. Curcio, Emre B. Aksu","doi":"10.1145/3469877.3490573","DOIUrl":"https://doi.org/10.1145/3469877.3490573","url":null,"abstract":"Viewport-dependent delivery (VDD) is a technique to save network resources during the transmission of immersive videos. However, it results in a non-zero motion-to-high-quality delay (MTHQD), which is the delta time from the moment where the current viewport has at least one low-quality tile to when all the tiles in the new viewport are rendered in high quality. MTHQD is an important metric in the evaluation of the VDD systems. This paper improves an earlier concept called viewport margins by introducing head-motion awareness. The primary benefit of this improvement is the reduction (up to 64%) in the average MTHQD.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"142 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121361126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
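The MTHQD metric the paper optimizes is straightforward to compute from playback logs. Below is a minimal sketch; the event-log layout (pairs of low-quality-onset and all-high-quality timestamps) is an illustrative assumption, not the paper's actual data structure.

```python
def mthqd_average(events):
    """Average motion-to-high-quality delay (MTHQD).

    Each event is a pair (t_low, t_high): the moment the current viewport
    first contains at least one low-quality tile, and the moment every tile
    of the new viewport is rendered in high quality.
    """
    return sum(t_high - t_low for t_low, t_high in events) / len(events)

# Three head motions; a head-motion-aware margin that pre-fetches tiles
# around the viewport would shrink these deltas.
print(mthqd_average([(0.0, 0.8), (2.1, 2.5), (5.0, 5.3)]))  # -> 0.5
```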
Learning to Decompose and Restore Low-light Images with Wavelet Transform
ACM Multimedia Asia Pub Date: 2021-12-01 DOI: 10.1145/3469877.3490622
Pengju Zhang, Chaofan Zhang, Zheng Rong, Yihong Wu
{"title":"Learning to Decompose and Restore Low-light Images with Wavelet Transform","authors":"Pengju Zhang, Chaofan Zhang, Zheng Rong, Yihong Wu","doi":"10.1145/3469877.3490622","DOIUrl":"https://doi.org/10.1145/3469877.3490622","url":null,"abstract":"Low-light images often suffer from low visibility and various noise. Most existing low-light image enhancement methods often amplify noise when enhancing low-light images, due to the neglect of separating valuable image information and noise. In this paper, we propose a novel wavelet-based attention network, where wavelet transform is integrated into attention learning for joint low-light enhancement and noise suppression. Particularly, the proposed wavelet-based attention network includes a Decomposition-Net, an Enhancement-Net and a Restoration-Net. In Decomposition-Net, to benefit denoising, wavelet transform layers are designed for separating noise and global content information into different frequency features. Furthermore, an attention-based strategy is introduced to progressively select suitable frequency features for accurately restoring illumination and reflectance according to Retinex theory. In addition, Enhancement-Net is introduced for further removing degradations in reflectance and adjusting illumination, while Restoration-Net employs conditional adversarial learning to adversarially improve the visual quality of final restored results based on enhanced illumination and reflectance. Extensive experiments on several public datasets demonstrate that the proposed method achieves more pleasing results than state-of-the-art methods.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"55 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132090962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
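The abstract does not spell out the wavelet transform layers, but such layers typically build on a single-level 2D Haar decomposition, which splits a feature map into one low-frequency and three high-frequency sub-bands. A minimal PyTorch sketch of that standard split, not the paper's exact layer:

```python
import torch

def haar_dwt(x):
    """Single-level 2D Haar transform: splits a feature map (B, C, H, W)
    into a low-frequency band (LL) and three high-frequency bands."""
    a = x[:, :, 0::2, 0::2]  # top-left of each 2x2 block
    b = x[:, :, 0::2, 1::2]  # top-right
    c = x[:, :, 1::2, 0::2]  # bottom-left
    d = x[:, :, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2  # global content (low frequency)
    lh = (a + b - c - d) / 2  # horizontal detail
    hl = (a - b + c - d) / 2  # vertical detail
    hh = (a - b - c + d) / 2  # diagonal detail, where noise concentrates
    return ll, (lh, hl, hh)

x = torch.randn(1, 3, 8, 8)
ll, highs = haar_dwt(x)
print(ll.shape)  # torch.Size([1, 3, 4, 4])
```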
Hard-Boundary Attention Network for Nuclei Instance Segmentation
ACM Multimedia Asia Pub Date: 2021-12-01 DOI: 10.1145/3469877.3490602
Yalu Cheng, Pengchong Qiao, Hong-Ju He, Guoli Song, Jie Chen
{"title":"Hard-Boundary Attention Network for Nuclei Instance Segmentation","authors":"Yalu Cheng, Pengchong Qiao, Hong-Ju He, Guoli Song, Jie Chen","doi":"10.1145/3469877.3490602","DOIUrl":"https://doi.org/10.1145/3469877.3490602","url":null,"abstract":"Image segmentation plays an important role in medical image analysis, and accurate segmentation of nuclei is especially crucial to clinical diagnosis. However, existing methods fail to segment dense nuclei due to the hard-boundary which has similar texture to nuclear inside. To this end, we propose a Hard-Boundary Attention Network (HBANet) for nuclei instance segmentation. Specifically, we propose a Background Weaken Module (BWM) to weaken the attention of our model to the nucleus background by integrating low-level features into high-level features. To improve the robustness of the model to the hard-boundary of nuclei, we further design a Gradient-based boundary adaptive Strategy (GS) which generates boundary-weakened data for model training in an adversarial manner. We conduct extensive experiments on MoNuSeg and CPM-17 datasets, and experimental results show that our HBANet outperforms the state-of-the-art methods.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114408864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
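The abstract only states that BWM injects low-level features into high-level ones to suppress background responses. The module below is a toy stand-in for that idea; the layer shapes and the sigmoid gating are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BackgroundWeaken(nn.Module):
    """Illustrative sketch of a background-weakening fusion: low-level detail
    is projected and resized to the high-level map, and their sum drives a
    gate that down-weights background activations."""
    def __init__(self, c_low, c_high):
        super().__init__()
        self.proj = nn.Conv2d(c_low, c_high, kernel_size=1)
        self.gate = nn.Conv2d(c_high, 1, kernel_size=1)

    def forward(self, low, high):
        low = self.proj(low)
        low = F.interpolate(low, size=high.shape[-2:], mode="bilinear",
                            align_corners=False)
        attn = torch.sigmoid(self.gate(high + low))  # foreground-ish gate
        return high * attn  # background responses are suppressed

bwm = BackgroundWeaken(c_low=64, c_high=256)
out = bwm(torch.randn(1, 64, 64, 64), torch.randn(1, 256, 16, 16))
print(out.shape)  # torch.Size([1, 256, 16, 16])
```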
An Embarrassingly Simple Approach to Discrete Supervised Hashing
ACM Multimedia Asia Pub Date: 2021-12-01 DOI: 10.1145/3469877.3493595
Shuguang Zhao, Bingzhi Chen, Zheng Zhang, Guangming Lu
{"title":"An Embarrassingly Simple Approach to Discrete Supervised Hashing","authors":"Shuguang Zhao, Bingzhi Chen, Zheng Zhang, Guangming Lu","doi":"10.1145/3469877.3493595","DOIUrl":"https://doi.org/10.1145/3469877.3493595","url":null,"abstract":"Prior hashing works typically learn a projection function from high-dimensional visual feature space to low-dimensional latent space. However, such a projection function remains several crucial bottlenecks: 1) information loss and coding redundancy are inevitable; 2) the available information of semantic labels is not well-explored; 3) the learned latent embedding lacks explicit semantic meaning. To overcome these limitations, we propose a novel supervised Discrete Auto-Encoder Hashing (DAEH) framework, in which a linear auto-encoder can effectively project the semantic labels of images into a latent representation space. Instead of using the visual feature projection, the proposed DAEH framework skillfully explores the semantic information of supervised labels to refine the latent feature embedding and further optimizes hashing function. Meanwhile, we reformulate the objective and relax the discrete constraints for the binary optimization problem. Extensive experiments on Caltech-256, CIFAR-10, and MNIST datasets demonstrate that our method can outperform the state-of-the-art hashing baselines.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122494134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
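The core of DAEH, hashing the label matrix rather than visual features with a linear auto-encoder, can be sketched in a few lines of NumPy. The alternating least-squares loop below is an illustrative simplification of the relaxed discrete optimization, not the paper's exact solver.

```python
import numpy as np

rng = np.random.default_rng(0)
n, c, k = 100, 10, 16                            # samples, classes, hash bits
Y = rng.integers(0, 2, (n, c)).astype(float)     # multi-label matrix

W = rng.standard_normal((c, k))                  # encoder: labels -> latent
for _ in range(20):
    B = np.sign(Y @ W)                           # discrete codes in {-1, +1}
    B[B == 0] = 1
    D, *_ = np.linalg.lstsq(B, Y, rcond=None)    # decoder: codes -> labels
    W, *_ = np.linalg.lstsq(Y, B, rcond=None)    # refit encoder on codes

print(B.shape)                                   # (100, 16)
print(np.linalg.norm(B @ D - Y))                 # label reconstruction error
```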
Language Based Image Quality Assessment
ACM Multimedia Asia Pub Date: 2021-12-01 DOI: 10.1145/3469877.3490605
L. Galteri, Lorenzo Seidenari, P. Bongini, M. Bertini, A. Bimbo
{"title":"Language Based Image Quality Assessment","authors":"L. Galteri, Lorenzo Seidenari, P. Bongini, M. Bertini, A. Bimbo","doi":"10.1145/3469877.3490605","DOIUrl":"https://doi.org/10.1145/3469877.3490605","url":null,"abstract":"Evaluation of generative models, in the visual domain, is often performed providing anecdotal results to the reader. In the case of image enhancement, reference images are usually available. Nonetheless, using signal based metrics often leads to counterintuitive results: highly natural crisp images may obtain worse scores than blurry ones. On the other hand, blind reference image assessment may rank images reconstructed with GANs higher than the original undistorted images. To avoid time consuming human based image assessment, semantic computer vision tasks may be exploited instead [9, 25, 33]. In this paper we advocate the use of language generation tasks to evaluate the quality of restored images. We show experimentally that image captioning, used as a downstream task, may serve as a method to score image quality. Captioning scores are better aligned with human rankings with respect to signal based metrics or no-reference image quality metrics. We show insights on how the corruption, by artifacts, of local image structure may steer image captions in the wrong direction.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116426881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
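The scoring step the paper advocates reduces to comparing the caption of the restored image against the caption of the reference image with a text-similarity metric. A minimal sketch using NLTK's BLEU; the caption strings are mock outputs standing in for any off-the-shelf captioning model.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Mock captions: in the real pipeline these come from running one captioner
# on the reference image, the restored image, and a degraded image.
ref_caption = "a brown dog runs across a grassy field".split()
restored_caption = "a brown dog runs on a green field".split()
blurry_caption = "a dark blurry shape on the ground".split()

smooth = SmoothingFunction().method1
print(sentence_bleu([ref_caption], restored_caption, smoothing_function=smooth))
print(sentence_bleu([ref_caption], blurry_caption, smoothing_function=smooth))
# The restored image's caption scores higher, tracking perceived quality.
```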
Chinese White Dolphin Detection in the Wild
ACM Multimedia Asia Pub Date: 2021-12-01 DOI: 10.1145/3469877.3490574
Hao Zhang, Qi Zhang, P. Nguyen, Victor C. S. Lee, Antoni B. Chan
{"title":"Chinese White Dolphin Detection in the Wild","authors":"Hao Zhang, Qi Zhang, P. Nguyen, Victor C. S. Lee, Antoni B. Chan","doi":"10.1145/3469877.3490574","DOIUrl":"https://doi.org/10.1145/3469877.3490574","url":null,"abstract":"For ecological protection of the ocean, biologists usually conduct line-transect vessel surveys to measure sea species’ population density within their habitat (such as dolphins). However, sea species observation via vessel surveys consumes a lot of manpower resources and is more challenging compared to observing common objects, due to the scarcity of the object in the wild, tiny-size of the objects, and similar-sized distracter objects (e.g., floating trash). To reduce the human experts’ workload and improve the observation accuracy, in this paper, we develop a practical system to detect Chinese White Dolphins in the wild automatically. First, we construct a dataset named Dolphin-14k with more than 2.6k dolphin instances. To improve the dataset annotation efficiency caused by the rarity of dolphins, we design an interactive dolphin box annotation strategy to annotate sparse dolphin instances in long videos efficiently. Second, we compare the performance and efficiency of three off-the-shelf object detection algorithms, including Faster-RCNN, FCOS, and YoloV5, on the Dolphin-14k dataset and pick YoloV5 as the detector, where a new category (Distracter) is added to the model training to reject the false positives. Finally, we incorporate the dolphin detector into a system prototype, which detects dolphins in video frames at 100.99 FPS per GPU with high accuracy (i.e., 90.95 mAP@0.5).","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126273047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
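The detection stage can be approximated with the stock YOLOv5 hub API plus a class filter that drops Distracter detections. The class indices are assumptions, and since the paper's Dolphin-14k weights are not public, this loads the generic pretrained model purely as a placeholder.

```python
import torch

# Placeholder model: the real system would load weights fine-tuned on
# Dolphin-14k with a Dolphin class and a Distracter class.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

DOLPHIN, DISTRACTER = 0, 1        # assumed class ids in the custom model

def detect_dolphins(frame):
    """Run detection and keep only dolphin boxes, rejecting distracters
    (e.g., floating trash) by class id."""
    results = model(frame)
    det = results.xyxy[0]         # (n, 6): x1, y1, x2, y2, conf, cls
    return det[det[:, 5] == DOLPHIN]
```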
Deep Reinforcement Learning and Docking Simulations for autonomous molecule generation in de novo Drug Design
ACM Multimedia Asia Pub Date: 2021-12-01 DOI: 10.1145/3469877.3497694
Hao Liu, Qian Wang, Xiaotong Hu
{"title":"Deep Reinforcement Learning and Docking Simulations for autonomous molecule generation in de novo Drug Design","authors":"Hao Liu, Qian Wang, Xiaotong Hu","doi":"10.1145/3469877.3497694","DOIUrl":"https://doi.org/10.1145/3469877.3497694","url":null,"abstract":"In medicinal chemistry programs, it is key to design and make compounds that are efficacious and safe. In this study, we developed a new deep Reinforcement learning-based compounds molecular generation method. Because chemical space is impractically large, and many existing generation models generate molecules that lack effectiveness, novelty and unsatisfactory molecular properties. Our proposed method-DeepRLDS, which integrates transformer network, balanced binary tree search and docking simulation based on super large-scale supercomputing, can solve these problems well. Experiments show that more than 96 of the generated molecules are chemically valid, 99 of the generated molecules are chemically novelty, the generated molecules have satisfactory molecular properties and possess a broader chemical space distribution.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127348166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
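A hedged sketch of the kind of reward such an RL loop might use: RDKit checks chemical validity, and a stub stands in for the docking simulation that the paper runs at supercomputing scale. The function name, scores, and scaling are all illustrative, not the paper's design.

```python
from rdkit import Chem

def run_docking_simulation(smiles: str) -> float:
    """Stub for the docking stage; a real run would return a binding
    energy in kcal/mol (more negative = stronger binding)."""
    return -7.5

def reward(smiles: str) -> float:
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return -1.0                  # chemically invalid: penalize
    # Stronger (more negative) binding energy maps to a higher reward.
    return max(0.0, -run_docking_simulation(smiles))

print(reward("CCO"))    # ethanol: valid, scored by the (stub) docking step
print(reward("C(("))    # malformed SMILES: -1.0
```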
Intra- and Inter-frame Iterative Temporal Convolutional Networks for Video Stabilization
ACM Multimedia Asia Pub Date: 2021-12-01 DOI: 10.1145/3469877.3490608
Haopeng Xie, Liang Xiao, Huicong Wu
{"title":"Intra- and Inter-frame Iterative Temporal Convolutional Networks for Video Stabilization","authors":"Haopeng Xie, Liang Xiao, Huicong Wu","doi":"10.1145/3469877.3490608","DOIUrl":"https://doi.org/10.1145/3469877.3490608","url":null,"abstract":"Video jitter is an uncomfortable product of irregular lens motion in time sequence. How to extract motion state information in a period of continuous video frames is a major issue for video stabilization. In this paper, we propose a novel sequence model, Intra- and Inter-frame Iterative Temporal Convolutional Networks (I3TC-Net), which alternatively transfer the spatial-temporal correlation of motion within and between frames. We hypothesize that the motion state information can be represented by transmission states. Specifically, we employ combination of Convolutional Long Short-Term Memory (ConvLSTM) and embedded encoder-decoder to generate the latent stable frame, which are used to update transmission states iteratively and learn a global homography transformation effectively for each unstable frame to generate the corresponding stabilized result along the time axis. Furthermore, we create a video dataset to solve the lack of stable data and improve the training effect. Experimental results show that our method outperforms state-of-the-art results on publicly available videos, such as 5.4 points improvements in stability score. The project page is available at https://github.com/root2022IIITC/IIITC.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"42 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130679449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
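Once the network predicts one global homography per unstable frame, applying it is plain OpenCV. The sketch below stubs out the model (identity warp) just to show how the transmission states thread through the frame loop; `predict_homography` is a placeholder, not I3TC-Net itself.

```python
import cv2
import numpy as np

def predict_homography(frame, states):
    """Stub for I3TC-Net: returns a 3x3 warp and updated transmission
    states. The real model uses ConvLSTM + an embedded encoder-decoder."""
    return np.eye(3), states

def stabilize(frames):
    states = None                     # carried across frames, updated iteratively
    out = []
    for f in frames:
        H, states = predict_homography(f, states)
        h, w = f.shape[:2]
        out.append(cv2.warpPerspective(f, H, (w, h)))  # apply global warp
    return out

frames = [np.zeros((120, 160, 3), np.uint8) for _ in range(5)]
print(len(stabilize(frames)))  # 5
```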
Differentially Private Learning with Grouped Gradient Clipping
ACM Multimedia Asia Pub Date: 2021-12-01 DOI: 10.1145/3469877.3490594
Haolin Liu, Chenyu Li, Bochao Liu, Pengju Wang, Shiming Ge, Weiping Wang
{"title":"Differentially Private Learning with Grouped Gradient Clipping","authors":"Haolin Liu, Chenyu Li, Bochao Liu, Pengju Wang, Shiming Ge, Weiping Wang","doi":"10.1145/3469877.3490594","DOIUrl":"https://doi.org/10.1145/3469877.3490594","url":null,"abstract":"While deep learning has proved success in many critical tasks by training models from large-scale data, some private information within can be recovered from the released models, leading to the leakage of privacy. To address this problem, this paper presents a differentially private deep learning paradigm to train private models. In the approach, we propose and incorporate a simple operation termed grouped gradient clipping to modulate the gradient weights. We also incorporated the smooth sensitivity mechanism into differentially private deep learning paradigm, which bounds the adding Gaussian noise. In this way, the resulting model can simultaneously provide with strong privacy protection and avoid accuracy degradation, providing a good trade-off between privacy and performance. The theoretic advantages of grouped gradient clipping are well analyzed. Extensive evaluations on popular benchmarks and comparisons with 11 state-of-the-arts clearly demonstrate the effectiveness and genearalizability of our approach.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131369552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
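A minimal sketch of grouped gradient clipping: parameters are partitioned into groups (here, one group per parameter tensor), each group's gradient is clipped to its own norm bound, and Gaussian noise is added. This omits per-sample gradient handling and the smooth-sensitivity accounting the paper adds; the group assignment and noise scale are illustrative choices.

```python
import torch

def grouped_clip_and_noise(groups, max_norm=1.0, sigma=0.5):
    """Clip each parameter group's gradient to max_norm, then add
    Gaussian noise calibrated to the clipping bound."""
    for group in groups:
        grads = [p.grad for p in group if p.grad is not None]
        if not grads:
            continue
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (max_norm / (norm + 1e-12)).clamp(max=1.0)
        for g in grads:
            g.mul_(scale)                                   # per-group clipping
            g.add_(torch.randn_like(g) * sigma * max_norm)  # DP noise

model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(),
                            torch.nn.Linear(8, 2))
model(torch.randn(16, 4)).sum().backward()
grouped_clip_and_noise([[p] for p in model.parameters()])  # one group per tensor
```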
Multi-Scale Graph Convolutional Network and Dynamic Iterative Class Loss for Ship Segmentation in Remote Sensing Images
ACM Multimedia Asia Pub Date: 2021-12-01 DOI: 10.1145/3469877.3497699
Yanru Jiang, Chengyu Zheng, Zhao-Hui Wang, Rui Wang, Min Ye, Chenglong Wang, Ning Song, Jie Nie
{"title":"Multi-Scale Graph Convolutional Network and Dynamic Iterative Class Loss for Ship Segmentation in Remote Sensing Images","authors":"Yanru Jiang, Chengyu Zheng, Zhao-Hui Wang, Rui Wang, Min Ye, Chenglong Wang, Ning Song, Jie Nie","doi":"10.1145/3469877.3497699","DOIUrl":"https://doi.org/10.1145/3469877.3497699","url":null,"abstract":"The accuracy of the semantic segmentation results of ships is of great significance to coastline navigation, resource management, and territorial protection. Although the ship semantic segmentation method based on deep learning has made great progress, there is still the problem of not exploring the correlation between the targets. In order to avoid the above problems, this paper designed a multi-scale graph convolutional network and dynamic iterative class loss for ship segmentation in remote sensing images to generate more accurate segmentation results. Based on DeepLabv3+, our network uses deep convolutional networks and atrous convolutions for multi-scale feature extraction. In particular, for multi-scale semantic features, we propose to construct a Multi-Scale Graph Convolution Network (MSGCN) to introduce semantic correlation information for pixel feature learning by GCN, which enhances the segmentation result of ship objects. In addition, we propose a Dynamic Iterative Class Loss (DICL) based on iterative batch-wise class rectification instead of pre-computing the fixed weights over the whole dataset, which solves the problem of imbalance between positive and negative samples. We compared the proposed algorithm with the most advanced deep learning target detection methods and ship detection methods and proved the superiority of our method. On a High-Resolution SAR Images Dataset [1], ship detection and instance segmentation can be implemented well.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"98 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113983351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
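The batch-wise weighting idea behind DICL can be sketched directly: class weights are recomputed from each batch's class frequencies instead of being fixed over the whole dataset, so the rare ship class is up-weighted dynamically. The weighting formula below is an illustrative choice, not the paper's exact rectification rule.

```python
import torch
import torch.nn.functional as F

def dynamic_class_loss(logits, target, num_classes=2, eps=1.0):
    """Cross-entropy with per-batch class weights: rarer classes in the
    current batch receive larger weights."""
    counts = torch.bincount(target.flatten(), minlength=num_classes).float()
    weights = counts.sum() / (counts + eps)   # inverse-frequency weighting
    weights = weights / weights.sum()
    return F.cross_entropy(logits, target, weight=weights)

logits = torch.randn(2, 2, 64, 64)                 # (batch, class, H, W)
target = torch.zeros(2, 64, 64, dtype=torch.long)  # mostly background...
target[:, 20:30, 20:30] = 1                        # ...with small ship regions
print(dynamic_class_loss(logits, target).item())
```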