ArXiv Pub Date : 2024-03-07 DOI: 10.1609/aaai.v38i2.27847

Qingyuan Cai, Xuecai Hu, Saihui Hou, Li Yao, Yongzhen Huang

Title: Disentangled Diffusion-Based 3D Human Pose Estimation with Hierarchical Spatial and Temporal Denoiser

Abstract: Recently, diffusion-based methods for monocular 3D human pose estimation have achieved state-of-the-art (SOTA) performance by directly regressing the 3D joint coordinates from the 2D pose sequence. Although some methods decompose the task into bone length and bone direction prediction based on the human anatomical skeleton to explicitly incorporate more human body prior constraints, the performance of these methods is significantly lower than that of the SOTA diffusion-based methods. This can be attributed to the tree structure of the human skeleton. Direct application of the disentangled method could amplify the accumulation of hierarchical errors, propagating through each hierarchy. Meanwhile, the hierarchical information has not been fully explored by the previous methods. To address these problems, a Disentangled Diffusion-based 3D human Pose Estimation method with Hierarchical Spatial and Temporal Denoiser is proposed, termed DDHPose. In our approach: (1) We disentangle the 3D pose and diffuse the bone length and bone direction during the forward process of the diffusion model to effectively model the human pose prior. A disentanglement loss is proposed to supervise diffusion model learning. (2) For the reverse process, we propose Hierarchical Spatial and Temporal Denoiser (HSTDenoiser) to improve the hierarchical modelling of each joint. Our HSTDenoiser comprises two components: the Hierarchical-Related Spatial Transformer (HRST) and the Hierarchical-Related Temporal Transformer (HRTT). HRST exploits joint spatial information and the influence of the parent joint on each joint for spatial modeling, while HRTT utilizes information from both the joint and its hierarchical adjacent joints to explore the hierarchical temporal correlations among joints. Extensive experiments on the Human3.6M and MPI-INF-3DHP datasets show that our method outperforms the SOTA disentangled-based, non-disentangled-based, and probabilistic approaches by 10.0%, 2.0%, and 1.3%, respectively.
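The bone length and bone direction decomposition mentioned in the abstract can be illustrated with a minimal sketch. This is not the DDHPose implementation: the five-joint skeleton `PARENTS`, the function names `disentangle` and `reconstruct`, and the sample pose are all hypothetical, chosen only to show how a pose splits into per-bone lengths and unit directions, and why an error on one bone moves every joint below it in the tree.

```python
import math

# Hypothetical 5-joint skeleton (an assumption for illustration only).
# PARENTS[i] is the parent of joint i; -1 marks the root.
# Parents are listed before their children.
PARENTS = [-1, 0, 1, 1, 3]


def disentangle(joints, parents):
    """Split a 3D pose into per-bone lengths and unit direction vectors."""
    lengths, directions = {}, {}
    for j, p in enumerate(parents):
        if p < 0:
            continue  # the root has no incoming bone
        bone = [c - pc for c, pc in zip(joints[j], joints[p])]
        length = math.sqrt(sum(b * b for b in bone))
        lengths[j] = length
        # Guard against zero-length bones to avoid division by zero.
        directions[j] = [b / length for b in bone] if length > 0 else [0.0, 0.0, 0.0]
    return lengths, directions


def reconstruct(root, lengths, directions, parents):
    """Rebuild joint positions by walking the tree from the root outward."""
    joints = [None] * len(parents)
    for j, p in enumerate(parents):
        if p < 0:
            joints[j] = list(root)
        else:
            joints[j] = [pc + lengths[j] * d for pc, d in zip(joints[p], directions[j])]
    return joints


if __name__ == "__main__":
    pose = [[0, 0, 0], [0, 1, 0], [1, 1, 0], [0, 2, 0], [0, 2, 1]]
    lengths, directions = disentangle(pose, PARENTS)
    print(reconstruct(pose[0], lengths, directions, PARENTS))

    # Perturb a single bone near the root: all descendant joints shift too.
    noisy = dict(lengths)
    noisy[1] += 0.1
    print(reconstruct(pose[0], noisy, directions, PARENTS))
```

In the perturbed run, changing only the bone into joint 1 displaces joints 2, 3, and 4 as well, which is the kind of hierarchical error accumulation the abstract attributes to the skeleton's tree structure.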

Citations: 0

Towards Robustness Analysis of E-Commerce Ranking System

ArXiv Pub Date : 2024-03-07 DOI: 10.1145/3589335.3648335

Ningfei Wang, Yupin Huang, Han Cheng, Jiri Gesi, Xiaojie Wang, Vivek Mittal

Abstract: Information retrieval (IR) is a pivotal component in various applications. Recent advances in machine learning (ML) have enabled the integration of ML algorithms into IR, particularly in ranking systems. While there is a plethora of research on the robustness of ML-based ranking systems, these studies largely neglect commercial e-commerce systems and fail to establish a connection between real-world and manipulated query relevance. In this paper, we present the first systematic measurement study on the robustness of e-commerce ranking systems. We define robustness as the consistency of ranking outcomes for semantically identical queries. To quantitatively analyze robustness, we propose a novel metric that considers both ranking position and item-specific information that are absent in existing metrics. Our large-scale measurement study with real-world data from e-commerce retailers reveals an open opportunity to measure and improve robustness since semantically identical queries often yield inconsistent ranking results. Based on our observations, we propose several solution directions to enhance robustness, such as the use of Large Language Models. Note that the issue of robustness discussed herein does not constitute an error or oversight. Rather, in scenarios where there exists a vast array of choices, it is feasible to present a multitude of products in various permutations, all of which could be equally appealing. However, this extensive selection may lead to customer confusion. As e-commerce retailers use various techniques to improve the quality of search results, we hope that this research offers valuable guidance for measuring the robustness of the ranking systems.
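The abstract defines robustness as the consistency of ranking outcomes for semantically identical queries but does not spell out the proposed metric here. As a rough stand-in, and explicitly not the paper's metric, the sketch below measures consistency as the mean pairwise Jaccard overlap of the top-k results returned for a group of paraphrased queries. The function names `topk_jaccard` and `mean_pairwise_consistency` and the sample item IDs are hypothetical; unlike the metric the abstract describes, this simplification ignores ranking position and item-specific information.

```python
from itertools import combinations


def topk_jaccard(ranking_a, ranking_b, k):
    """Jaccard overlap between the top-k items of two ranked lists."""
    a, b = set(ranking_a[:k]), set(ranking_b[:k])
    if not a and not b:
        return 1.0  # two empty result lists are treated as identical
    return len(a & b) / len(a | b)


def mean_pairwise_consistency(rankings, k):
    """Average top-k overlap over all pairs of rankings in a query group."""
    pairs = list(combinations(rankings, 2))
    if not pairs:
        return 1.0  # a single ranking is trivially consistent with itself
    return sum(topk_jaccard(a, b, k) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical results for three phrasings of the same shopping query.
    results = [
        ["item1", "item2", "item3"],
        ["item2", "item1", "item4"],
        ["item7", "item8", "item9"],
    ]
    print(mean_pairwise_consistency(results, k=3))
```

A score near 1 means paraphrased queries surface the same products; a score near 0 means the result sets barely overlap.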

Citations: 0

iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries

ArXiv Pub Date : 2024-03-07 DOI: 10.1145/3640543.3645142

Adam Joseph Coscia, Langdon Holmes, Wesley Morris, Joon Suh Choi, Scott Crossley, A. Endert

Abstract: The recent explosion in popularity of large language models (LLMs) has inspired learning engineers to incorporate them into adaptive educational tools that automatically score summary writing. Understanding and evaluating LLMs is vital before deploying them in critical learning environments, yet their unprecedented size and expanding number of parameters inhibit transparency and impede trust when they underperform. Through a collaborative user-centered design process with several learning engineers building and deploying summary scoring LLMs, we characterized fundamental design challenges and goals around interpreting their models, including aggregating large text inputs, tracking score provenance, and scaling LLM interpretability methods. To address their concerns, we developed iScore, an interactive visual analytics tool for learning engineers to upload, score, and compare multiple summaries simultaneously. Tightly integrated views allow users to iteratively revise the language in summaries, track changes in the resulting LLM scores, and visualize model weights at multiple levels of abstraction. To validate our approach, we deployed iScore with three learning engineers over the course of a month. We present a case study where interacting with iScore led a learning engineer to improve their LLM's score accuracy by three percentage points. Finally, we conducted qualitative interviews with the learning engineers that revealed how iScore enabled them to understand, evaluate, and build trust in their LLMs during deployment.

Citations: 0

DeepSee: Multidimensional Visualizations of Seabed Ecosystems

ArXiv Pub Date : 2024-03-07 DOI: 10.1145/3613904.3642001

Adam Joseph Coscia, H. Sapers, Noah Deutsch, Malika Khurana, J. Magyar, Sergio A. Parra, Daniel R. Utter, R.L. Wipfler, D. Caress, Eric J. Martin, J. Paduan, M. Hendrie, S. Lombeyda, H. Mushkin, A. Endert, Scott Davidoff, V. Orphan

Abstract: Scientists studying deep ocean microbial ecosystems use limited numbers of sediment samples collected from the seafloor to characterize important life-sustaining biogeochemical cycles in the environment. Yet conducting fieldwork to sample these extreme remote environments is both expensive and time-consuming, requiring tools that enable scientists to explore the sampling history of field sites and predict where taking new samples is likely to maximize scientific return. We conducted a collaborative, user-centered design study with a team of scientific researchers to develop DeepSee, an interactive data workspace that visualizes 2D and 3D interpolations of biogeochemical and microbial processes in context together with sediment sampling history overlaid on 2D seafloor maps. Based on a field deployment and qualitative interviews, we found that DeepSee increased the scientific return from limited sample sizes, catalyzed new research workflows, reduced long-term costs of sharing data, and supported teamwork and communication between team members with diverse research goals.

Citations: 0

Adaptive Discovering and Merging for Incremental Novel Class Discovery

ArXiv Pub Date : 2024-03-06 DOI: 10.1609/aaai.v38i10.29006

Guangyao Chen, Peixi Peng, Yangru Huang, Mengyue Geng, Yonghong Tian

Abstract: One important desideratum of lifelong learning is to discover novel classes from unlabelled data in a continuous manner. The central challenge is twofold: discovering and learning novel classes while mitigating the issue of catastrophic forgetting of established knowledge. To this end, we introduce a new paradigm called Adaptive Discovering and Merging (ADM) to discover novel categories adaptively in the incremental stage and integrate novel knowledge into the model without affecting the original knowledge. To discover novel classes adaptively, we decouple representation learning and novel class discovery, and use Triple Comparison (TC) and Probability Regularization (PR) to constrain the probability discrepancy and diversity for adaptive category assignment. To merge the learned novel knowledge adaptively, we propose a hybrid structure with base and novel branches named Adaptive Model Merging (AMM), which reduces the interference of the novel branch on the old classes to preserve the previous knowledge, and merges the novel branch to the base model without performance loss and parameter growth. Extensive experiments on several datasets show that ADM significantly outperforms existing class-incremental Novel Class Discovery (class-iNCD) approaches. Moreover, our AMM also benefits the class-incremental Learning (class-IL) task by alleviating the catastrophic forgetting problem. The source code is included in the supplementary materials.

Citations: 0

PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement

ArXiv Pub Date : 2024-03-06 DOI: 10.1145/3613904.3642803

Zhijie Wang, Yuheng Huang, Da Song, Lei Ma, Tianyi Zhang

Abstract: The recent advancements in Generative AI have significantly advanced the field of text-to-image generation. The state-of-the-art text-to-image model, Stable Diffusion, is now capable of synthesizing high-quality images with a strong sense of aesthetics. Crafting text prompts that align with the model's interpretation and the user's intent thus becomes crucial. However, prompting remains challenging for novice users due to the complexity of the stable diffusion model and the non-trivial efforts required for iteratively editing and refining the text prompts. To address these challenges, we propose PromptCharm, a mixed-initiative system that facilitates text-to-image creation through multi-modal prompt engineering and refinement. To assist novice users in prompting, PromptCharm first automatically refines and optimizes the user's initial prompt. Furthermore, PromptCharm supports the user in exploring and selecting different image styles within a large database. To assist users in effectively refining their prompts and images, PromptCharm renders model explanations by visualizing the model's attention values. If the user notices any unsatisfactory areas in the generated images, they can further refine the images through model attention adjustment or image inpainting within the rich feedback loop of PromptCharm. To evaluate the effectiveness and usability of PromptCharm, we conducted a controlled user study with 12 participants and an exploratory user study with another 12 participants. These two studies show that participants using PromptCharm were able to create images with higher quality and better aligned with the user's expectations compared with using two variants of PromptCharm that lacked interaction or visualization support.

Citations: 0

Whodunit: Classifying Code as Human Authored or GPT-4 Generated - A case study on CodeChef problems

ArXiv Pub Date : 2024-03-06 DOI: 10.1145/3643991.3644926

Oseremen Joy Idialu, N. Mathews, Rungroj Maipradit, J. Atlee, Mei Nagappan

Abstract: Artificial intelligence (AI) assistants such as GitHub Copilot and ChatGPT, built on large language models like GPT-4, are revolutionizing how programming tasks are performed, raising questions about whether code is authored by generative AI models. Such questions are of particular interest to educators, who worry that these tools enable a new form of academic dishonesty, in which students submit AI-generated code as their own work. Our research explores the viability of using code stylometry and machine learning to distinguish between GPT-4 generated and human-authored code. Our dataset comprises human-authored solutions from CodeChef and AI-authored solutions generated by GPT-4. Our classifier outperforms baselines, with an F1-score and AUC-ROC score of 0.91. A variant of our classifier that excludes gameable features (e.g., empty lines, whitespace) still performs well with an F1-score and AUC-ROC score of 0.89. We also evaluated our classifier with respect to the difficulty of the programming problem and found that there was almost no difference between easier and intermediate problems, and the classifier performed only slightly worse on harder problems. Our study shows that code stylometry is a promising approach for distinguishing between GPT-4 generated code and human-authored code.

Citations: 0

Dcl-Net: Dual Contrastive Learning Network for Semi-Supervised Multi-Organ Segmentation

ArXiv Pub Date : 2024-03-06 DOI: 10.1109/icassp48485.2024.10447495

L. Wen, Zheng-Kai Feng, Yun Hou, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang

Citations: 0

The Visual Debugger: Past, Present, and Future

ArXiv Pub Date : 2024-03-06 DOI: 10.1145/3643796.3648443

Tim Kräuter, Patrick Stünkel, Adrian Rutle, Yngve Lamo

Citations: 0

German also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset

ArXiv Pub Date : 2024-03-06 DOI: 10.3929/ethz-b-000661775

Laura Mascarell, Ribin Chalumattu, Annette Rios

Citations: 0

Latest ArXiv Papers