Yike Zhang, Eduardo Davalos, Dingjie Su, Ange Lou, Jack H. Noble
{"title":"Monocular Microscope to CT Registration using Pose Estimation of the Incus for Augmented Reality Cochlear Implant Surgery","authors":"Yike Zhang, Eduardo Davalos, Dingjie Su, Ange Lou, Jack H. Noble","doi":"10.1117/12.3008830","DOIUrl":"https://doi.org/10.1117/12.3008830","url":null,"abstract":"For those experiencing severe-to-profound sensorineural hearing loss, the cochlear implant (CI) is the preferred treatment. Augmented reality (AR) aided surgery can potentially improve CI procedures and hearing outcomes. Typically, AR solutions for image-guided surgery rely on optical tracking systems to register pre-operative planning information to the display so that hidden anatomy or other important information can be overlayed and co-registered with the view of the surgical scene. In this paper, our goal is to develop a method that permits direct 2D-to-3D registration of the microscope video to the pre-operative Computed Tomography (CT) scan without the need for external tracking equipment. Our proposed solution involves using surface mapping of a portion of the incus in surgical recordings and determining the pose of this structure relative to the surgical microscope by performing pose estimation via the perspective-n-point (PnP) algorithm. This registration can then be applied to pre-operative segmentations of other anatomy-of-interest, as well as the planned electrode insertion trajectory to co-register this information for the AR display. Our results demonstrate the accuracy with an average rotation error of less than 25 degrees and a translation error of less than 2 mm, 3 mm, and 0.55% for the x, y, and z axes, respectively. Our proposed method has the potential to be applicable and generalized to other surgical procedures while only needing a monocular microscope during intra-operation.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"53 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140394480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-12DOI: 10.1145/3589334.3645631
Rui Zhao, Jun Zhao
{"title":"Perennial Semantic Data Terms of Use for Decentralized Web","authors":"Rui Zhao, Jun Zhao","doi":"10.1145/3589334.3645631","DOIUrl":"https://doi.org/10.1145/3589334.3645631","url":null,"abstract":"In today's digital landscape, the Web has become increasingly centralized, raising concerns about user privacy violations. Decentralized Web architectures, such as Solid, offer a promising solution by empowering users with better control over their data in their personal `Pods'. However, a significant challenge remains: users must navigate numerous applications to decide which application can be trusted with access to their data Pods. This often involves reading lengthy and complex Terms of Use agreements, a process that users often find daunting or simply ignore. This compromises user autonomy and impedes detection of data misuse. We propose a novel formal description of Data Terms of Use (DToU), along with a DToU reasoner. Users and applications specify their own parts of the DToU policy with local knowledge, covering permissions, requirements, prohibitions and obligations. Automated reasoning verifies compliance, and also derives policies for output data. This constitutes a ``perennial'' DToU language, where the policy authoring only occurs once, and we can conduct ongoing automated checks across users, applications and activity cycles. Our solution is built on Turtle, Notation 3 and RDF Surfaces, for the language and the reasoning engine. It ensures seamless integration with other semantic tools for enhanced interoperability. We have successfully integrated this language into the Solid framework, and conducted performance benchmark. We believe this work demonstrates a practicality of a perennial DToU language and the potential of a paradigm shift to how users interact with data and applications in a decentralized Web, offering both improved privacy and usability.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"18 S25","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140395238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-12DOI: 10.1145/3613904.3642158
Xun Qian, Tianyi Wang, Xuhai Xu, Tanya R. Jonker, Kashyap Todi
{"title":"Fast-Forward Reality: Authoring Error-Free Context-Aware Policies with Real-Time Unit Tests in Extended Reality","authors":"Xun Qian, Tianyi Wang, Xuhai Xu, Tanya R. Jonker, Kashyap Todi","doi":"10.1145/3613904.3642158","DOIUrl":"https://doi.org/10.1145/3613904.3642158","url":null,"abstract":"Advances in ubiquitous computing have enabled end-user authoring of context-aware policies (CAPs) that control smart devices based on specific contexts of the user and environment. However, authoring CAPs accurately and avoiding run-time errors is challenging for end-users as it is difficult to foresee CAP behaviors under complex real-world conditions. We propose Fast-Forward Reality, an Extended Reality (XR) based authoring workflow that enables end-users to iteratively author and refine CAPs by validating their behaviors via simulated unit test cases. We develop a computational approach to automatically generate test cases based on the authored CAP and the user's context history. Our system delivers each test case with immersive visualizations in XR, facilitating users to verify the CAP behavior and identify necessary refinements. We evaluated Fast-Forward Reality in a user study (N=12). Our authoring and validation process improved the accuracy of CAPs and the users provided positive feedback on the system usability.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"23 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140395647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-12DOI: 10.1145/3589335.3651548
Shuxian Bi, Wenjie Wang, Hang Pan, Fuli Feng, Xiangnan He
{"title":"Proactive Recommendation with Iterative Preference Guidance","authors":"Shuxian Bi, Wenjie Wang, Hang Pan, Fuli Feng, Xiangnan He","doi":"10.1145/3589335.3651548","DOIUrl":"https://doi.org/10.1145/3589335.3651548","url":null,"abstract":"Recommender systems mainly tailor personalized recommendations according to user interests learned from user feedback. However, such recommender systems passively cater to user interests and even reinforce existing interests in the feedback loop, leading to problems like filter bubbles and opinion polarization. To counteract this, proactive recommendation actively steers users towards developing new interests in a target item or topic by strategically modulating recommendation sequences. Existing work for proactive recommendation faces significant hurdles: 1) overlooking the user feedback in the guidance process; 2) lacking explicit modeling of the guiding objective; and 3) insufficient flexibility for integration into existing industrial recommender systems. To address these issues, we introduce an Iterative Preference Guidance (IPG) framework. IPG performs proactive recommendation in a flexible post-processing manner by ranking items according to their IPG scores that consider both interaction probability and guiding value. These scores are explicitly estimated with iteratively updated user representation that considers the most recent user interactions. Extensive experiments validate that IPG can effectively guide user interests toward target interests with a reasonable trade-off in recommender accuracy. The code is available at https://github.com/GabyUSTC/IPG-Rec.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"75 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140395777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-12DOI: 10.1109/icassp48485.2024.10448096
Anastasios Arsenos, D. Kollias, Evangelos Petrongonas, Christos Skliros, S. Kollias
{"title":"Uncertainty-guided Contrastive Learning for Single Source Domain Generalisation","authors":"Anastasios Arsenos, D. Kollias, Evangelos Petrongonas, Christos Skliros, S. Kollias","doi":"10.1109/icassp48485.2024.10448096","DOIUrl":"https://doi.org/10.1109/icassp48485.2024.10448096","url":null,"abstract":"In the context of single domain generalisation, the objective is for models that have been exclusively trained on data from a single domain to demonstrate strong performance when confronted with various unfamiliar domains. In this paper, we introduce a novel model referred to as Contrastive Uncertainty Domain Generalisation Network (CUDGNet). The key idea is to augment the source capacity in both input and label spaces through the fictitious domain generator and jointly learn the domain invariant representation of each class through contrastive learning. Extensive experiments on two Single Source Domain Generalisation (SSDG) datasets demonstrate the effectiveness of our approach, which surpasses the state-of-the-art single-DG methods by up to $7.08%$. Our method also provides efficient uncertainty estimation at inference time from a single forward pass through the generator subnetwork.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"113 10","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140394947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-12DOI: 10.1145/3589335.3652001
Lucas Vogel, Thomas Springer, Matthias Wählisch
{"title":"From Files to Streams: Revisiting Web History and Exploring Potentials for Future Prospects","authors":"Lucas Vogel, Thomas Springer, Matthias Wählisch","doi":"10.1145/3589335.3652001","DOIUrl":"https://doi.org/10.1145/3589335.3652001","url":null,"abstract":"Over the last 30 years, the World Wide Web has changed significantly. In this paper, we argue that common practices to prepare web pages for delivery conflict with many efforts to present content with minimal latency, one fundamental goal that pushed changes in the WWW. To bolster our arguments, we revisit reasons that led to changes of HTTP and compare them systematically with techniques to prepare web pages. We found that the structure of many web pages leverages features of HTTP/1.1 but hinders the use of recent HTTP features to present content quickly. To improve the situation in the future, we propose fine-grained content segmentation. This would allow to exploit streaming capabilities of recent HTTP versions and to render content as quickly as possible without changing underlying protocols or web browsers.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"96 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140395499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-12DOI: 10.1609/aaai.v38i18.29966
Di Mi, Yanjun Zhang, Leo Yu Zhang, Shengshan Hu, Qi Zhong, Haizhuan Yuan, Shirui Pan
{"title":"Towards Model Extraction Attacks in GAN-Based Image Translation via Domain Shift Mitigation","authors":"Di Mi, Yanjun Zhang, Leo Yu Zhang, Shengshan Hu, Qi Zhong, Haizhuan Yuan, Shirui Pan","doi":"10.1609/aaai.v38i18.29966","DOIUrl":"https://doi.org/10.1609/aaai.v38i18.29966","url":null,"abstract":"Model extraction attacks (MEAs) enable an attacker to replicate the functionality of a victim deep neural network (DNN) model by only querying its API service remotely, posing a severe threat to the security and integrity of pay-per-query DNN-based services. Although the majority of current research on MEAs has primarily concentrated on neural classifiers, there is a growing prevalence of image-to-image translation (I2IT) tasks in our everyday activities. However, techniques developed for MEA of DNN classifiers cannot be directly transferred to the case of I2IT, rendering the vulnerability of I2IT models to MEA attacks often underestimated. This paper unveils the threat of MEA in I2IT tasks from a new perspective. Diverging from the traditional approach of bridging the distribution gap between attacker queries and victim training samples, we opt to mitigate the effect caused by the different distributions, known as the domain shift. This is achieved by introducing a new regularization term that penalizes high-frequency noise, and seeking a flatter minimum to avoid overfitting to the shifted distribution. Extensive experiments on different image translation tasks, including image super-resolution and style transfer, are performed on different backbone victim models, and the new design consistently outperforms the baseline by a large margin across all metrics. A few real-life I2IT APIs are also verified to be extremely vulnerable to our attack, emphasizing the need for enhanced defenses and potentially revised API publishing policies.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"19 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140395699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-12DOI: 10.1145/3613904.3642179
Ruican Zhong, Donghoon Shin, Rosemary Meza, P. Klasnja, Lucas Colusso, Gary Hsieh
{"title":"AI-Assisted Causal Pathway Diagram for Human-Centered Design","authors":"Ruican Zhong, Donghoon Shin, Rosemary Meza, P. Klasnja, Lucas Colusso, Gary Hsieh","doi":"10.1145/3613904.3642179","DOIUrl":"https://doi.org/10.1145/3613904.3642179","url":null,"abstract":"This paper explores the integration of causal pathway diagrams (CPD) into human-centered design (HCD), investigating how these diagrams can enhance the early stages of the design process. A dedicated CPD plugin for the online collaborative whiteboard platform Miro was developed to streamline diagram creation and offer real-time AI-driven guidance. Through a user study with designers (N=20), we found that CPD's branching and its emphasis on causal connections supported both divergent and convergent processes during design. CPD can also facilitate communication among stakeholders. Additionally, we found our plugin significantly reduces designers' cognitive workload and increases their creativity during brainstorming, highlighting the implications of AI-assisted tools in supporting creative work and evidence-based designs.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"63 S291","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140394815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXivPub Date : 2024-03-12DOI: 10.1145/3613904.3642266
Donghoon Shin, Lucy Lu Wang, Gary Hsieh
{"title":"From Paper to Card: Transforming Design Implications with Generative AI","authors":"Donghoon Shin, Lucy Lu Wang, Gary Hsieh","doi":"10.1145/3613904.3642266","DOIUrl":"https://doi.org/10.1145/3613904.3642266","url":null,"abstract":"Communicating design implications is common within the HCI community when publishing academic papers, yet these papers are rarely read and used by designers. One solution is to use design cards as a form of translational resource that communicates valuable insights from papers in a more digestible and accessible format to assist in design processes. However, creating design cards can be time-consuming, and authors may lack the resources/know-how to produce cards. Through an iterative design process, we built a system that helps create design cards from academic papers using an LLM and text-to-image model. Our evaluation with designers (N=21) and authors of selected papers (N=12) revealed that designers perceived the design implications from our design cards as more inspiring and generative, compared to reading original paper texts, and the authors viewed our system as an effective way of communicating their design implications. We also propose future enhancements for AI-generated design cards.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"113 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140395128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}