ArXiv | Pub Date: 2024-03-04 | DOI: 10.59275/j.melba.2024-d434
Aisha L. Shuaibu, Ivor J. A. Simpson
{"title":"HyperPredict: Estimating Hyperparameter Effects for Instance-Specific Regularization in Deformable Image Registration","authors":"Aisha L. Shuaibu, Ivor J. A. Simpson","doi":"10.59275/j.melba.2024-d434","DOIUrl":"https://doi.org/10.59275/j.melba.2024-d434","url":null,"abstract":"Methods for medical image registration infer geometric transformations that align pairs, or groups, of images by maximising an image similarity metric. This problem is ill-posed as several solutions may have equivalent likelihoods; moreover, optimising purely for image similarity can yield implausible deformable transformations. For these reasons regularization terms are essential to obtain meaningful registration results. However, this requires the introduction of at least one hyperparameter, often termed λ, which serves as a trade-off between loss terms. In some approaches and situations, the quality of the estimated transformation greatly depends on hyperparameter choice, and different choices may be required depending on the characteristics of the data. Analyzing the effect of these hyperparameters requires labelled data, which is not commonly available at test-time. In this paper, we propose a novel method for evaluating the influence of hyperparameters and subsequently selecting an optimal value for a given pair of images. Our approach, which we call HyperPredict, implements a Multi-Layer Perceptron that learns the effect of selecting particular hyperparameters for registering an image pair by predicting the resulting segmentation overlap and measures of deformation smoothness. This approach enables us to select optimal hyperparameters at test time without requiring labelled data, removing the need for a one-size-fits-all cross-validation approach. Furthermore, the criteria used to define the optimal hyperparameter are flexible post-training, allowing us to efficiently choose specific properties (e.g. overlap of specific anatomical regions of interest, smoothness/plausibility of the final displacement field). We evaluate our proposed method on the OASIS brain MR standard benchmark dataset using a recent deep learning approach (cLapIRN) and an algorithmic method (Niftyreg). Our results demonstrate good performance in predicting the effects of regularization hyperparameters and highlight the benefits of our image-pair specific approach to hyperparameter selection.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"18 3‐4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140398051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXiv | Pub Date: 2024-03-04 | DOI: 10.1109/ICASSP48485.2024.10447643
Bo Li, Yuyan Chen, Liang Zeng
{"title":"KeNet: Knowledge-enhanced Doc-Label Attention Network for Multi-label text classification","authors":"Bo Li, Yuyan Chen, Liang Zeng","doi":"10.1109/ICASSP48485.2024.10447643","DOIUrl":"https://doi.org/10.1109/ICASSP48485.2024.10447643","url":null,"abstract":"Multi-Label Text Classification (MLTC) is a fundamental task in the field of Natural Language Processing (NLP) that involves the assignment of multiple labels to a given text. MLTC has gained significant importance and has been widely applied in various domains such as topic recognition, recommendation systems, sentiment analysis, and information retrieval. However, traditional machine learning and deep neural networks have not yet addressed certain issues, such as the fact that some documents are brief but have a large number of labels, and how to establish relationships between the labels. Moreover, the significance of knowledge has been substantiated in MLTC. To address these issues, we provide a novel approach known as Knowledge-enhanced Doc-Label Attention Network (KeNet). Specifically, we design an Attention Network that incorporates external knowledge, label embedding, and a comprehensive attention mechanism. In contrast to conventional methods, we use a comprehensive representation of documents, knowledge and labels to predict all labels for each single text. Our approach has been validated by comprehensive research conducted on three multi-label datasets. Experimental results demonstrate that our method outperforms state-of-the-art MLTC methods. Additionally, a case study is undertaken to illustrate the practical implementation of KeNet.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"62 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140398014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXiv | Pub Date: 2024-03-04 | DOI: 10.1145/3613904.3642239
Litao Yan, Alyssa Hwang, Zhiyuan Wu, Andrew Head
{"title":"Ivie: Lightweight Anchored Explanations of Just-Generated Code","authors":"Litao Yan, Alyssa Hwang, Zhiyuan Wu, Andrew Head","doi":"10.1145/3613904.3642239","DOIUrl":"https://doi.org/10.1145/3613904.3642239","url":null,"abstract":"Programming assistants have reshaped the experience of programming into one where programmers spend less time writing and more time critically examining code. In this paper, we explore how programming assistants can be extended to accelerate the inspection of generated code. We introduce an extension to the programming assistant called Ivie, or instantly visible in-situ explanations. When using Ivie, a programmer's generated code is instantly accompanied by explanations positioned just adjacent to the code. Our design was optimized for extremely low-cost invocation and dismissal. Explanations are compact and informative. They describe meaningful expressions, from individual variables to entire blocks of code. We present an implementation of Ivie that forks VS Code, applying a modern LLM for timely segmentation and explanation of generated code. In a lab study, we compared Ivie to a contemporary baseline tool for code understanding. Ivie improved understanding of generated code, and was received by programmers as a highly useful, low distraction, desirable complement to the programming assistant.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"4 5‐6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140397920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXiv | Pub Date: 2024-03-04 | DOI: 10.1145/3613904.3641971
Nina Rajcic, M. T. Llano, Jon McCormack
{"title":"Towards A Diffractive Analysis of Prompt-Based Generative AI","authors":"Nina Rajcic, M. T. Llano, Jon McCormack","doi":"10.1145/3613904.3641971","DOIUrl":"https://doi.org/10.1145/3613904.3641971","url":null,"abstract":"Recent developments in prompt-based generative AI have given rise to discourse surrounding the perceived ethical concerns, economic implications, and consequences for the future of cultural production. As generative imagery becomes pervasive in mainstream society, dominated primarily by emerging industry leaders, we encourage that the role of the CHI community be one of inquiry: to investigate the numerous ways in which generative AI has the potential to, and already is, augmenting human creativity. In this paper, we conducted a diffractive analysis exploring the potential role of prompt-based interfaces in artists' creative practice. Over a two-week period, seven visual artists were given access to a personalised instance of Stable Diffusion, fine-tuned on a dataset of their work. In the following diffractive analysis, we identified two dominant modes adopted by participants, AI for ideation, and AI for production. We furthermore present a number of ethical design considerations for the future development of generative AI interfaces.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"27 1‐2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140397904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXiv | Pub Date: 2024-03-04 | DOI: 10.1609/aaai.v38i3.27968
Shuai Guo, Q. Wang, Yijie Gao, Rong Xie, Li Song
{"title":"Depth-Guided Robust and Fast Point Cloud Fusion NeRF for Sparse Input Views","authors":"Shuai Guo, Q. Wang, Yijie Gao, Rong Xie, Li Song","doi":"10.1609/aaai.v38i3.27968","DOIUrl":"https://doi.org/10.1609/aaai.v38i3.27968","url":null,"abstract":"Novel-view synthesis with sparse input views is important for real-world applications like AR/VR and autonomous driving. Recent methods have integrated depth information into NeRFs for sparse input synthesis, leveraging depth prior for geometric and spatial understanding. However, most existing works tend to overlook inaccuracies within depth maps and have low time efficiency. To address these issues, we propose a depth-guided robust and fast point cloud fusion NeRF for sparse inputs. We perceive radiance fields as an explicit voxel grid of features. A point cloud is constructed for each input view, characterized within the voxel grid using matrices and vectors. We accumulate the point cloud of each input view to construct the fused point cloud of the entire scene. Each voxel determines its density and appearance by referring to the point cloud of the entire scene. Through point cloud fusion and voxel grid fine-tuning, inaccuracies in depth values are refined or substituted by those from other views. Moreover, our method can achieve faster reconstruction and greater compactness through effective vector-matrix decomposition. Experimental results underline the superior performance and time efficiency of our approach compared to state-of-the-art baselines.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"115 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140397958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXiv | Pub Date: 2024-03-04 | DOI: 10.1145/3613904.3642450
Wazeer Zulfikar, Samantha Chan, Pattie Maes
{"title":"Memoro: Using Large Language Models to Realize a Concise Interface for Real-Time Memory Augmentation","authors":"Wazeer Zulfikar, Samantha Chan, Pattie Maes","doi":"10.1145/3613904.3642450","DOIUrl":"https://doi.org/10.1145/3613904.3642450","url":null,"abstract":"People have to remember an ever-expanding volume of information. Wearables that use information capture and retrieval for memory augmentation can help but can be disruptive and cumbersome in real-world tasks, such as in social settings. To address this, we developed Memoro, a wearable audio-based memory assistant with a concise user interface. Memoro uses a large language model (LLM) to infer the user's memory needs in a conversational context, semantically search memories, and present minimal suggestions. The assistant has two interaction modes: Query Mode for voicing queries and Queryless Mode for on-demand predictive assistance, without explicit query. Our study of (N=20) participants engaged in a real-time conversation demonstrated that using Memoro reduced device interaction time and increased recall confidence while preserving conversational quality. We report quantitative results and discuss the preferences and experiences of users. This work contributes towards utilizing LLMs to design wearable memory augmentation systems that are minimally disruptive.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"110 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140398178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXiv | Pub Date: 2024-03-04 | DOI: 10.1145/3613904.3642389
Xinyan Yu, Marius Hoggenmueller, M. Tomitsch
{"title":"From Agent Autonomy to Casual Collaboration: A Design Investigation on Help-Seeking Urban Robots","authors":"Xinyan Yu, Marius Hoggenmueller, M. Tomitsch","doi":"10.1145/3613904.3642389","DOIUrl":"https://doi.org/10.1145/3613904.3642389","url":null,"abstract":"As intelligent agents transition from controlled to uncontrolled environments, they face challenges that sometimes exceed their operational capabilities. In many scenarios, they rely on assistance from bystanders to overcome those challenges. Using robots that get stuck in urban settings as an example, we investigate how agents can prompt bystanders into providing assistance. We conducted four focus group sessions with 17 participants that involved bodystorming, where participants assumed the role of robots and bystander pedestrians in role-playing activities. Generating insights from both assumed robot and bystander perspectives, we were able to identify potential non-verbal help-seeking strategies (i.e., addressing bystanders, cueing intentions, and displaying emotions) and factors shaping the assistive behaviours of bystanders. Drawing on these findings, we offer design considerations for help-seeking urban robots and other agents operating in uncontrolled environments to foster casual collaboration, encompass expressiveness, align with agent social categories, and curate appropriate incentives.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"137 1‐3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140398174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comprehensive evaluation of Mal-API-2019 dataset by machine learning in malware detection","authors":"Zhenglin Li, Haibei Zhu, Houze Liu, Jintong Song, Qishuo Cheng","doi":"10.62051/ijcsit.v2n1.01","DOIUrl":"https://doi.org/10.62051/ijcsit.v2n1.01","url":null,"abstract":"This study conducts a thorough examination of malware detection using machine learning techniques, focusing on the evaluation of various classification models using the Mal-API-2019 dataset. The aim is to advance cybersecurity capabilities by identifying and mitigating threats more effectively. Both ensemble and non-ensemble machine learning methods, such as Random Forest, XGBoost, K Nearest Neighbor (KNN), and Neural Networks, are explored. Special emphasis is placed on the importance of data pre-processing techniques, particularly TF-IDF representation and Principal Component Analysis, in improving model performance. Results indicate that ensemble methods, particularly Random Forest and XGBoost, exhibit superior accuracy, precision, and recall compared to others, highlighting their effectiveness in malware detection. The paper also discusses limitations and potential future directions, emphasizing the need for continuous adaptation to address the evolving nature of malware. This research contributes to ongoing discussions in cybersecurity and provides practical insights for developing more robust malware detection systems in the digital era.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"14 4‐6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140398058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXiv | Pub Date: 2024-03-04 | DOI: 10.1109/icassp48485.2024.10445878
Kuan-Hsun Ho, J. Hung, Berlin Chen
{"title":"What do neural networks listen to? Exploring the crucial bands in Speech Enhancement using Sinc-convolution","authors":"Kuan-Hsun Ho, J. Hung, Berlin Chen","doi":"10.1109/icassp48485.2024.10445878","DOIUrl":"https://doi.org/10.1109/icassp48485.2024.10445878","url":null,"abstract":"This study introduces a reformed Sinc-convolution (Sinc-conv) framework tailored for the encoder component of deep networks for speech enhancement (SE). The reformed Sinc-conv, based on parametrized sinc functions as band-pass filters, offers notable advantages in terms of training efficiency, filter diversity, and interpretability. The reformed Sinc-conv is evaluated in conjunction with various SE models, showcasing its ability to boost SE performance. Furthermore, the reformed Sinc-conv provides valuable insights into the specific frequency components that are prioritized in an SE scenario. This opens up a new direction for SE research and improves our knowledge of the operating dynamics of SE models.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"101 10‐12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140398192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ArXiv | Pub Date: 2024-03-04 | DOI: 10.14722/madweb.2024.23035
Naif Mehanna, Walter Rudametkin, Pierre Laperdrix, Antoine Vastel
{"title":"Free Proxies Unmasked: A Vulnerability and Longitudinal Analysis of Free Proxy Services","authors":"Naif Mehanna, Walter Rudametkin, Pierre Laperdrix, Antoine Vastel","doi":"10.14722/madweb.2024.23035","DOIUrl":"https://doi.org/10.14722/madweb.2024.23035","url":null,"abstract":"Free proxies have been widespread since the early days of the Web, helping users bypass geo-blocked content and conceal their IP addresses. Various proxy providers promise faster Internet or increased privacy while advertising their lists comprised of hundreds of readily available free proxies. However, while paid proxy services advertise the support of encrypted connections and high stability, free proxies often lack such guarantees, making them prone to malicious activities such as eavesdropping or modifying content. Furthermore, there is a market that encourages exploiting devices to install proxies. In this paper, we present a 30-month longitudinal study analyzing the stability, security, and potential manipulation of free web proxies that we collected from 11 providers. Our collection resulted in over 640,600 proxies, which we cumulatively tested daily. We find that only 34.5% of proxies were active at least once during our tests, showcasing the general instability of free proxies. Geographically, a majority of proxies originate from the US and China. Leveraging the Shodan search engine, we identified 4,452 distinct vulnerabilities on the proxies' IP addresses, including 1,755 vulnerabilities that allow unauthorized remote code execution and 2,036 that enable privilege escalation on the host device. Through software analysis on the proxies' IP addresses, we find that 42,206 of them appear to run on MikroTik routers. Worryingly, we also discovered 16,923 proxies that manipulate content, indicating potential malicious intent by proxy owners. Ultimately, our research reveals that the use of free web proxies poses significant risks to users' privacy and security. The instability, vulnerabilities, and potential for malicious actions uncovered in our analysis lead us to strongly caution users against relying on free proxies.","PeriodicalId":513202,"journal":{"name":"ArXiv","volume":"25 1‐2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140397909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}