{"title":"DEMAU: Decompose, Explore, Model and Analyse Uncertainties","authors":"Arthur Hoarau, Vincent Lemaire","doi":"arxiv-2409.08105","DOIUrl":"https://doi.org/arxiv-2409.08105","url":null,"abstract":"Recent research in machine learning has given rise to a flourishing\u0000literature on the quantification and decomposition of model uncertainty. This\u0000information can be very useful during interactions with the learner, such as in\u0000active learning or adaptive learning, and especially in uncertainty sampling.\u0000To allow a simple representation of these total, epistemic (reducible) and\u0000aleatoric (irreducible) uncertainties, we offer DEMAU, an open-source\u0000educational, exploratory and analytical tool allowing to visualize and explore\u0000several types of uncertainty for classification models in machine learning.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"XMOL: Explainable Multi-property Optimization of Molecules","authors":"Aye Phyu Phyu Aung, Jay Chaudhary, Ji Wei Yoon, Senthilnath Jayavelu","doi":"arxiv-2409.07786","DOIUrl":"https://doi.org/arxiv-2409.07786","url":null,"abstract":"Molecular optimization is a key challenge in drug discovery and material\u0000science domain, involving the design of molecules with desired properties.\u0000Existing methods focus predominantly on single-property optimization,\u0000necessitating repetitive runs to target multiple properties, which is\u0000inefficient and computationally expensive. Moreover, these methods often lack\u0000transparency, making it difficult for researchers to understand and control the\u0000optimization process. To address these issues, we propose a novel framework,\u0000Explainable Multi-property Optimization of Molecules (XMOL), to optimize\u0000multiple molecular properties simultaneously while incorporating\u0000explainability. Our approach builds on state-of-the-art geometric diffusion\u0000models, extending them to multi-property optimization through the introduction\u0000of spectral normalization and enhanced molecular constraints for stabilized\u0000training. Additionally, we integrate interpretive and explainable techniques\u0000throughout the optimization process. We evaluated XMOL on the real-world\u0000molecular datasets i.e., QM9, demonstrating its effectiveness in both single\u0000property and multiple properties optimization while offering interpretable\u0000results, paving the way for more efficient and reliable molecular design.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142223713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BLens: Contrastive Captioning of Binary Functions using Ensemble Embedding","authors":"Tristan Benoit, Yunru Wang, Moritz Dannehl, Johannes Kinder","doi":"arxiv-2409.07889","DOIUrl":"https://doi.org/arxiv-2409.07889","url":null,"abstract":"Function names can greatly aid human reverse engineers, which has spurred\u0000development of machine learning-based approaches to predicting function names\u0000in stripped binaries. Much current work in this area now uses transformers,\u0000applying a metaphor of machine translation from code to function names. Still,\u0000function naming models face challenges in generalizing to projects completely\u0000unrelated to the training set. In this paper, we take a completely new approach\u0000by transferring advances in automated image captioning to the domain of binary\u0000reverse engineering, such that different parts of a binary function can be\u0000associated with parts of its name. We propose BLens, which combines multiple\u0000binary function embeddings into a new ensemble representation, aligns it with\u0000the name representation latent space via a contrastive learning approach, and\u0000generates function names with a transformer architecture tailored for function\u0000names. In our experiments, we demonstrate that BLens significantly outperforms\u0000the state of the art. In the usual setting of splitting per binary, we achieve\u0000an $F_1$ score of 0.77 compared to 0.67. Moreover, in the cross-project\u0000setting, which emphasizes generalizability, we achieve an $F_1$ score of 0.46\u0000compared to 0.29.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142227503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Alignment with Preference Optimization Is All You Need for LLM Safety","authors":"Reda Alami, Ali Khalifa Almansoori, Ahmed Alzubaidi, Mohamed El Amine Seddik, Mugariya Farooq, Hakim Hacid","doi":"arxiv-2409.07772","DOIUrl":"https://doi.org/arxiv-2409.07772","url":null,"abstract":"We demonstrate that preference optimization methods can effectively enhance\u0000LLM safety. Applying various alignment techniques to the Falcon 11B model using\u0000safety datasets, we achieve a significant boost in global safety score (from\u0000$57.64%$ to $99.90%$) as measured by LlamaGuard 3 8B, competing with\u0000state-of-the-art models. On toxicity benchmarks, average scores in adversarial\u0000settings dropped from over $0.6$ to less than $0.07$. However, this safety\u0000improvement comes at the cost of reduced general capabilities, particularly in\u0000math, suggesting a trade-off. We identify noise contrastive alignment\u0000(Safe-NCA) as an optimal method for balancing safety and performance. Our study\u0000ultimately shows that alignment techniques can be sufficient for building safe\u0000and robust models.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improve Machine Learning carbon footprint using Nvidia GPU and Mixed Precision training for classification algorithms","authors":"Andrew Antonopoulos","doi":"arxiv-2409.07853","DOIUrl":"https://doi.org/arxiv-2409.07853","url":null,"abstract":"This study was part of my dissertation for my master degree and compares the\u0000power consumption using the default floating point (32bit) and Nvidia mixed\u0000precision (16bit and 32bit) while training a classification ML model. A custom\u0000PC with specific hardware was built to perform the experiments, and different\u0000ML hyper-parameters, such as batch size, neurons, and epochs, were chosen to\u0000build Deep Neural Networks (DNN). Additionally, various software was used\u0000during the experiments to collect the power consumption data in Watts from the\u0000Graphics Processing Unit (GPU), Central Processing Unit (CPU), Random Access\u0000Memory (RAM) and manually from a wattmeter connected to the wall. A\u0000benchmarking test with default hyper parameter values for the DNN was used as a\u0000reference, while the experiments used a combination of different settings. The\u0000results were recorded in Excel, and descriptive statistics were chosen to\u0000calculate the mean between the groups and compare them using graphs and tables.\u0000The outcome was positive when using mixed precision combined with specific\u0000hyper-parameters. Compared to the benchmarking, the optimisation for the\u0000classification reduced the power consumption between 7 and 11 Watts. Similarly,\u0000the carbon footprint is reduced because the calculation uses the same power\u0000consumption data. Still, a consideration is required when configuring\u0000hyper-parameters because it can negatively affect hardware performance.\u0000However, this research required inferential statistics, specifically ANOVA and\u0000T-test, to compare the relationship between the means. Furthermore, tests\u0000indicated no statistical significance of the relationship between the\u0000benchmarking and experiments. However, a more extensive implementation with a\u0000cluster of GPUs can increase the sample size significantly, as it is an\u0000essential factor and can change the outcome of the statistical analysis.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large Language Models are Pattern Matchers: Editing Semi-Structured and Structured Documents with ChatGPT","authors":"Irene Weber","doi":"arxiv-2409.07732","DOIUrl":"https://doi.org/arxiv-2409.07732","url":null,"abstract":"Large Language Models (LLMs) offer numerous applications, the full extent of\u0000which is not yet understood. This paper investigates if LLMs can be applied for\u0000editing structured and semi-structured documents with minimal effort. Using a\u0000qualitative research approach, we conduct two case studies with ChatGPT and\u0000thoroughly analyze the results. Our experiments indicate that LLMs can\u0000effectively edit structured and semi-structured documents when provided with\u0000basic, straightforward prompts. ChatGPT demonstrates a strong ability to\u0000recognize and process the structure of annotated documents. This suggests that\u0000explicitly structuring tasks and data in prompts might enhance an LLM's ability\u0000to understand and solve tasks. Furthermore, the experiments also reveal\u0000impressive pattern matching skills in ChatGPT. This observation deserves\u0000further investigation, as it may contribute to understanding the processes\u0000leading to hallucinations in LLMs.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Heterogeneous Sheaf Neural Networks","authors":"Luke Braithwaite, Iulia Duta, Pietro Liò","doi":"arxiv-2409.08036","DOIUrl":"https://doi.org/arxiv-2409.08036","url":null,"abstract":"Heterogeneous graphs, with nodes and edges of different types, are commonly\u0000used to model relational structures in many real-world applications. Standard\u0000Graph Neural Networks (GNNs) struggle to process heterogeneous data due to\u0000oversmoothing. Instead, current approaches have focused on accounting for the\u0000heterogeneity in the model architecture, leading to increasingly complex\u0000models. Inspired by recent work, we propose using cellular sheaves to model the\u0000heterogeneity in the graph's underlying topology. Instead of modelling the data\u0000as a graph, we represent it as cellular sheaves, which allows us to encode the\u0000different data types directly in the data structure, eliminating the need to\u0000inject them into the architecture. We introduce HetSheaf, a general framework\u0000for heterogeneous sheaf neural networks, and a series of heterogeneous sheaf\u0000predictors to better encode the data's heterogeneity into the sheaf structure.\u0000Finally, we empirically evaluate HetSheaf on several standard heterogeneous\u0000graph benchmarks, achieving competitive results whilst being more\u0000parameter-efficient.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FPMT: Enhanced Semi-Supervised Model for Traffic Incident Detection","authors":"Xinying Lu, Jianli Xiao","doi":"arxiv-2409.07839","DOIUrl":"https://doi.org/arxiv-2409.07839","url":null,"abstract":"For traffic incident detection, the acquisition of data and labels is notably\u0000resource-intensive, rendering semi-supervised traffic incident detection both a\u0000formidable and consequential challenge. Thus, this paper focuses on traffic\u0000incident detection with a semi-supervised learning way. It proposes a\u0000semi-supervised learning model named FPMT within the framework of MixText. The\u0000data augmentation module introduces Generative Adversarial Networks to balance\u0000and expand the dataset. During the mix-up process in the hidden space, it\u0000employs a probabilistic pseudo-mixing mechanism to enhance regularization and\u0000elevate model precision. In terms of training strategy, it initiates with\u0000unsupervised training on all data, followed by supervised fine-tuning on a\u0000subset of labeled data, and ultimately completing the goal of semi-supervised\u0000training. Through empirical validation on four authentic datasets, our FPMT\u0000model exhibits outstanding performance across various metrics. Particularly\u0000noteworthy is its robust performance even in scenarios with low label rates.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A framework for measuring the training efficiency of a neural architecture","authors":"Eduardo Cueto-Mendoza, John D. Kelleher","doi":"arxiv-2409.07925","DOIUrl":"https://doi.org/arxiv-2409.07925","url":null,"abstract":"Measuring Efficiency in neural network system development is an open research\u0000problem. This paper presents an experimental framework to measure the training\u0000efficiency of a neural architecture. To demonstrate our approach, we analyze\u0000the training efficiency of Convolutional Neural Networks and Bayesian\u0000equivalents on the MNIST and CIFAR-10 tasks. Our results show that training\u0000efficiency decays as training progresses and varies across different stopping\u0000criteria for a given neural model and learning task. We also find a non-linear\u0000relationship between training stopping criteria, training Efficiency, model\u0000size, and training Efficiency. Furthermore, we illustrate the potential confounding effects of overtraining\u0000on measuring the training efficiency of a neural architecture. Regarding\u0000relative training efficiency across different architectures, our results\u0000indicate that CNNs are more efficient than BCNNs on both datasets. More\u0000generally, as a learning task becomes more complex, the relative difference in\u0000training efficiency between different architectures becomes more pronounced.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142223712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Click2Mask: Local Editing with Dynamic Mask Generation","authors":"Omer Regev, Omri Avrahami, Dani Lischinski","doi":"arxiv-2409.08272","DOIUrl":"https://doi.org/arxiv-2409.08272","url":null,"abstract":"Recent advancements in generative models have revolutionized image generation\u0000and editing, making these tasks accessible to non-experts. This paper focuses\u0000on local image editing, particularly the task of adding new content to a\u0000loosely specified area. Existing methods often require a precise mask or a\u0000detailed description of the location, which can be cumbersome and prone to\u0000errors. We propose Click2Mask, a novel approach that simplifies the local\u0000editing process by requiring only a single point of reference (in addition to\u0000the content description). A mask is dynamically grown around this point during\u0000a Blended Latent Diffusion (BLD) process, guided by a masked CLIP-based\u0000semantic loss. Click2Mask surpasses the limitations of segmentation-based and\u0000fine-tuning dependent methods, offering a more user-friendly and contextually\u0000accurate solution. Our experiments demonstrate that Click2Mask not only\u0000minimizes user effort but also delivers competitive or superior local image\u0000manipulation results compared to SoTA methods, according to both human\u0000judgement and automatic metrics. Key contributions include the simplification\u0000of user input, the ability to freely add objects unconstrained by existing\u0000segments, and the integration potential of our dynamic mask approach within\u0000other editing methods.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142180642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}