{"title":"Reinforcement Learning for an Efficient and Effective Malware Investigation during Cyber Incident Response","authors":"Dipo Dunsin, Mohamed Chahine Ghanem, Karim Ouazzane, Vassil Vassilev","doi":"arxiv-2408.01999","DOIUrl":"https://doi.org/arxiv-2408.01999","url":null,"abstract":"This research focused on enhancing post-incident malware forensic investigation using reinforcement learning (RL). We proposed an advanced MDP post-incident malware forensics investigation model and framework to expedite post-incident forensics. We then implemented our RL Malware Investigation Model based on structured MDP within the proposed framework. To identify malware artefacts, the RL agent acquires and examines forensics evidence files, iteratively improving its capabilities using Q Table and temporal difference learning. The Q learning algorithm significantly improved the agent's ability to identify malware. An epsilon greedy exploration strategy and Q learning updates enabled efficient learning and decision making. Our experimental testing revealed that optimal learning rates depend on the MDP environment complexity, with simpler environments benefiting from higher rates for quicker convergence and complex ones requiring lower rates for stability. Our model's performance in identifying and classifying malware reduced malware analysis time compared to human experts, demonstrating robustness and adaptability. The study highlighted the significance of hyper parameter tuning and suggested adaptive strategies for complex environments. Our RL based approach produced promising results and is validated as an alternative to traditional methods notably by offering continuous learning and adaptation to new and evolving malware threats which ultimately enhance the post-incident forensics investigations.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141937230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting Object Heights From LiDAR & Aerial Imagery","authors":"Jesus Guerrero","doi":"arxiv-2408.00967","DOIUrl":"https://doi.org/arxiv-2408.00967","url":null,"abstract":"This work shows a procedural method for extracting object heights from LiDAR and aerial imagery. We discuss how to get heights and the future of LiDAR and imagery processing. SOTA object segmentation allows us to get object heights with no deep learning background. Engineers will be keeping track of world data across generations and reprocessing them. They will be using older procedural methods like the one in this paper and newer ones discussed here. SOTA methods are going beyond analysis and into generative AI. We cover both a procedural methodology and the newer ones performed with language models. These include point cloud, imagery and text encoding allowing for spatially aware AI.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141937233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rubric-based Learner Modelling via Noisy Gates Bayesian Networks for Computational Thinking Skills Assessment","authors":"Giorgia Adorni, Francesca Mangili, Alberto Piatti, Claudio Bonesana, Alessandro Antonucci","doi":"arxiv-2408.01221","DOIUrl":"https://doi.org/arxiv-2408.01221","url":null,"abstract":"In modern and personalised education, there is a growing interest in developing learners' competencies and accurately assessing them. In a previous work, we proposed a procedure for deriving a learner model for automatic skill assessment from a task-specific competence rubric, thus simplifying the implementation of automated assessment tools. The previous approach, however, suffered two main limitations: (i) the ordering between competencies defined by the assessment rubric was only indirectly modelled; (ii) supplementary skills, not under assessment but necessary for accomplishing the task, were not included in the model. In this work, we address issue (i) by introducing dummy observed nodes, strictly enforcing the skills ordering without changing the network's structure. In contrast, for point (ii), we design a network with two layers of gates, one performing disjunctive operations by noisy-OR gates and the other conjunctive operations through logical ANDs. Such changes improve the model outcomes' coherence and the modelling tool's flexibility without compromising the model's compact parametrisation, interpretability and simple experts' elicitation. We used this approach to develop a learner model for Computational Thinking (CT) skills assessment. The CT-cube skills assessment framework and the Cross Array Task (CAT) are used to exemplify it and demonstrate its feasibility.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141937236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing Quantum Circuit Depth Reduction with Ancilla Qubits in MCX Gates","authors":"Ahmad Bennakhi, Paul Franzon, Gregory T. Byrd","doi":"arxiv-2408.01304","DOIUrl":"https://doi.org/arxiv-2408.01304","url":null,"abstract":"This paper aims to give readers a high-level overview of the different MCX depth reduction techniques that utilize ancilla qubits. We also present a brief analysis of how they would perform under different quantum topological settings. The techniques examined are recursion and v-chain, as they are the most commonly used techniques in the most popular quantum computing library, Qiskit. The target audience of this paper is people who do not have intricate mathematical or physics knowledge related to quantum computing.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"29 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141937286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Energy Cost of Artificial Intelligence of Things Lifecycle","authors":"Shih-Kai Chou, Jernej Hribar, Mihael Mohorčič, Carolina Fortuna","doi":"arxiv-2408.00540","DOIUrl":"https://doi.org/arxiv-2408.00540","url":null,"abstract":"Artificial intelligence (AI) coupled with existing Internet of Things (IoT) enables more streamlined and autonomous operations across various economic sectors. Consequently, the paradigm of Artificial Intelligence of Things (AIoT) having AI techniques at its core implies additional energy and carbon costs that may become significant with more complex neural architectures. To better understand the energy and Carbon Footprint (CF) of some AIoT components, very recent studies employ conventional metrics. However, these metrics are not designed to capture energy efficiency aspects of inference. In this paper, we propose a new metric, the Energy Cost of AIoT Lifecycle (eCAL) to capture the overall energy cost of inference over the lifecycle of an AIoT system. We devise a new methodology for determining eCAL of an AIoT system by analyzing the complexity of data manipulation in individual components involved in the AIoT lifecycle and derive the overall and per bit energy consumption. With eCAL we show that the better a model is and the more it is used, the more energy efficient an inference is. For an example AIoT configuration, eCAL for making $100$ inferences is $1.43$ times higher than for $1000$ inferences. We also evaluate the CF of the AIoT system by calculating the equivalent CO$_{2}$ emissions based on the energy consumption and the Carbon Intensity (CI) across different countries. Using 2023 renewable data, our analysis reveals that deploying an AIoT system in Germany results in emitting $4.62$ times higher CO$_2$ than in Finland, due to the latter using more low-CI energy sources.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141881507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Generation of Behavioral Test Cases For Natural Language Processing Using Clustering and Prompting","authors":"Ying Li, Rahul Singh, Tarun Joshi, Agus Sudjianto","doi":"arxiv-2408.00161","DOIUrl":"https://doi.org/arxiv-2408.00161","url":null,"abstract":"Recent work in behavioral testing for natural language processing (NLP) models, such as Checklist, is inspired by related paradigms in software engineering testing. They allow evaluation of general linguistic capabilities and domain understanding, hence can help evaluate conceptual soundness and identify model weaknesses. However, a major challenge is the creation of test cases. The current packages rely on a semi-automated approach using manual development which requires domain expertise and can be time consuming. This paper introduces an automated approach to develop test cases by exploiting the power of large language models and statistical techniques. It clusters the text representations to carefully construct meaningful groups and then applies prompting techniques to automatically generate Minimal Functionality Tests (MFT). The well-known Amazon Reviews corpus is used to demonstrate our approach. We analyze the behavioral test profiles across four different classification algorithms and discuss the limitations and strengths of those models.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141881505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Explainable Vision Transformer with Transfer Learning Combined with Support Vector Machine Based Efficient Drought Stress Identification","authors":"Aswini Kumar Patra, Ankit Varshney, Lingaraj Sahoo","doi":"arxiv-2407.21666","DOIUrl":"https://doi.org/arxiv-2407.21666","url":null,"abstract":"Early detection of drought stress is critical for taking timely measures for reducing crop loss before the drought impact becomes irreversible. The subtle phenotypical and physiological changes in response to drought stress are captured by non-invasive imaging techniques and these imaging data serve as a valuable resource for machine learning methods to identify drought stress. While convolutional neural networks (CNNs) are in wide use, vision transformers (ViTs) present a promising alternative in capturing long-range dependencies and intricate spatial relationships, thereby enhancing the detection of subtle indicators of drought stress. We propose an explainable deep learning pipeline that leverages the power of ViTs for drought stress detection in potato crops using aerial imagery. We applied two distinct approaches: a synergistic combination of ViT and support vector machine (SVM), where ViT extracts intricate spatial features from aerial images, and SVM classifies the crops as stressed or healthy, and an end-to-end approach using a dedicated classification layer within ViT to directly detect drought stress. Our key findings explain the ViT model's decision-making process by visualizing attention maps. These maps highlight the specific spatial features within the aerial images that the ViT model focuses on as the drought stress signature. Our findings demonstrate that the proposed methods not only achieve high accuracy in drought stress identification but also shed light on the diverse subtle plant features associated with drought stress. This offers a robust and interpretable solution for drought stress monitoring for farmers to undertake informed decisions for improved crop management.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"46 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141866682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-Sovereign Identity for Consented and Content-Based Access to Medical Records using Blockchain","authors":"Marie Tcholakian, Karolina Gorna, Maryline Laurent, Hella Kaffel Ben Ayed, Montassar Naghmouchi","doi":"arxiv-2407.21559","DOIUrl":"https://doi.org/arxiv-2407.21559","url":null,"abstract":"Electronic Health Records (EHRs) and Medical Data are classified as personal data in every privacy law, meaning that any related service that includes processing such data must come with full security, confidentiality, privacy and accountability. Solutions for health data management, as in storing it, sharing and processing it, are emerging quickly and were significantly boosted by the Covid-19 pandemic that created a need to move things online. EHRs make up a crucial part of digital identity data, and the same digital identity trends -- as in self sovereign identity powered by decentralized ledger technologies like Blockchain, are being researched or implemented in contexts managing digital interactions between health facilities, patients and health professionals. In this paper, we propose a blockchain-based solution enabling secure exchange of EHRs between different parties powered by a self-sovereign identity (SSI) wallet and decentralized identifiers. We also make use of a consortium IPFS network for off-chain storage and attribute-based encryption (ABE) to ensure data confidentiality and integrity. Through our solution, we grant users full control over their medical data, and enable them to securely share it in total confidentiality over secure communication channels between user wallets using encryption. We also use DIDs for better user privacy and limit any possible correlations or identification by using pairwise DIDs. Overall, combining this set of technologies guarantees secure exchange of EHRs, secure storage and management along with by-design features inherited from the technological stack.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141866683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CultureVo: The Serious Game of Utilizing Gen AI for Enhancing Cultural Intelligence","authors":"Ajita Agarwala, Anupam Purwar, Viswanadhasai Rao","doi":"arxiv-2407.20685","DOIUrl":"https://doi.org/arxiv-2407.20685","url":null,"abstract":"CultureVo, Inc. has developed the Integrated Culture Learning Suite (ICLS) to deliver foundational knowledge of world cultures through a combination of interactive lessons and gamified experiences. This paper explores how Generative AI powered by open source Large Language Models is utilized within the ICLS to enhance cultural intelligence. The suite employs Generative AI techniques to automate the assessment of learner knowledge, analyze behavioral patterns, and manage interactions with non-player characters using real time learner assessment. Additionally, ICLS provides contextual hints and recommends course content by assessing learner proficiency, while Generative AI facilitates the automated creation and validation of educational content.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141866684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introducing a new hyper-parameter for RAG: Context Window Utilization","authors":"Kush Juvekar, Anupam Purwar","doi":"arxiv-2407.19794","DOIUrl":"https://doi.org/arxiv-2407.19794","url":null,"abstract":"This paper introduces a new hyper-parameter for Retrieval-Augmented Generation (RAG) systems called Context Window Utilization. RAG systems enhance generative models by incorporating relevant information retrieved from external knowledge bases, improving the factual accuracy and contextual relevance of generated responses. The size of the text chunks retrieved and processed is a critical factor influencing RAG performance. This study aims to identify the optimal chunk size that maximizes answer generation quality. Through systematic experimentation, we analyze the effects of varying chunk sizes on the efficiency and effectiveness of RAG frameworks. Our findings reveal that an optimal chunk size balances the trade-off between providing sufficient context and minimizing irrelevant information. These insights are crucial for enhancing the design and implementation of RAG systems, underscoring the importance of selecting an appropriate chunk size to achieve superior performance.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"205 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141866687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}