{"title":"Trend Analysis of Large Language Models through a Developer Community: A Focus on Stack Overflow","authors":"Jungha Son, Boyoung Kim","doi":"10.3390/info14110602","DOIUrl":"https://doi.org/10.3390/info14110602","url":null,"abstract":"In the rapidly advancing field of large language model (LLM) research, platforms like Stack Overflow offer invaluable insights into the developer community’s perceptions, challenges, and interactions. This research aims to analyze LLM research and development trends within the professional community. Through the rigorous analysis of Stack Overflow, employing a comprehensive dataset spanning several years, the study identifies the prevailing technologies and frameworks underlining the dominance of models and platforms such as Transformer and Hugging Face. Furthermore, a thematic exploration using Latent Dirichlet Allocation unravels a spectrum of LLM discussion topics. As a result of the analysis, twenty keywords were derived, and a total of five key dimensions, “OpenAI Ecosystem and Challenges”, “LLM Training with Frameworks”, “APIs, File Handling and App Development”, “Programming Constructs and LLM Integration”, and “Data Processing and LLM Functionalities”, were identified through intertopic distance mapping. This research underscores the notable prevalence of specific Tags and technologies within the LLM discourse, particularly highlighting the influential roles of Transformer models and frameworks like Hugging Face. 
This dominance not only reflects the preferences and inclinations of the developer community but also illuminates the primary tools and technologies they leverage in the continually evolving field of LLMs.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135589139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EACH-COA: An Energy-Aware Cluster Head Selection for the Internet of Things Using the Coati Optimization Algorithm","authors":"Ramasubbareddy Somula, Yongyun Cho, Bhabendu Kumar Mohanta","doi":"10.3390/info14110601","DOIUrl":"https://doi.org/10.3390/info14110601","url":null,"abstract":"In recent years, the Internet of Things (IoT) has transformed human life by improving quality of life and revolutionizing all business sectors. The sensor nodes in IoT are interconnected to ensure data transfer to the sink node over the network. Owing to limited battery power, the energy in the nodes is conserved with the help of the clustering technique in IoT. Cluster head (CH) selection is essential for extending network lifetime and throughput in clustering. In recent years, many existing optimization algorithms have been adapted to select the optimal CH to improve energy usage in network nodes. Hence, improper CH selection approaches require more extended convergence and drain sensor batteries quickly. To solve this problem, this paper proposed a coati optimization algorithm (EACH-COA) to improve network longevity and throughput by evaluating the fitness function over the residual energy (RER) and distance constraints. The proposed EACH-COA simulation was conducted in MATLAB 2019a. The potency of the EACH-COA approach was compared with those of the energy-efficient rabbit optimization algorithm (EECHS-ARO), improved sparrow optimization technique (EECHS-ISSADE), and hybrid sea lion algorithm (PDU-SLno). 
The proposed EACH-COA improved the network lifetime by 8–15% and throughput by 5–10%.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135725298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Key Issues in Cybersecurity Data Breaches: Analyzing Data Breach Litigation with ML-Based Text Analytics","authors":"Dominik Molitor, Wullianallur Raghupathi, Aditya Saharia, Viju Raghupathi","doi":"10.3390/info14110600","DOIUrl":"https://doi.org/10.3390/info14110600","url":null,"abstract":"While data breaches are a frequent and universal phenomenon, the characteristics and dimensions of data breaches are unexplored. In this novel exploratory research, we apply machine learning (ML) and text analytics to a comprehensive collection of data breach litigation cases to extract insights from the narratives contained within these cases. Our analysis shows stakeholders (e.g., litigants) are concerned about major topics related to identity theft, hacker, negligence, FCRA (Fair Credit Reporting Act), cybersecurity, insurance, phone device, TCPA (Telephone Consumer Protection Act), credit card, merchant, privacy, and others. The topics fall into four major clusters: “phone scams”, “cybersecurity”, “identity theft”, and “business data breach”. By utilizing ML, text analytics, and descriptive data visualizations, our study serves as a foundational piece for comprehensively analyzing large textual datasets. The findings hold significant implications for both researchers and practitioners in cybersecurity, especially those grappling with the challenges of data breaches.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135725450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining Software-Defined Radio Learning Modules and Neural Networks for Teaching Communication Systems Courses","authors":"Luis A. Camuñas-Mesa, José M. de la Rosa","doi":"10.3390/info14110599","DOIUrl":"https://doi.org/10.3390/info14110599","url":null,"abstract":"The paradigm known as Cognitive Radio (CR) proposes a continuous sensing of the electromagnetic spectrum in order to dynamically modify transmission parameters, making intelligent use of the environment by taking advantage of different techniques such as Neural Networks. This paradigm is becoming especially relevant due to the congestion in the spectrum produced by increasing numbers of IoT (Internet of Things) devices. Nowadays, many different Software-Defined Radio (SDR) platforms provide tools to implement CR systems in a teaching laboratory environment. Within the framework of a ‘Communication Systems’ course, this paper presents a methodology for learning the fundamentals of radio transmitters and receivers in combination with Convolutional Neural Networks (CNNs).","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135774473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning for Time Series Forecasting: Advances and Open Problems","authors":"Angelo Casolaro, Vincenzo Capone, Gennaro Iannuzzo, Francesco Camastra","doi":"10.3390/info14110598","DOIUrl":"https://doi.org/10.3390/info14110598","url":null,"abstract":"A time series is a sequence of time-ordered data, and it is generally used to describe how a phenomenon evolves over time. Time series forecasting, estimating future values of time series, allows the implementation of decision-making strategies. Deep learning, the currently leading field of machine learning, applied to time series forecasting can cope with complex and high-dimensional time series that cannot be usually handled by other machine learning techniques. The aim of the work is to provide a review of state-of-the-art deep learning architectures for time series forecasting, underline recent advances and open problems, and also pay attention to benchmark data sets. Moreover, the work presents a clear distinction between deep learning architectures that are suitable for short-term and long-term forecasting. With respect to existing literature, the major advantage of the work consists in describing the most recent architectures for time series forecasting, such as Graph Neural Networks, Deep Gaussian Processes, Generative Adversarial Networks, Diffusion Models, and Transformers.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135773336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Agent Reinforcement Learning for Online Food Delivery with Location Privacy Preservation","authors":"Suleiman Abahussein, Dayong Ye, Congcong Zhu, Zishuo Cheng, Umer Siddique, Sheng Shen","doi":"10.3390/info14110597","DOIUrl":"https://doi.org/10.3390/info14110597","url":null,"abstract":"Online food delivery services today are considered an essential service that gets significant attention worldwide. Many companies and individuals are involved in this field as it offers good income and numerous jobs to the community. In this research, we consider the problem of online food delivery services and how we can increase the number of received orders by couriers and thereby increase their income. Multi-agent reinforcement learning (MARL) is employed to guide the couriers to areas with high demand for food delivery requests. A map of the city is divided into small grids, and each grid represents a small area of the city that has different demand for online food delivery orders. The MARL agent trains and learns which grid has the highest demand and then selects it. Thus, couriers can get more food delivery orders and thereby increase long-term income. While increasing the number of received orders is important, protecting customer location is also essential. Therefore, the Protect User Location Method (PULM) is proposed in this research in order to protect customer location information. The PULM injects differential privacy (DP) Laplace noise based on two parameters: city area size and customer frequency of online food delivery orders. We use two datasets—Shenzhen, China, and Iowa, USA—to demonstrate the results of our experiments. The results show an increase in the number of received orders in the Shenzhen and Iowa City datasets. 
We also show the similarity and data utility of courier trajectories after we use our obfuscation (PULM) method.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135775127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temporal Convolutional Networks and BERT-Based Multi-Label Emotion Analysis for Financial Forecasting","authors":"Charalampos M. Liapis, Sotiris Kotsiantis","doi":"10.3390/info14110596","DOIUrl":"https://doi.org/10.3390/info14110596","url":null,"abstract":"The use of deep learning in conjunction with models that extract emotion-related information from texts to predict financial time series is based on the assumption that what is said about a stock is correlated with the way that stock fluctuates. Given the above, in this work, a multivariate forecasting methodology incorporating temporal convolutional networks in combination with a BERT-based multi-label emotion classification procedure and correlation feature selection is proposed. The results from an extensive set of experiments, which included predictions of three different time frames and various multivariate ensemble schemes that capture 28 different types of emotion-relative information, are presented. It is shown that the proposed methodology exhibits universal predominance regarding aggregate performance over six different metrics, outperforming all the compared schemes, including a multitude of individual and ensemble methods, both in terms of aggregate average scores and Friedman rankings. Moreover, the results strongly indicate that the use of emotion-related features has beneficial effects on the derived forecasts.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135819393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Walking Accessibility in Urban Transportation: A Comprehensive Analysis of Influencing Factors and Mechanisms","authors":"Yong Liu, Xueqi Ding, Yanjie Ji","doi":"10.3390/info14110595","DOIUrl":"https://doi.org/10.3390/info14110595","url":null,"abstract":"The rise in “urban diseases” like population density, traffic congestion, and environmental pollution has renewed attention to urban livability. Walkability, a critical measure of pedestrian friendliness, has gained prominence in urban and transportation planning. This research delves into a comprehensive analysis of walking accessibility, examining both subjective and objective aspects. This study aims to identify the influencing factors and explore the underlying mechanisms driving walkability within a specific area. Through a questionnaire survey, residents’ subjective perceptions were gathered concerning various factors such as traffic operations, walking facilities, and the living environment. Structural equation modeling was employed to analyze the collected data, revealing that travel experience significantly impacts perceived accessibility, followed by facility condition, traffic condition, and safety perception. In the objective analysis, various types of POI data served as explanatory variables, dividing the study area into grids using ArcGIS, with the Walk Score® as the dependent variable. Comparisons of OLS, GWR and MGWR demonstrated that MGWR yielded the most accurate fitting results. Mixed land use, shopping, hotels, residential, government, financial, and medical public services exhibited positive correlations with local walkability, while corporate enterprises and street greening showed negative correlations. These findings were attributed to the level of development, regional functions, population distribution, and supporting facility deployment, collectively influencing the walking accessibility of the area. 
In conclusion, this research presents crucial insights into enhancing walkability, with implications for urban planning and management, thereby enriching residents’ walking travel experience and promoting sustainable transportation practices. Finally, the limitations of the thesis are discussed.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135973727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Range-Free Localization Approaches Based on Intelligent Swarm Optimization for Internet of Things","authors":"Abdelali Hadir, Naima Kaabouch, Mohammed-Alamine El Houssaini, Jamal El Kafi","doi":"10.3390/info14110592","DOIUrl":"https://doi.org/10.3390/info14110592","url":null,"abstract":"Recently, the precise location of sensor nodes has emerged as a significant challenge in the realm of Internet of Things (IoT) applications, including Wireless Sensor Networks (WSNs). The accurate determination of geographical coordinates for detected events holds pivotal importance in these applications. Despite DV-Hop gaining popularity due to its cost-effectiveness, feasibility, and lack of additional hardware requirements, it remains hindered by a relatively notable localization error. To overcome this limitation, our study introduces three new localization approaches that combine DV-Hop with Chicken Swarm Optimization (CSO). The primary objective is to improve the precision of DV-Hop-based approaches. In this paper, we compare the efficiency of the proposed localization algorithms with other existing approaches, including several algorithms based on Particle Swarm Optimization (PSO), while considering random network topologies. The simulation results validate the efficiency of our proposed algorithms. 
The proposed HW-DV-HopCSO algorithm achieves a considerable improvement in positioning accuracy compared to those of existing models.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135221111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CoDiS: Community Detection via Distributed Seed Set Expansion on Graph Streams","authors":"Austin Anderson, Petros Potikas, Katerina Potika","doi":"10.3390/info14110594","DOIUrl":"https://doi.org/10.3390/info14110594","url":null,"abstract":"Community detection has been (and remains) a very important topic in several fields. From marketing and social networking to biological studies, community detection plays a key role in advancing research in many different fields. Research on this topic originally looked at classifying nodes into discrete communities (non-overlapping communities) but eventually moved forward to placing nodes in multiple communities (overlapping communities). Unfortunately, community detection has always been a time-inefficient process, and datasets are too large to realistically process them using traditional methods. Because of this, recent methods have turned to parallelism and graph stream models, where the edge list is accessed one edge at a time. However, all these methods, while offering a significant decrease in processing time, still have several shortcomings. We propose a new parallel algorithm called community detection with seed sets (CoDiS), which solves the overlapping community detection problem in graph streams. Initially, some nodes (seed sets) have known community structures, and the aim is to expand these communities by processing one edge at a time. The innovation of our approach is that it splits communities among the parallel computation workers so that each worker is only updating a subset of all the communities. By doing so, we decrease the edge processing throughput and decrease the amount of time each worker spends on each edge. Crucially, we remove the need for every worker to have access to every community. 
Experimental results show that we are able to gain a significant improvement in running time with no loss of accuracy.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135371364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}