{"title":"Impact of Cooperative Innovation on the Technological Innovation Performance of High-Tech Firms: A Dual Moderating Effect Model of Big Data Capabilities and Policy Support.","authors":"Xianglong Li, Qingjin Wang, Renbo Shi, Xueling Wang, Kaiyun Zhang, Xiao Liu","doi":"10.1089/big.2022.0301","DOIUrl":"10.1089/big.2022.0301","url":null,"abstract":"<p><p>The mechanism of cooperative innovation (CI) for high-tech firms aims to improve their technological innovation performance. It is the effective integration of the internal and external innovation resources of these firms, along with the simultaneous reduction in the uncertainty of technological innovation and the maintenance of the comparative advantage of the firms in the competition. This study used 322 high-tech firms as our sample, which were located in 33 national innovation demonstration bases identified by the Chinese government. We implemented a multiple linear regression to test the impact of CI conducted by these high-tech firms at the level of their technological innovation performance. In addition, the study further examined the moderating effect of two boundary conditions-big data capabilities and policy support (PS)-on the main hypotheses. Our study found that high-tech firms carrying out CI can effectively improve their technological innovation performance, with big data capabilities and PS significantly enhancing the degree of this influence. The study reveals the intrinsic mechanism of the impact of CI on the technological innovation performance of high-tech firms, which, to a certain extent, expands the application context of CI and enriches the research perspective on the impact of CI on the innovation performance of firms. At the same time, the findings provide insight for how high-tech firms in the digital era can make reasonable use of data empowerment in the process of CI to achieve improved technological innovation performance.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"63-80"},"PeriodicalIF":2.6,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10243508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Automated Natural Language Processing-Based Supplier Discovery for Financial Services.
Authors: Mauro Papa, Ioannis Chatzigiannakis, Aris Anagnostopoulos
Big Data, pp. 30-48. Published 2024-02-01 (Epub 2023-07-07). DOI: 10.1089/big.2022.0215
Abstract: Public procurement is viewed as a major market force that can be used to promote innovation and drive the growth of small and medium-sized enterprises. In such cases, procurement system design relies on intermediaries that provide vertical linkages between suppliers and providers of innovative services and products. In this work we propose an innovative methodology for decision support in the process of supplier discovery, which precedes the final supplier selection. We focus on data gathered from community-based sources such as Reddit and Wikidata, and avoid any use of historical open procurement datasets, to identify small and medium-sized suppliers of innovative products and services that hold very small market shares. We examine a real-world procurement case study from the financial sector, focusing on the Financial and Market Data offering, and develop an interactive web-based support tool to address specific requirements of the Italian central bank. We demonstrate how a suitable selection of natural language processing models, such as a part-of-speech tagger and a word-embedding model, in combination with a novel named-entity-disambiguation algorithm, can efficiently analyze huge quantities of textual data, increasing the probability of full coverage of the market.
Title: Large-Scale Estimation and Analysis of Web Users' Mood from Web Search Query and Mobile Sensor Data.
Authors: Wataru Sasaki, Satoki Hamanaka, Satoko Miyahara, Kota Tsubouchi, Jin Nakazawa, Tadashi Okoshi
Big Data, pp. 191-209. Published 2024-01-01 (Epub 2023-06-02). DOI: 10.1089/big.2022.0211. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11304759/pdf/
Abstract: The ability to estimate the current mood states of web users has considerable potential for realizing user-centric, opportune services in pervasive computing. However, it is difficult to decide which data type to use for such estimation and to collect the ground truth of such mood states. Therefore, we built a model that estimates mood states from search-query data in an easy-to-collect and non-invasive manner. We then built a second model that estimates mood states from mobile sensor data and used its output to supplement the ground-truth labels of the search-query model. This novel two-step model building boosted the performance of estimating the mood states of web users. Our system was also deployed in the commercial stack, and a large-scale data analysis with more than 11 million users was conducted. We propose a nationwide mood score, which aggregates the mood values of users across the country. It shows the daily and weekly rhythm of people's moods and explains the ups and downs of moods during the COVID-19 pandemic, which were inversely synchronized with the number of new COVID-19 cases. It also detects major news events that simultaneously affect the mood states of many users, even at fine-grained time resolutions on the order of hours. In addition, we identified a certain class of advertisements whose clicking users showed a clear tendency in mood.
Title: Computational Efficient Approximations of the Concordance Probability in a Big Data Setting.
Authors: Robin Van Oirbeek, Jolien Ponnet, Bart Baesens, Tim Verdonck
Big Data, pp. 243-268. Published 2024-01-01 (Epub 2023-06-07). DOI: 10.1089/big.2022.0107
Abstract: Performance measurement is an essential task once a statistical model has been created. The area under the receiver operating characteristic curve (AUC) is the most popular measure for evaluating the quality of a binary classifier. In this case, the AUC equals the concordance probability, a frequently used measure of the discriminatory power of a model. Unlike the AUC, the concordance probability can also be extended to the setting with a continuous response variable. Because of the staggering size of today's data sets, determining this discriminatory measure requires a tremendous number of costly computations and is hence immensely time consuming, certainly in the case of a continuous response variable. We therefore propose two estimation methods that calculate the concordance probability in a fast and accurate way and that can be applied in both the discrete and the continuous setting. Extensive simulation studies show the excellent performance and fast computing times of both estimators. Finally, experiments on two real-life data sets confirm the conclusions of the artificial simulations.
Title: Small Files Problem Resolution via Hierarchical Clustering Algorithm.
Authors: Oded Koren, Aviel Shamalov, Nir Perel
Big Data, pp. 229-242. Published 2024-01-01 (Epub 2023-05-16). DOI: 10.1089/big.2022.0181
Abstract: The Small Files Problem in the Hadoop Distributed File System (HDFS) is an ongoing challenge that has not yet been solved, although various approaches have been developed to tackle the obstacles it creates. Properly managing the size of blocks in a file system is essential, as it saves memory and computing time and may reduce bottlenecks. In this article, a new approach using a hierarchical clustering algorithm is suggested for dealing with small files. The proposed method identifies the files by their structure via a special dendrogram analysis and then recommends which files can be merged. As a simulation, the proposed algorithm was applied to 100 CSV files with different structures, containing 2-4 columns with different data types (integer, decimal, and text). In addition, 20 non-CSV files were created to demonstrate that the algorithm works only on CSV files. All data were analyzed via a machine learning hierarchical clustering method, and a dendrogram was created. Based on the merge process that was performed, seven files from the dendrogram analysis were chosen as appropriate files to be merged, which reduced the memory space used in the HDFS. Furthermore, the results showed that using the suggested algorithm leads to efficient file management.
Title: Predicting Sociodemographic Attributes from Mobile Usage Patterns: Applications and Privacy Implications.
Authors: Rouzbeh Razavi, Guisen Xue, Ikpe Justice Akpan
Big Data, pp. 213-228. Published 2024-01-01 (Epub 2023-08-14). DOI: 10.1089/big.2022.0182
Abstract: When users interact with their mobile devices, they leave behind unique digital footprints that can be viewed as predictive proxies that reveal an array of users' characteristics, including their demographics. Predicting users' demographics based on mobile usage can provide significant benefits for service providers and users, including improving customer targeting, service personalization, and market research efforts. This study uses machine learning algorithms and mobile usage data from 235 demographically diverse users to examine the accuracy of predicting their sociodemographic attributes (age, gender, income, and education) from mobile usage metadata, filling the gap in the current literature by quantifying the predictive power of each attribute and discussing the practical applications and privacy implications. According to the results, gender can be most accurately predicted (balanced accuracy = 0.862) from mobile usage footprints, whereas predicting users' education level is more challenging (balanced accuracy = 0.719). Moreover, the classification models were able to classify users based on whether their age or income was above or below a certain threshold with acceptable accuracy. The study also presents the practical applications of inferring demographic attributes from mobile usage data and discusses the implications of the findings, such as privacy and discrimination risks, from the perspectives of different stakeholders.
{"title":"An Improved Influence Maximization Method for Online Advertising in Social Internet of Things.","authors":"Reza Molaei, Kheirollah Rahsepar Fard, Asgarali Bouyer","doi":"10.1089/big.2023.0042","DOIUrl":"10.1089/big.2023.0042","url":null,"abstract":"<p><p>Recently, a new subject known as the Social Internet of Things (SIoT) has been presented based on the integration the Internet of Things and social network concepts. SIoT is increasingly popular in modern human living, including applications such as smart transportation, online health care systems, and viral marketing. In advertising based on SIoT, identifying the most effective diffuser nodes to maximize reach is a critical challenge. This article proposes an efficient heuristic algorithm named <i>Influence Maximization of advertisement for Social Internet of Things (IMSoT)</i>, inspired by real-world advertising. The IMSoT algorithm consists of two steps: selecting candidate objects and identifying the final seed set. In the first step, influential candidate objects are selected based on factors, such as degree, local importance value, and weak and sensitive neighbors set. In the second step, effective influence is calculated based on overlapping between candidate objects to identify the appropriate final seed set. The IMSoT algorithm ensures maximum influence and minimum overlap, reducing the spreading caused by the seed set. A unique feature of IMSoT is its focus on preventing duplicate advertising, which reduces extra costs, and considering weak objects to reach the maximum target audience. Experimental evaluations in both real-world and synthetic networks demonstrate that our algorithm outperforms other state-of-the-art algorithms in terms of paying attention to weak objects by 38%-193% and in terms of preventing duplicate advertising (reducing extra cost) by 26%-77%. Additionally, the running time of the IMSoT algorithm is shorter than other state-of-the-art algorithms.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":" ","pages":"173-190"},"PeriodicalIF":2.6,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9922927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Secure Biomedical Document Protection Framework to Ensure Privacy Through Blockchain.
Authors: Ramkumar Jayaraman, Mohammed Alshehri, Manoj Kumar, Ahed Abugabah, Surender Singh Samant, Ahmed A Mohamed
Big Data, pp. 437-451. Published 2023-12-01 (Epub 2023-05-23). DOI: 10.1089/big.2022.0170
Abstract: In the current health care era, biomedical documents play a crucial role and contain much evidence-based documentation associated with many stakeholders' data. Protecting these confidential research documents is difficult and is a significant process in the medical research domain; such biomedical documentation related to health care, along with other community-valued data, is produced and processed by medical professionals. Traditional security mechanisms such as akteonline and the Health Insurance Portability and Accountability Act (HIPAA) are used to protect biomedical documents, as they address the problems of non-repudiation and data integrity in the retrieval and storage of documents. There is thus a need for a comprehensive framework that improves protection in terms of cost and response time for biomedical documents. In this research work, a blockchain-based biomedical document protection framework (BBDPF) is proposed, which includes blockchain-based biomedical data protection (BBDP) and blockchain-based biomedical data retrieval (BBDR) algorithms. The BBDP and BBDR algorithms enforce consistency of the data to prevent data modification and interception of confidential data, with proper data validation. Both algorithms use strong cryptographic mechanisms to withstand post-quantum security risks, ensuring the integrity of biomedical document retrieval and non-denial of data retrieval transactions. For the performance analysis, BBDPF and its smart contracts, written in the Solidity language, were deployed on Ethereum blockchain infrastructure, and request time and searching time were measured as the number of requests gradually increased, to assess data integrity, non-repudiation, and smart-contract support for the proposed hybrid model. A modified prototype with a web-based interface was built to prove the concept and evaluate the proposed framework. The experimental results revealed that the proposed framework provides data integrity, non-repudiation, and support for smart contracts in comparison with Query Notary Service, MedRec, MedShare, and Medlock.
Title: OzNet: A New Deep Learning Approach for Automated Classification of COVID-19 Computed Tomography Scans.
Authors: Oznur Ozaltin, Ozgur Yeniay, Abdulhamit Subasi
Big Data, pp. 420-436. Published 2023-12-01 (Epub 2023-03-16). DOI: 10.1089/big.2022.0042
Abstract: Coronavirus disease 2019 (COVID-19) is spreading rapidly around the world, so the automated classification of computed tomography (CT) scans can alleviate the workload of experts, which increased considerably during the pandemic. Convolutional neural network (CNN) architectures are successful at classifying medical images. In this study, we developed a new deep CNN architecture called OzNet and compared it with the pretrained architectures AlexNet, DenseNet201, GoogleNet, NASNetMobile, ResNet-50, SqueezeNet, and VGG-16. We also compared the classification success of three preprocessing methods against raw CT scans: we classified not only the raw CT scans but also data sets processed with discrete wavelet transform (DWT), intensity adjustment, and grayscale-to-RGB image conversion. The architecture's performance increases when the DWT preprocessing method is used rather than the raw data set. The results are extremely promising for the CNN algorithms using the COVID-19 CT scans processed with the DWT: the proposed DWT-OzNet achieved a classification performance of more than 98.8% on every calculated metric.