Frontiers in Big Data. Pub Date: 2023-10-25. eCollection Date: 2023-01-01. DOI: 10.3389/fdata.2023.1227156
Ilaria Bartolini, Marco Patella
{"title":"A stream processing abstraction framework.","authors":"Ilaria Bartolini, Marco Patella","doi":"10.3389/fdata.2023.1227156","DOIUrl":"https://doi.org/10.3389/fdata.2023.1227156","url":null,"abstract":"<p><p>Real-time analysis of large multimedia streams is nowadays made efficient by several Big Data streaming platforms, such as Apache Flink and Samza. However, such platforms are difficult to use because the facilities they offer are often too raw to be effectively exploited by analysts. We describe the evolution of RAM3S, a software infrastructure for the integration of Big Data stream processing platforms, into SPAF, an abstraction framework that provides programmers with a simple but powerful API to ease the development of stream processing applications. By using SPAF, the programmer can easily implement complex real-time analyses of massive streams on top of a distributed computing infrastructure able to manage the volume and velocity of Big Data streams, thus effectively transforming data into value.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"6 ","pages":"1227156"},"PeriodicalIF":3.1,"publicationDate":"2023-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10634501/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89720556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frontiers in Big Data. Pub Date: 2023-10-24. eCollection Date: 2023-01-01. DOI: 10.3389/fdata.2023.1319729
{"title":"Erratum: Evaluation of methods for assigning causes of death from verbal autopsies in India.","authors":"","doi":"10.3389/fdata.2023.1319729","DOIUrl":"https://doi.org/10.3389/fdata.2023.1319729","url":null,"abstract":"<p><p>[This corrects the article DOI: 10.3389/fdata.2023.1197471.].</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"6 ","pages":"1319729"},"PeriodicalIF":3.1,"publicationDate":"2023-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10628717/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71523316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frontiers in Big Data. Pub Date: 2023-10-23. eCollection Date: 2023-01-01. DOI: 10.3389/fdata.2023.1239017
Nataliya Shakhovska, Roman Kaminskyi, Bohdan Khudoba
{"title":"Experimental study and clustering of operating staff of search systems in the sense of stress resistance.","authors":"Nataliya Shakhovska, Roman Kaminskyi, Bohdan Khudoba","doi":"10.3389/fdata.2023.1239017","DOIUrl":"10.3389/fdata.2023.1239017","url":null,"abstract":"<p><strong>Introduction: </strong>The main goal of this study is to develop a methodology for the organization of experimental selection of operator personnel based on the analysis of their behavior under the influence of micro-stresses.</p><p><strong>Methods: </strong>A human-machine interface model has been developed, which considers the change in the functional state of the human operator. The presented concept of the difficulty of detecting the object of attention contributed to developing a particular sequence of ordinary test images with stressor images included in it and presented models of the flow of presenting test images to the recipient.</p><p><strong>Results: </strong>With the help of descriptive statistics, the parameters of individual box-plot diagrams were determined, and the recipient group was clustered.</p><p><strong>Discussion: </strong>Overall, the proposed approach based on the example of the conducted grouping makes it possible to ensure the objectivity and efficiency of the professional selection of applicants for operator specialties.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"6 ","pages":"1239017"},"PeriodicalIF":3.1,"publicationDate":"2023-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10626476/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71488837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frontiers in Big Data. Pub Date: 2023-10-19. eCollection Date: 2023-01-01. DOI: 10.3389/fdata.2023.1241899
Turker Berk Donmez, Mohammed Mansour, Mustafa Kutlu, Chris Freeman, Shekhar Mahmud
{"title":"Anemia detection through non-invasive analysis of lip mucosa images.","authors":"Turker Berk Donmez, Mohammed Mansour, Mustafa Kutlu, Chris Freeman, Shekhar Mahmud","doi":"10.3389/fdata.2023.1241899","DOIUrl":"10.3389/fdata.2023.1241899","url":null,"abstract":"<p><p>This paper aims to detect anemia using images of the lip mucosa, where the skin tissue is thin, and to confirm the feasibility of detecting anemia noninvasively and in the home environment using machine learning (ML). Data were collected from 138 patients, including 100 women and 38 men. Six ML algorithms widely used in medical applications, namely artificial neural network (ANN), decision tree (DT), k-nearest neighbors (KNN), logistic regression (LR), naive Bayes (NB), and support vector machine (SVM), were used to classify the collected data. Two different data types were obtained from participants' images (RGB red color values and HSV saturation values) as features, with age, sex, and hemoglobin levels utilized to perform classification. The ML algorithms were used to analyze and classify images of the lip mucosa quickly and accurately, potentially increasing the efficiency of anemia screening programs. The accuracy, precision, recall, and F-measure were evaluated to assess how well the ML models performed in predicting anemia. The results showed that NB achieved the highest accuracy (96%) among the ML models used. DT, KNN, and ANN each reported an accuracy of 93%, while LR and SVM had accuracies of 79% and 75%, respectively. This research suggests that employing ML approaches to identify anemia will help classify the diagnosis, which will then help to create efficient preventive measures. Compared to blood tests, this noninvasive procedure is more practical and accessible to patients. 
Furthermore, ML algorithms may be created and trained to assess lip mucosa photos at a minimal cost, making it an affordable screening method in regions with a shortage of healthcare resources.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"6 ","pages":"1241899"},"PeriodicalIF":3.1,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10620602/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71488836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frontiers in Big Data. Pub Date: 2023-10-19. eCollection Date: 2023-01-01. DOI: 10.3389/fdata.2023.1271639
Luca Clissa, Mario Lassnig, Lorenzo Rinaldi
{"title":"How big is Big Data? A comprehensive survey of data production, storage, and streaming in science and industry.","authors":"Luca Clissa, Mario Lassnig, Lorenzo Rinaldi","doi":"10.3389/fdata.2023.1271639","DOIUrl":"https://doi.org/10.3389/fdata.2023.1271639","url":null,"abstract":"<p><p>The contemporary surge in data production is fueled by diverse factors, with contributions from numerous stakeholders across various sectors. Comparing the volumes at play among different big data entities is challenging due to the scarcity of publicly available data. This survey aims to offer a comprehensive perspective on the orders of magnitude involved in yearly data generation by some public and private leading organizations, using an array of online sources for estimation. These estimates are based on meaningful, individual data production metrics and plausible per-unit sizes. The primary objective is to offer insights into the comparative scales of major big data players, their sources, and data production flows, rather than striving for precise measurements or incorporating the latest updates. The results are succinctly conveyed through a visual representation of the relative data generation volumes across these entities.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"6 ","pages":"1271639"},"PeriodicalIF":3.1,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10620515/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71488775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frontiers in Big Data. Pub Date: 2023-10-17. eCollection Date: 2023-01-01. DOI: 10.3389/fdata.2023.1258051
Namita Gupta
{"title":"Editorial: Smart cities challenges, technologies and trends.","authors":"Namita Gupta","doi":"10.3389/fdata.2023.1258051","DOIUrl":"https://doi.org/10.3389/fdata.2023.1258051","url":null,"abstract":"","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"6 ","pages":"1258051"},"PeriodicalIF":3.1,"publicationDate":"2023-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10616893/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71432285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frontiers in Big Data. Pub Date: 2023-10-16. eCollection Date: 2023-01-01. DOI: 10.3389/fdata.2023.1301942
Allison McCarn Deiana, Nhan Tran, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S Neubauer, Jennifer Ngadiuba, Seda Ogrenci-Memik, Maurizio Pierini, Thea Aarrestad, Steffen Bähr, Jürgen Becker, Anne-Sophie Berthold, Richard J Bonventre, Tomás E Müller Bravo, Markus Diefenthaler, Zhen Dong, Nick Fritzsche, Amir Gholami, Ekaterina Govorkova, Dongning Guo, Kyle J Hazelwood, Christian Herwig, Babar Khan, Sehoon Kim, Thomas Klijnsma, Yaling Liu, Kin Ho Lo, Tri Nguyen, Gianantonio Pezzullo, Seyedramin Rasoulinezhad, Ryan A Rivera, Kate Scholberg, Justin Selig, Sougata Sen, Dmitri Strukov, William Tang, Savannah Thais, Kai Lukas Unger, Ricardo Vilalta, Belina von Krosigk, Shen Wang, Thomas K Warburton
{"title":"Corrigendum: Applications and techniques for fast machine learning in science.","authors":"Allison McCarn Deiana, Nhan Tran, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S Neubauer, Jennifer Ngadiuba, Seda Ogrenci-Memik, Maurizio Pierini, Thea Aarrestad, Steffen Bähr, Jürgen Becker, Anne-Sophie Berthold, Richard J Bonventre, Tomás E Müller Bravo, Markus Diefenthaler, Zhen Dong, Nick Fritzsche, Amir Gholami, Ekaterina Govorkova, Dongning Guo, Kyle J Hazelwood, Christian Herwig, Babar Khan, Sehoon Kim, Thomas Klijnsma, Yaling Liu, Kin Ho Lo, Tri Nguyen, Gianantonio Pezzullo, Seyedramin Rasoulinezhad, Ryan A Rivera, Kate Scholberg, Justin Selig, Sougata Sen, Dmitri Strukov, William Tang, Savannah Thais, Kai Lukas Unger, Ricardo Vilalta, Belina von Krosigk, Shen Wang, Thomas K Warburton","doi":"10.3389/fdata.2023.1301942","DOIUrl":"https://doi.org/10.3389/fdata.2023.1301942","url":null,"abstract":"<p><p>[This corrects the article DOI: 10.3389/fdata.2022.787421.].</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"6 ","pages":"1301942"},"PeriodicalIF":3.1,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10614289/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71432284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frontiers in Big Data. Pub Date: 2023-10-13. eCollection Date: 2023-01-01. DOI: 10.3389/fdata.2023.1303367
Lin-Ching Chang, Anastasia Angelopoulou
{"title":"Editorial: Women in AI medicine and public health 2022.","authors":"Lin-Ching Chang, Anastasia Angelopoulou","doi":"10.3389/fdata.2023.1303367","DOIUrl":"https://doi.org/10.3389/fdata.2023.1303367","url":null,"abstract":"","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"6 ","pages":"1303367"},"PeriodicalIF":3.1,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10614155/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71429080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frontiers in Big Data. Pub Date: 2023-10-12. eCollection Date: 2023-01-01. DOI: 10.3389/fdata.2023.1249997
Peter Müllner, Elisabeth Lex, Markus Schedl, Dominik Kowald
{"title":"Differential privacy in collaborative filtering recommender systems: a review.","authors":"Peter Müllner, Elisabeth Lex, Markus Schedl, Dominik Kowald","doi":"10.3389/fdata.2023.1249997","DOIUrl":"10.3389/fdata.2023.1249997","url":null,"abstract":"<p><p>State-of-the-art recommender systems produce high-quality recommendations to support users in finding relevant content. However, by using users' data to generate recommendations, recommender systems threaten users' privacy. To alleviate this threat, differential privacy is often used to protect users' data by adding random noise. This, however, leads to a substantial drop in recommendation quality. Therefore, several approaches aim to improve this trade-off between accuracy and user privacy. In this work, we first give an overview of threats to user privacy in recommender systems, followed by a brief introduction to the differential privacy framework that can protect users' privacy. Subsequently, we review recommendation approaches that apply differential privacy, and we highlight research that improves the trade-off between recommendation quality and user privacy. Finally, we discuss open issues, e.g., considering the relation between privacy and fairness, and the users' different needs for privacy. 
With this review, we hope to provide other researchers with an overview of the ways in which differential privacy has been applied to state-of-the-art collaborative filtering recommender systems.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"6 ","pages":"1249997"},"PeriodicalIF":2.4,"publicationDate":"2023-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10601453/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71415185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frontiers in Big Data. Pub Date: 2023-10-11. eCollection Date: 2023-01-01. DOI: 10.3389/fdata.2023.1214029
Iustina Ivanova, Mike Wald
{"title":"Climbing crags recommender system in Arco, Italy: a comparative study.","authors":"Iustina Ivanova, Mike Wald","doi":"10.3389/fdata.2023.1214029","DOIUrl":"https://doi.org/10.3389/fdata.2023.1214029","url":null,"abstract":"<p><p>Outdoor sport climbing is popular in Northern Italy due to its large number of rock climbing sites (such as crags). New climbing crags appear yearly, creating an information overload problem for tourists who plan their sport climbing vacation. Recommender systems have partly addressed this issue by suggesting climbing crags according to the most visited places or the number of suitable climbing routes. Unfortunately, these methods do not consider contextual information. In sport climbing, however, as in other outdoor activities, the possibility of visiting certain places depends on several contextual factors, for instance, a suitable season (winter/summer), parking space availability if traveling by car, or the possibility of climbing with children if traveling with children. To address this limitation, we collected and analyzed the crag visits in Arco (Italy) from an online guidebook. We found that climbing contextual information, similar to users' content preferences, can be modeled by a correlation between recorded visits and crag features. Based on that, we developed and evaluated a novel context-aware climbing crag recommender system, Visit & Climb, which consists of three stages: (1) contextual information and content tastes are learned automatically from the users' logs by computing the correlation between users' visits and crags' features; (2) those learned tastes are further made adjustable in a preference elicitation web interface; (3) the user receives recommendations on the map according to the number of visits made by a climber with similar learned tastes. 
To measure the quality of this system, we performed an offline evaluation (where we calculated Mean Average Precision, Recall, and Normalized Discounted Cumulative Gain for top-N), a formative study, and an online evaluation (in a within-subject design with experienced outdoor climbers, <i>N</i> = 40, who tried three similar systems including Visit & Climb). Offline tests showed that the proposed system suggests crags to climbers as accurately as other classical models for top-N recommendations. Meanwhile, online tests indicated that the system provides a significantly higher level of information sufficiency than other systems in this domain. The overall results demonstrated that the developed system provides recommendations according to the users' requirements, and that incorporating contextual information and crag characteristics into the climbing recommender system leads to increased information sufficiency caused by transparency, which improves satisfaction and use intention.</p>","PeriodicalId":52859,"journal":{"name":"Frontiers in Big Data","volume":"6 ","pages":"1214029"},"PeriodicalIF":3.1,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10598720/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"54232132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}