{"title":"Using spatial point process models, clustering and space partitioning to reconfigure fire stations layout","authors":"Regina Bispo, Francisca G. Vieira, Clara Yokochi, Filipe J. Marques, Pedro Espadinha-Cruz, Alexandre Penha, António Grilo","doi":"10.1007/s41060-023-00455-z","DOIUrl":"https://doi.org/10.1007/s41060-023-00455-z","url":null,"abstract":"Fire stations (FS) are typically non-uniformly distributed across space, and their service areas are, in general, defined by administrative boundaries. Since the location of FS may considerably influence the readiness and effectiveness of the services provided, national and regional governments need research-based information to adequately plan where to establish firefighting facilities. In this study, we propose a method to reconfigure the fire station layout using spatial point process models, clustering and space partitioning. First, modelling the variation in fire intensity across space through a point process model makes it possible to replicate the process independently by simulation. Subsequently, for each simulation, the k-means algorithm is used to define a siting location, minimizing the total within-cluster distance between the fire occurrences and the new position. This procedure yields a set of locations from which the respective distribution is inferred. Assuming a bivariate normal spatial distribution, we further define confidence siting regions. Ultimately, new FS service areas are defined by Voronoi tessellation. To exemplify the application of the method, we apply it to reconfigure the fire station layout at Aveiro, Portugal.","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135591365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Theoretical and practical data science and analytics: challenges and solutions","authors":"Carson K. Leung, Gabriella Pasi, Li Wang","doi":"10.1007/s41060-023-00465-x","DOIUrl":"https://doi.org/10.1007/s41060-023-00465-x","url":null,"abstract":"Big data have become a core technology for providing innovative solutions in numerical applications and services in many fields. Embedded in these big data is valuable information and knowledge. This calls for data science and analytics, which has emerged as an important paradigm for driving the new economy and domains (e.g., Internet of Things, social and mobile networks, cloud computing), reforming classic disciplines (e.g., telecommunications, biology, health and social science), as well as upgrading core business and economic activity. In this article, we focus on both theoretical and practical data science and analytics. We summarize and highlight some of its challenges and solutions, which are covered in the eight articles in the current Special Issue on \"theoretical and practical data science and analytics.\"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135568940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A deep learning-based approach for identifying unresolved questions on Stack Exchange Q&A communities through graph-based communication modelling","authors":"Hassan Abedi Firouzjaei","doi":"10.1007/s41060-023-00454-0","DOIUrl":"https://doi.org/10.1007/s41060-023-00454-0","url":null,"abstract":"In recent years, online question–answer (Q&A) platforms, such as Stack Exchange (SE), have become increasingly popular for information and knowledge sharing. Despite the vast amount of information available on these platforms, many questions remain unresolved. In this work, we aim to address this issue by proposing a novel approach to identify unresolved questions in SE Q&A communities. Our approach utilises the graph structure of communication formed around a question by users to model the communication network surrounding it. We employ a property graph model and graph neural networks (GNNs), which can effectively capture both the structure of communication and the content of messages exchanged among users. By leveraging the power of graph representation and GNNs, our approach can effectively identify unresolved questions in SE communities. Experimental results on the complete historical data from three distinct Q&A communities demonstrate the superiority of our proposed approach over baseline methods that only consider the content of questions. Finally, our work represents a first but important step towards better understanding the factors that can affect questions becoming and remaining unresolved in SE communities.","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136341742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Duo satellite-based remotely sensed land surface temperature prediction by various methods of machine learning","authors":"Shivam Chauhan, Ajay Singh Jethoo, Ajay Mishra, Vaibhav Varshney","doi":"10.1007/s41060-023-00459-9","DOIUrl":"https://doi.org/10.1007/s41060-023-00459-9","url":null,"abstract":"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136279790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new regression model for count data with applications to health care data","authors":"Muneeb Ahmad Wani, Peer Bilal Ahmad, Bilal Ahmad Para, Na Elah","doi":"10.1007/s41060-023-00453-1","DOIUrl":"https://doi.org/10.1007/s41060-023-00453-1","url":null,"abstract":"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135816989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph construction on complex spatiotemporal data for enhancing graph neural network-based approaches","authors":"Stefan Bloemheuvel, Jurgen van den Hoogen, Martin Atzmueller","doi":"10.1007/s41060-023-00452-2","DOIUrl":"https://doi.org/10.1007/s41060-023-00452-2","url":null,"abstract":"Graph neural networks (GNNs) have proven to be an indispensable approach in modeling complex data, in particular spatiotemporal data, e.g., sensor data given as time series with corresponding spatial information. Although GNNs provide powerful modeling capabilities on this kind of data, they require adequate input data in terms of both the signal and the underlying graph structure. However, the corresponding graphs are typically not automatically available or predefined, so an ad hoc graph representation needs to be constructed, and the construction of this underlying graph structure is often given insufficient attention. Therefore, this paper performs an in-depth analysis of several methods for constructing graphs from a set of sensors attributed with spatial information, i.e., geographical coordinates, or using their respective attached signal data. We apply a diverse set of standard methods for estimating groups and similarities between graph nodes, both location-based and signal-driven, on multiple benchmark datasets for evaluation and assessment. For both areas, we specifically include distance-based, clustering-based, and correlation-based approaches for estimating the relationships between nodes for subsequent graph construction. In addition, we consider two different GNN approaches, i.e., regression and forecasting, in order to enable a broader experimental assessment. Our results indicate the importance of factoring the crucial step of graph construction into GNN-based research on spatiotemporal data. Overall, in our experimentation no single approach for graph construction emerged as a clear winner. However, in our analysis we are able to provide specific indications, based on the obtained results, for a specific class of methods. Collectively, the findings highlight the need for researchers to carefully consider graph construction when employing GNNs in the analysis of spatiotemporal data.","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135816920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A machine learning approach to predict geomechanical properties of rocks from well logs","authors":"None Rohit, Shri Ram Manda, Aditya Raj, Nagababu Andraju","doi":"10.1007/s41060-023-00451-3","DOIUrl":"https://doi.org/10.1007/s41060-023-00451-3","url":null,"abstract":"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136154592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new generalization of the zero-truncated negative binomial distribution by a Lagrange expansion with associated regression model and applications","authors":"Mohanan Monisha, Radhakumari Maya, Muhammed Rasheed Irshad, Christophe Chesneau, Damodaran Santhamani Shibu","doi":"10.1007/s41060-023-00449-x","DOIUrl":"https://doi.org/10.1007/s41060-023-00449-x","url":null,"abstract":"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135307743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic identification of rank correlation between image sequences","authors":"Lior Shamir","doi":"10.1007/s41060-023-00450-4","DOIUrl":"https://doi.org/10.1007/s41060-023-00450-4","url":null,"abstract":"","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"2013 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135436799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Study of violence against women and its characteristics through the application of text mining techniques","authors":"E. M. A. Stephanie, L. G. B. Ruiz, M. A. Vila, M. C. Pegalajar","doi":"10.1007/s41060-023-00448-y","DOIUrl":"https://doi.org/10.1007/s41060-023-00448-y","url":null,"abstract":"The Internet provides a wide variety of information that can be collected and studied, creating a massive data repository. Among the data available on the Internet, we can find articles about Violence against Women (VAW) published in the digital press, which are of great societal interest. In this work, we utilized Web scraping techniques to gather VAW-related news from the Internet. Applying Text Mining techniques, we conducted a study on VAW and its characteristics. Our work comprises an exploratory analysis and the application of Topic Modelling to VAW events to identify latent topics and their semantic structures. We employed classification algorithms on a set of VAW press articles to determine the type of violence they refer to, namely physical, psychological, sexual, or a combination of them. We proposed two methodologies to label the data: the first one is based on dictionaries of VAW types, while the second approach extends the former by using the predominant violence to identify other associated types. Furthermore, we implemented two feature selection techniques: TF-IDF and $\\chi^{2}$. Then, we applied Support Vector Machine, Decision Tree, Bayesian Networks, XGBoost Classifier, Random Forest, and Artificial Neural Networks. The results obtained showed that the classifiers achieved better performance when using $\\chi^{2}$. The XGBoost Classifier demonstrated the best performance, followed by Random Forest.","PeriodicalId":45667,"journal":{"name":"International Journal of Data Science and Analytics","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134912231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}