Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery: Latest Articles (Page 4)

Machine intelligence in dynamical systems: A state‐of‐art review

IF 7.8 | Zone 2, Computer Science

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery Pub Date : 2022-05-13 DOI: 10.1002/widm.1461

A. Sahoo, S. Chakraverty

Citations: 3

Critical review of bio‐inspired data optimization techniques: An image steganalysis perspective

IF 7.8 | Zone 2, Computer Science

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery Pub Date : 2022-05-03 DOI: 10.1002/widm.1460

Anita Christaline Johnvictor, Austin Joe Amalanathan, Ramya Meghana Pariti Venkata, Nishtha Jethi

Abstract: Image steganalysis involves the discovery of secret information embedded in an image. The common method is blind image steganalysis, which is a two‐class classification problem. Blind steganalysis extracts all possible feature variations in an image due to embedding, selects the most appropriate feature data, and then classifies the image. The dimensionality of the extracted image features is high and demands data reduction to identify the most relevant features and to aid accurate classification of an image. The image is assigned to one of two classes: clean (cover) image or stego image (an image with embedded secret data). Since the classification accuracy depends on selecting the most appropriate features, choosing the best data reduction or data optimization algorithm becomes a prime requisite. Research shows that most statistical optimization techniques converge to local minima and yield lower classification accuracy than bio‐inspired methods. Bio‐inspired optimization methods obtain improved classification accuracy by reducing the high‐dimensional image features. These methods start with an initial population and then optimize it in steps until a global optimal point is reached. Examples of such methods include Ant Lion Optimization (ALO) and Fire Fly Algorithm (FFA); the literature reports around 54 such algorithms. Bio‐inspired optimization has been applied in various fields of design optimization and is novel to image steganalysis. This article analyses the various bio‐inspired optimization techniques and their accuracy in image steganalysis pertaining to the discovery of embedded information in both JPEG and spatial domain steganalysis.
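
The population-based search described in this abstract (start from an initial population, then improve it in steps) can be illustrated with a minimal Python sketch. This is a generic population search over binary feature masks, not ALO, FFA, or any algorithm from the reviewed article. The function names, the update rule, and the toy fitness function are all hypothetical; in a steganalysis setting the fitness would presumably be the validation accuracy of a cover/stego classifier trained on the selected features.

```python
import random

def select_features(fitness, n_features, pop_size=20, steps=50, seed=0):
    """Generic population-based search over binary feature masks (sketch only)."""
    rng = random.Random(seed)
    # Initial population: random subsets of the feature set.
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(steps):
        new_population = []
        for mask in population:
            # Each candidate copies roughly half of its bits from the current best ...
            child = [b if rng.random() < 0.5 else m for m, b in zip(mask, best)]
            # ... and flips one random bit to keep exploring.
            flip = rng.randrange(n_features)
            child[flip] = not child[flip]
            new_population.append(child)
        population = new_population
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
        history.append(fitness(best))
    return best, history

# Hypothetical toy objective: features 0-4 are informative, the rest are noise.
INFORMATIVE = set(range(5))

def toy_fitness(mask):
    chosen = {i for i, keep in enumerate(mask) if keep}
    return len(chosen & INFORMATIVE) - 0.2 * len(chosen - INFORMATIVE)

best_mask, history = select_features(toy_fitness, n_features=30)
print("selected features:", sum(best_mask), "best fitness:", history[-1])
```

Because the best mask found so far is always retained, the recorded fitness never decreases from one step to the next. Nothing in this sketch guarantees that the global optimum is reached.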

Citations: 1

Artificial intelligence for climate change adaptation

IF 7.8 | Zone 2, Computer Science

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery Pub Date : 2022-04-12 DOI: 10.1002/widm.1459

S. Cheong, K. Sankaran, Hamsa Bastani

Citations: 5

A review on data fusion in multimodal learning analytics and educational data mining

IF 7.8 | Zone 2, Computer Science

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery Pub Date : 2022-04-05 DOI: 10.1002/widm.1458

Wilson Chango, J. Lara, Rebeca Cerezo, C. Romero

Abstract: New educational models such as smart learning environments use digital and context‐aware devices to facilitate the learning process. In this new educational scenario, a huge quantity of multimodal student data from a variety of sources can be captured, fused, and analyzed. This offers researchers and educators a unique opportunity to discover new knowledge, to better understand the learning process, and to intervene if necessary. However, data fusion approaches and techniques must be applied correctly in order to combine the various sources of multimodal learning analytics (MLA). These sources or modalities in MLA include audio, video, electrodermal activity data, eye‐tracking, user logs, and click‐stream data, but also learning artifacts and more natural human signals such as gestures, gaze, speech, or writing. This survey introduces data fusion in learning analytics (LA) and educational data mining (EDM) and describes how these data fusion techniques have been applied in smart learning. It shows the current state of the art by reviewing the main publications, the main types of fused educational data, and the data fusion approaches and techniques used in EDM/LA, as well as the main open problems, trends, and challenges in this specific research area.

Citations: 17

A review of bus arrival time prediction using artificial intelligence

IF 7.8 | Zone 2, Computer Science

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery Pub Date : 2022-04-03 DOI: 10.1002/widm.1457

Nisha Singh, K. Kumar

Citations: 5

Gaining insights in datasets in the shade of “garbage in, garbage out” rationale: Feature space distribution fitting

IF 7.8 | Zone 2, Computer Science

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery Pub Date : 2022-03-30 DOI: 10.1002/widm.1456

Gürol Canbek

Abstract: This article emphasizes comprehending the “Garbage In, Garbage Out” (GIGO) rationale and ensuring dataset quality in Machine Learning (ML) applications to achieve high and generalizable performance. An initial step should be added to an ML workflow in which researchers evaluate the insights gained by quantitative analysis of the datasets' sample and feature spaces. This study contributes towards achieving such a goal by suggesting a technique to quantify datasets in terms of feature frequency distribution characteristics. Hence a unique insight is provided into how frequent the features are in the available dataset samples. The technique was demonstrated on 11 benign and malign (malware) Android application datasets belonging to six academic Android mobile malware classification studies. The permissions requested by applications, such as CALL_PHONE, compose a relatively high‐dimensional binary feature space. The results showed that the distributions fit well into two of the four long right‐tail statistical distributions: log‐normal, exponential, power law, and Poisson. Specifically, log‐normal was the most frequently exhibited statistical distribution, except for the two malign datasets that followed an exponential distribution. This study also explores statistical distribution fit/unfit feature analysis that enhances the insights into the feature space. Finally, the study compiles examples of phenomena in the literature exhibiting these statistical distributions that should be considered when interpreting the fitted distributions. In conclusion, conducting well‐formed statistical methods provides a clear understanding of the datasets and of intra‐class and inter‐class differences before proceeding with selecting features and building a classifier model. Feature distribution characteristics should be among the aspects analyzed beforehand.
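
As a rough illustration of what fitting feature frequencies to long right‐tail distributions can look like, the following Python sketch fits two of the four distributions named in the abstract (log‐normal and exponential) by maximum likelihood and compares their log‐likelihoods. The abstract does not describe the article's actual fitting procedure, so the estimators, the comparison criterion, and the synthetic data here are assumptions made purely for illustration.

```python
import math
import random

def fit_exponential(xs):
    """Maximum-likelihood exponential fit; returns (rate, log-likelihood)."""
    n = len(xs)
    total = sum(xs)
    rate = n / total
    return rate, n * math.log(rate) - rate * total

def fit_lognormal(xs):
    """Maximum-likelihood log-normal fit; returns (mu, sigma, log-likelihood)."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    # Log-likelihood evaluated at the maximum-likelihood estimates of mu and sigma.
    loglik = -sum(logs) - 0.5 * n * math.log(2 * math.pi * var) - 0.5 * n
    return mu, math.sqrt(var), loglik

# Synthetic stand-in for feature frequencies (hypothetical data, not from the article).
rng = random.Random(0)
frequencies = [rng.lognormvariate(1.0, 0.5) for _ in range(5000)]

mu, sigma, ll_lognormal = fit_lognormal(frequencies)
_, ll_exponential = fit_exponential(frequencies)
better = "log-normal" if ll_lognormal > ll_exponential else "exponential"
print(f"mu={mu:.3f} sigma={sigma:.3f} better fit: {better}")
```

The two models have different numbers of parameters, so a fairer comparison would penalize complexity (for example with AIC) or use a goodness‐of‐fit test. The raw log‐likelihood is used here only to keep the sketch short.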

Citations: 4

Review and data mining of linguistic studies of English modal verbs

IF 7.8 | Zone 2, Computer Science

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery Pub Date : 2022-03-29 DOI: 10.1002/widm.1455

Jianping Yu, Jilin Fu, Tana Bai, Xueping Xu

Abstract: Modal verbs express modality. Modality is concerned with the status of the proposition that describes an event; it also expresses the opinion and attitude of a speaker toward the proposition of an utterance. Since modalities are directly related to the objective world, the subjective world, and language use, they have been a hot topic for philosophers, logicians, and linguists. Philosophers are concerned with the relations between the objective world and the true/false values of the modality; logicians are interested in the relations among possibility, necessity, and the objective world; and linguists pay attention to the modality category, sense category, function, recognition, and application of modal verbs. In recent years, the linguistic studies of modal verbs have extended from general linguistic studies to computational linguistic studies. Since modal verbs form a complex semantic system and are often indeterminate in sense, they have been a tough issue in linguistic studies and have attracted great attention. Clarifying the status of the previous linguistic studies of modal verbs and revealing the characteristics of those studies will be of great significance for further study. Therefore, this article focuses on reviewing the previous linguistic studies of English modal verbs and on data mining the characteristics of those studies and, based on a summary of the previous studies, gives suggestions for the further study of English modal verbs.

Citations: 0

Subgraph mining in a large graph: A review

IF 7.8 | Zone 2, Computer Science

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery Pub Date : 2022-03-08 DOI: 10.1002/widm.1454

Lam B. Q. Nguyen, I. Zelinka, V. Snás̃el, Loan T. T. Nguyen, Bay Vo

Citations: 8

Machine learning in postgenomic biology and personalized medicine.

IF 7.8 | Zone 2, Computer Science

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery Pub Date : 2022-03-01 DOI: 10.1002/widm.1451

Animesh Ray

Abstract: In recent years Artificial Intelligence in the form of machine learning has been revolutionizing biology, biomedical sciences, and gene-based agricultural technology capabilities. Massive data generated in biological sciences by rapid and deep gene sequencing and protein or other molecular structure determination, on the one hand, requires data analysis capabilities using machine learning that are distinctly different from classical statistical methods; on the other, these large datasets are enabling the adoption of novel data-intensive machine learning algorithms for the solution of biological problems that until recently had relied on mechanistic model-based approaches that are computationally expensive. This review provides a bird's eye view of the applications of machine learning in post-genomic biology. An attempt is also made to indicate as far as possible the areas of research that are poised to make further impacts in these areas, including the importance of explainable artificial intelligence (XAI) in human health. Further contributions of machine learning are expected to transform medicine, public health, and agricultural technology, as well as to provide invaluable gene-based guidance for the management of complex environments in this age of global warming.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9371441/pdf/nihms-1770264.pdf

Citations: 0

Review of automated time series forecasting pipelines

IF 7.8 | Zone 2, Computer Science

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery Pub Date : 2022-02-03 DOI: 10.1002/widm.1475

Stefan Meisenbacher, Marian Turowski, Kaleb Phipps, Martin Ratz, D. Muller, V. Hagenmeyer, R. Mikut

Abstract: Time series forecasting is fundamental for various use cases in different domains such as energy systems and economics. Creating a forecasting model for a specific use case requires an iterative and complex design process. The typical design process includes five sections: (1) data preprocessing, (2) feature engineering, (3) hyperparameter optimization, (4) forecasting method selection, and (5) forecast ensembling, which are commonly organized in a pipeline structure. One promising approach to handle the ever‐growing demand for time series forecasts is automating this design process. The article thus reviews existing literature on automated time series forecasting pipelines and analyzes how the design process of forecasting models is currently automated. In doing so, we consider both automated machine learning (AutoML) and automated statistical forecasting methods in a single forecasting pipeline. For this purpose, we first present and compare the identified automation methods for each pipeline section. Second, we analyze these automation methods regarding their interaction, combination, and coverage of the five pipeline sections. For both, we discuss the reviewed literature that contributes toward automating the design process, identify problems, give recommendations, and suggest future research. This review reveals that the majority of the reviewed literature only covers two or three of the five pipeline sections. We conclude that future research has to holistically consider the automation of the forecasting pipeline to enable the large‐scale application of time series forecasting.
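
The five pipeline sections listed in the abstract can be made concrete with a deliberately small Python sketch. Everything here is an assumption chosen for illustration: the two candidate methods (naive and moving average), the grid of window lengths, the validation scheme, and all function names are hypothetical and are not taken from the reviewed literature. The sketch also assumes the first observation of the series is not missing.

```python
from statistics import mean

def preprocess(series):
    """(1) Data preprocessing: forward-fill missing values (None)."""
    filled, last = [], None
    for value in series:
        last = value if value is not None else last
        filled.append(last)
    return filled

def lag_features(history, n_lags):
    """(2) Feature engineering: the last n_lags observations."""
    return history[-n_lags:]

def moving_average(history, window):
    return mean(lag_features(history, window))

def naive(history, _window=None):
    return history[-1]

def validation_mae(method, series, window, n_val):
    """One-step-ahead mean absolute error over the last n_val points."""
    start = len(series) - n_val
    return mean(abs(method(series[:t], window) - series[t])
                for t in range(start, len(series)))

def auto_forecast(raw, windows=(2, 3, 5), n_val=5):
    series = preprocess(raw)
    # (3) Hyperparameter optimization: grid search over the window length.
    best_window = min(
        windows, key=lambda w: validation_mae(moving_average, series, w, n_val))
    candidates = {"naive": (naive, None),
                  "moving_average": (moving_average, best_window)}
    # (4) Forecasting method selection: lowest validation error wins.
    scores = {name: validation_mae(method, series, window, n_val)
              for name, (method, window) in candidates.items()}
    selected = min(scores, key=scores.get)
    # (5) Forecast ensembling: equal-weight average of all candidates.
    forecasts = {name: method(series, window)
                 for name, (method, window) in candidates.items()}
    return {"window": best_window, "selected": selected,
            "forecast": forecasts[selected], "ensemble": mean(forecasts.values())}

raw_series = [10, 12, None, 13, 15, 14, 16, None, 18, 17, 19, 20]
result = auto_forecast(raw_series)
print(result)
```

A real automated pipeline would search over far richer preprocessing, feature, and model spaces. The point of the sketch is only to show how the five sections chain together in a single function call.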

Citations: 19