Advances in Data Science and Adaptive Analysis: Latest Articles (Page 5)

Data-Mining Homogeneous Subgroups in Multiple Regression When Heteroscedasticity, Multicollinearity, and Missing Variables Confound Predictor Effects

IF 0.6

Advances in Data Science and Adaptive Analysis Pub Date : 2020-09-05 DOI: 10.1142/s2424922x20410041

R. Francoeur

Abstract: Multiple regression does not reliably recover predictor slopes within homogeneous subgroups of heterogeneous samples. In contrast to Monte Carlo analysis, which assigns the variation that the first-specified predictor shares with the remaining predictors entirely to that predictor, multiple regression assigns this shared variation to no predictor and sequesters it in the residual term. This unassigned, confounding variation may correlate with specified predictors, lead to heteroscedasticity, and distort multicollinearity. I develop and test an iterative, sequential algorithm that estimates a two-part series of weighted least-squares (WLS) multiple regressions to recover the Monte Carlo predictor slopes in three homogeneous subgroups (each generated with 500 observations) of a heterogeneous sample [Formula: see text]. Each variable has a different nonnormal distribution. The algorithm mines each subgroup and then adjusts bias within it from (1) heteroscedasticity related to one, some, or all specified predictors and (2) "nonessential" multicollinearity. It recovers all three specified predictor slopes across the three subgroups in two scenarios, one of which is also influenced by two unspecified predictors. The algorithm extends adaptive analysis to discover and appraise patterns in field research and machine learning when predictors are inter-correlated, and even unspecified, in order to reveal unbiased outcome clusters in heterogeneous and homogeneous samples with nonnormal outcome and predictors.

Pages: 2041004:1-2041004:59
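The building block named in the abstract above is a weighted least-squares multiple regression. The following is a minimal sketch of a single WLS fit only, not the author's iterative two-part algorithm. The function names `solve` and `wls`, the toy data, and the inverse-square weights are assumptions chosen for illustration.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def wls(X, y, w):
    """Weighted least squares: minimise sum_i w_i * (y_i - b0 - x_i . b)^2.

    Returns [intercept, slope_1, ..., slope_p] from the normal equations
    (X' W X) beta = X' W y.
    """
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    p = len(rows[0])
    xtwx = [[sum(wi * r[j] * r[k] for wi, r in zip(w, rows)) for k in range(p)]
            for j in range(p)]
    xtwy = [sum(wi * r[j] * yi for wi, r, yi in zip(w, rows, y)) for j in range(p)]
    return solve(xtwx, xtwy)


# Toy data (hypothetical): y depends exactly on two predictors.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 7]]
y = [1 + 2 * x1 - 0.5 * x2 for x1, x2 in X]
w = [1 / (x1 ** 2) for x1, _ in X]  # assumed weights: error variance grows with x1^2
coef = wls(X, y, w)  # intercept, slope for x1, slope for x2
```

Weights are usually taken as the inverse of the estimated error variance, so observations with noisier outcomes pull less on the fitted slopes. How the weights are estimated across iterations is specific to the paper and is not reproduced here.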

Citations: 0

Design and Implementation of a Novel Hybrid Rental Apartment Recommender System

IF 0.6

Advances in Data Science and Adaptive Analysis Pub Date : 2020-08-13 DOI: 10.1142/s2424922x2041003x

A. A. Neloy, S. Alam, R. A. Bindu

Abstract: Recommender Systems (RSs) have become an essential part of most e-commerce sites. Although several studies have been conducted on RSs, a hybrid recommender system for real estate search engines that finds appropriate rental apartments while taking users' preferences into account is still lacking. To address this problem, this paper proposes a hybrid recommender system built from two of the most popular recommendation approaches: Collaborative Filtering (CF) and Content-Based Recommender (CBR). CF-based methods use the ratings that users give to items as the sole source of information for learning to make a recommendation. However, these ratings are often very sparse in applications such as a search engine, which degrades the accuracy and performance of CF-based methods. To reduce this sparsity problem in the CF method, the Cosine Similarity Score (CSS) between the user and the predicted apartment, based on their Feature Vectors (FV) from the CBR module, is used. The hybrid recommender combines an improved and optimized Singular Value Decomposition (SVD) with Bias-Matrix Factorization (MF) from the CF model and the CSS with FV from the CBR module. The proposed recommender was evaluated using statistical cross-validation consisting of Leave-One-Out Validation (LOOCV). Experimental results show that it significantly outperformed a benchmark random recommender in terms of precision and recall. In addition, a graphical analysis of the relationship between accuracy and error minimization is presented to provide further evidence for the potential of this hybrid recommender system in this area.

Pages: 2041003:1-2041003:17
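The two ingredients named in the abstract above, a cosine similarity score between feature vectors and a biased matrix-factorization prediction, can be sketched as follows. This is an illustration under assumptions: the function names, the feature vectors, and the numeric values are hypothetical, and the way the paper combines the two scores is not shown because the abstract does not specify it.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def biased_mf_predict(mu, b_user, b_item, p_user, q_item):
    """Biased matrix factorization: global mean + user bias + item bias + p_u . q_i."""
    return mu + b_user + b_item + sum(a * b for a, b in zip(p_user, q_item))


# Hypothetical feature vectors: [bedrooms, has_parking, near_transit]
user_pref = [2.0, 1.0, 1.0]
apartment = [2.0, 0.0, 1.0]
css = cosine_similarity(user_pref, apartment)

# Hypothetical learned parameters for one user-apartment pair.
rating = biased_mf_predict(3.5, 0.2, -0.1, [1.0, 0.5], [0.4, 0.2])
```

The content-based score needs no ratings at all, which is why it can help when the rating matrix is too sparse for the collaborative model to be reliable.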

Citations: 0

Optimizing Data Transmission from IoT Devices Through Weighted Online Data Changing Detectors

IF 0.6

Advances in Data Science and Adaptive Analysis Pub Date : 2020-08-13 DOI: 10.1142/s2424922x20410016

M. Diván, M. Reynoso

Abstract: Real-time data analysis requires an integrated approach to track the last known state of the variables of a concept under monitoring. Internet-of-Things (IoT) devices provide alternatives for distributed data collection strategies. However, the autonomy of IoT devices is one of the main challenges in implementing a collection strategy, because battery autonomy is directly affected by the energy consumed by data transmissions. The Data Stream Processing Strategy (DSPS) is an architecture oriented to the implementation of measurement projects based on a measurement and evaluation framework. Its online processing is guided by the measurement metadata reported by IoT devices associated with a component named Measurement Adapter (MA). This paper presents a new data buffer organization, based on measurement metadata and articulated with online data filtering, to optimize data transmissions from the MA. As contributions, a weighted data change detection approach is incorporated and a new local buffer based on logical windows is proposed for the MA. An articulation among the data buffer, a temporal barrier, and data change detectors is also introduced. The proposal was implemented and released in the pabmmCommons library. A discrete simulation on the library is described to provide initial applicability patterns. The data buffer consumed 568 Kb for monitoring 100 simultaneous metrics. The online estimation of the mean and variance based on Statistical Process Control consumed 238 ns. As a limitation, other scenarios need to be addressed before the results can be generalized. As future work, new alternatives for filtering noise online will be addressed.

Pages: 2041001:1-2041001:33
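The abstract above mentions online estimation of the mean and variance and a change detector that decides when data is worth transmitting. A minimal sketch of that idea, assuming Welford's online algorithm and a k-sigma control limit, might look like this. The class `OnlineStats`, the function `has_changed`, and the threshold `k=3.0` are hypothetical and are not taken from the pabmmCommons library.

```python
import math


class OnlineStats:
    """Running mean and sample variance using Welford's online algorithm."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def has_changed(stats, x, k=3.0):
    """True when x lies more than k standard deviations from the running mean.

    With fewer than two observations there is no spread estimate yet,
    so the value is treated as a change and would be transmitted.
    """
    if stats.n < 2:
        return True
    return abs(x - stats.mean) > k * math.sqrt(stats.variance)


stats = OnlineStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)

send_small = has_changed(stats, 5.5)   # close to the running mean
send_large = has_changed(stats, 20.0)  # far outside the control limits
```

Each update touches only three numbers, so the estimator needs constant memory per metric, which suits a buffer running on a battery-powered device.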

Citations: 0

IDC Breast Cancer Detection Using Deep Learning Schemes

IF 0.6

Advances in Data Science and Adaptive Analysis Pub Date : 2020-08-13 DOI: 10.1142/s2424922x20410028

K. Kumar, Umair Saeed, Athaul Rai, Noman Islam, G. Shaikh, A. Qayoom

Abstract: In recent years, deep learning (DL) architectures have been employed in many areas, such as object detection, face recognition, natural language processing, medical image analysis, and other related applications. In these applications, DL has achieved remarkable results that match the performance of human experts. This paper presents a novel convolutional neural network (CNN)-based approach for detecting breast cancer in invasive ductal carcinoma (IDC) tissue regions using whole slide images (WSI). Breast cancer is a leading cause of death among women, and finding malignant regions in WSI remains a challenging task for pathologists. In this research, we implemented different CNN models, including VGG16, VGG19, Xception, Inception V3, MobileNetV2, ResNet50, and DenseNet. The experiments were performed on a standard WSI data set that includes 163 patients with IDC. For performance evaluation, the same data set was divided into 113 and 49 images for training and testing, respectively. Testing was carried out separately for each model, and the results showed that our proposed CNN model achieved 83% accuracy, which is better than the other models.

Pages: 2041002:1-2041002:19

Citations: 5

Index

IF 0.6

Advances in Data Science and Adaptive Analysis Pub Date : 2020-01-15 DOI: 10.1002/9781119695110.index

Citations: 0

Other titles from iSTE in Innovation, Entrepreneurship and Management

IF 0.6

Advances in Data Science and Adaptive Analysis Pub Date : 2020-01-15 DOI: 10.1002/9781119695110.oth

Citations: 0

Bayesian Kernel Regression for Noisy Inputs Based on Nadaraya-Watson Estimator Constructed from Noiseless Training Data

IF 0.6

Advances in Data Science and Adaptive Analysis Pub Date : 2020-01-01 DOI: 10.1142/s2424922x20500047

Ryo Hanafusa, T. Okadome

Abstract: In regression for noisy inputs, noise is typically removed from a given noisy input if possible, and the resulting noise-free input is then passed to the regression function. In some cases, however, there is no time or method available for removing the noise. The regression method proposed in this paper determines a regression function for noisy inputs using the estimated posterior of their noise-free constituents, together with a nonparametric estimator for noiseless explanatory values that is constructed from noiseless training data. In addition, a probabilistic generative model is presented for estimating the noise distribution. This makes it possible to determine the noise distribution parametrically from a single noisy input, using as a prior the distribution of the noise-free constituent of the noisy input estimated from the training data set. Experiments conducted using artificial and real data sets show that the proposed method suppresses overfitting of the regression function for noisy inputs and that the root mean squared errors (RMSEs) of its predictions are smaller than those of an existing method.

Pages: 2050004:1-2050004:17
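The nonparametric estimator named in the title above is the Nadaraya-Watson estimator, which predicts a kernel-weighted average of the training outputs. The sketch below shows only that plain estimator with a Gaussian kernel, not the paper's Bayesian treatment of noisy inputs. The function name, the bandwidth `h`, and the toy data are assumptions.

```python
import math


def nadaraya_watson(x0, xs, ys, h):
    """Kernel regression estimate at x0: sum_i K((x0 - x_i)/h) * y_i / sum_i K((x0 - x_i)/h).

    Uses a Gaussian kernel K(u) = exp(-u^2 / 2); the normalising constant
    cancels between numerator and denominator.
    """
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: x0 is too far from every training point for this h.
        raise ValueError("query point is outside the effective kernel support")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical noiseless training data.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.0, 1.0, 4.0, 9.0, 16.0]

midpoint = nadaraya_watson(1.0, [0.0, 2.0], [1.0, 3.0], h=1.0)  # symmetric neighbours
smooth = nadaraya_watson(2.0, train_x, train_y, h=0.5)
```

A smaller bandwidth `h` makes the estimate follow the nearest training points more closely, while a larger one averages over more of them.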

Citations: 3

Hybrid Machine Learning and Geographic Information Systems Approach - A Case for Grade Crossing Crash Data Analysis

IF 0.6

Advances in Data Science and Adaptive Analysis Pub Date : 2020-01-01 DOI: 10.1142/s2424922x20500035

A. Lasisi, Pengyu Li, Jian Chen

Abstract: Highway-rail grade crossing (HRGC) accidents continue to be a major source of transportation casualties in the United States. This can be attributed, among other reasons, to increased road and rail operations and to a lack of adequate safety programs based on comprehensive analysis of HRGC accidents. The focus of this study is to predict HRGC accidents in a given rail network based on a machine learning analysis of a similar network with cognate attributes. This study improves on past studies that either attempt to predict accidents at a given HRGC or spatially analyze HRGC accidents for a particular rail line. A case for a hybrid machine learning and geographic information systems (GIS) approach is presented for a large rail network. The study involves the collection and wrangling of relevant data from various sources, exploratory analysis, and supervised machine learning (classification and regression) of HRGC data from 2008 to 2017 in California. The models developed from this analysis were used to make binary predictions [98.9% accuracy and 0.9838 Receiver Operating Characteristic (ROC) score] and quantitative estimations of HRGC casualties in a similar network over the next 10 years. The results are presented spatially in GIS, and this novel hybrid application of machine learning and GIS in HRGC accident analysis will help stakeholders proactively address casualties by tackling the major accident causes identified in this study. The paper concludes with a Systems-Action-Management (SAM) approach based on text analysis of HRGC accident risk reports from the Federal Railroad Administration.

Pages: 2050003:1-2050003:30
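The two metrics reported for the binary predictions above, accuracy and the ROC score, can be computed as in the sketch below. It assumes the ROC score means the area under the ROC curve, computed here from pairwise comparisons of positive and negative scores. The labels and scores are invented toy values, not data from the study.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(labels, predictions) if t == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for t, s in zip(labels, scores) if t == 1]
    neg = [s for t, s in zip(labels, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = accident occurred, 0 = no accident.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed decision threshold of 0.5

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, y_score)
```

Accuracy depends on the chosen threshold, whereas the area under the ROC curve summarizes ranking quality across all thresholds, which matters when accidents are rare relative to non-accidents.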

Citations: 1

Weak and Strong Compatibility in Data Fitting Problems Under Interval Uncertainty

IF 0.6

Advances in Data Science and Adaptive Analysis Pub Date : 2020-01-01 DOI: 10.1142/s2424922x20500023

S. P. Shary

Citations: 5

Condition Monitoring of Equipment in Oil Wells using Deep Learning

IF 0.6

Advances in Data Science and Adaptive Analysis Pub Date : 2020-01-01 DOI: 10.1142/s2424922x20500011

Y. Imamverdiyev, F. Abdullayeva

Citations: 3