Annals of Data Science: Latest Articles

Real Estate Market Prediction Using Deep Learning Models
Annals of Data Science Pub Date : 2024-06-04 DOI: 10.1007/s40745-024-00543-2
Ramchandra Rimal, Binod Rimal, Hum Nath Bhandari, Nawa Raj Pokhrel, Keshab R. Dahal
Abstract: Real estate significantly contributes to the broader stock market and garners substantial attention at every level, from individual households to the overall economy of a country. Predicting real estate trends holds great importance for investors, policymakers, and stakeholders to make informed decisions. However, accurate forecasting remains challenging due to its complex, volatile, and nonlinear behavior. This study develops a unified computational framework for implementing state-of-the-art deep learning model architectures, namely the long short-term memory (LSTM), the gated recurrent unit (GRU), the convolutional neural network (CNN), their variants, and hybridizations, to predict the next day’s closing price of the real estate index S&P500-60. We incorporate diverse data sources by integrating real estate-specific indicators on top of fundamental data, macroeconomic factors, and technical indicators, capturing multifaceted features. Several models with varying degrees of complexity are constructed using different architectures and configurations. Model performance is evaluated using standard regression metrics, and statistical analysis is employed for model selection and validation to ensure robustness. The experimental results illustrate that the base GRU model, followed by the bidirectional GRU model, offers a superior fit with high accuracy in predicting the closing price of the index. We additionally tested the constructed models on the Vanguard Real Estate Index Fund ETF and the Dow Jones U.S. Real Estate Index for robustness and obtained consistent outcomes. The proposed framework can easily be generalized to model sequential data in various other domains.
Volume 12, Issue 4, pp. 1113-1156
Citations: 0
Analysis of the HIV/AIDS Data Using Joint Modeling of Longitudinal (k,l)-Inflated Count and Time to Event Data in Clinical Trials
Annals of Data Science Pub Date : 2024-05-30 DOI: 10.1007/s40745-024-00532-5
Mojtaba Zeinali Najafabadi, Ehsan Bahrami Samani
Abstract: Generalized linear mixed effect models (GLMEMs) are widely applied for the analysis of correlated non-Gaussian data such as those found in longitudinal studies. On the other hand, the Cox (proportional hazards, PHs) and the accelerated failure time (AFT) regression models are two well-known approaches in survival analysis for modeling time to event (TTE) data. In this article, we develop joint modeling of longitudinal count (LC) and TTE data and consider extensions with fixed effects and parametric random effects in our proposed joint models. The LC response is inflated at two points k and l (k < l), and we use some members of the (k, l)-inflated power series distribution (PSD) as the distribution of this response. Also, for modeling the TTE process, the PHs model of Cox and the AFT model, based on a flexible hazard function, are separately proposed. One of the goals of the present paper is to evaluate and compare the performance of joint models of (k, l)-inflated LC and TTE data under the two approaches via extensive simulations. The estimation is through the penalized likelihood method, and our focus is on efficient computation and effective parameter selection. To assist efficient computation, the joint likelihoods of the observations and the latent variables of the random effects are used instead of the marginal likelihood of the observations. Finally, a real AIDS data example is presented to illustrate the potential applications of our joint models.
Volume 12, Issue 2, pp. 695-719
Citations: 0
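As a rough illustration of what "inflated at two points k and l" means in the abstract above, the sketch below writes out a (k, l)-inflated Poisson probability mass function. The abstract does not give the formula, so this is an assumption: it uses the usual mixture form with extra probability mass `pi_k` at k and `pi_l` at l, and Poisson is chosen only as one familiar member of the power series family. The function and parameter names are hypothetical.

```python
import math


def poisson_pmf(y: int, lam: float) -> float:
    """Ordinary Poisson probability P(Y = y) with rate lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def kl_inflated_poisson_pmf(y: int, k: int, l: int,
                            pi_k: float, pi_l: float, lam: float) -> float:
    """(k, l)-inflated Poisson pmf (assumed mixture form, not taken from the paper).

    With probability pi_k the count is forced to k, with probability pi_l
    it is forced to l, and otherwise it is drawn from Poisson(lam).
    """
    if k >= l:
        raise ValueError("expected k < l")
    if pi_k < 0 or pi_l < 0 or pi_k + pi_l > 1:
        raise ValueError("pi_k and pi_l must be non-negative and sum to at most 1")
    prob = (1.0 - pi_k - pi_l) * poisson_pmf(y, lam)
    if y == k:
        prob += pi_k  # extra mass at the first inflation point
    if y == l:
        prob += pi_l  # extra mass at the second inflation point
    return prob
```

Setting `pi_k = pi_l = 0` recovers the plain Poisson distribution, which is a convenient sanity check for any implementation of this kind.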
Omega (ω)-Type Probability Models: A Parametric Modification of Probability Distributions
Annals of Data Science Pub Date : 2024-05-27 DOI: 10.1007/s40745-024-00539-y
Udochukwu Victor Echebiri, Nosakhare Liberty Osawe, Chukwuemeka Thomas Onyia
Abstract: A mathematical approach to developing new distributions is reviewed. The method, which consists of integration and the concept of a normalizing constant, allows for the primitive introduction of new parameter(s) into an existing distribution to form new model(s), called Omega-Type probability models. A probability distribution is proposed from a root model, the Lindley distribution, and some properties, such as the series representation of the density and cumulative distribution functions, shape of the density, hazard and survival functions, moments and related measures, quantile function, order statistics, parameter estimation and interval estimates, were studied. Amidst the usual hazard and survival shapes, a constant or uniform trend was realized for the survival function, which projects the possibility of modeling systems that may not terminate over a given period of time. Three different methods of estimation, namely, the Cramér-von Mises estimator, the maximum product of spacings estimator and the maximum likelihood estimator, were used. The modified unimodal shape of the proposed distribution is added as a special feature in the improvements made among the Lindley family of distributions. Finally, two real-life datasets were fitted to the new distribution to demonstrate its economic importance.
Volume 12, Issue 3, pp. 855-876
Citations: 0
UAV-YOLOv5: A Swin-Transformer-Enabled Small Object Detection Model for Long-Range UAV Images
Annals of Data Science Pub Date : 2024-05-25 DOI: 10.1007/s40745-024-00546-z
Jun Li, Chong Xie, Sizheng Wu, Yawei Ren
Abstract: This paper tackles the challenges associated with low recognition accuracy and the detection of occlusions when identifying long-range and diminutive targets (such as UAVs). We introduce a sophisticated detection framework named UAV-YOLOv5, which amalgamates the strengths of Swin Transformer V2 and YOLOv5. First, we introduce Focal-EIOU, a refinement of the K-means algorithm tailored to generate anchor boxes better suited for the current dataset, thereby improving detection performance. Second, the convolutional and pooling layers in the network with step size greater than 1 are replaced to prevent information loss during feature extraction. Then, the Swin Transformer V2 module is introduced in the Neck to improve the accuracy of the model, and the BiFormer module is introduced to improve the ability of the model to acquire global and local feature information at the same time. In addition, BiFPN is introduced to replace the original FPN structure so that the network can acquire richer semantic information and fuse features across scales more effectively. Lastly, a small target detection head is appended to the existing architecture, augmenting the model’s proficiency in detecting smaller targets with heightened precision. Furthermore, various experiments are conducted on the comprehensive dataset to verify the effectiveness of UAV-YOLOv5, achieving an average accuracy of 87%. Compared with YOLOv5, the mAP of UAV-YOLOv5 is improved by 8.5%, which verifies that it has high-precision long-range small-target UAV optoelectronic detection capability.
Volume 11, Issue 4, pp. 1109-1138
Citations: 0
A Deep Convolutional Neural Network-Based Approach for Visual Search & Recommendation of Grocery Products
Annals of Data Science Pub Date : 2024-05-23 DOI: 10.1007/s40745-024-00540-5
Nawreen Anan Khandaker, Amrin Rahman, Amrin Akter Pinky, Tasmiah Tamzid Anannya
Abstract: Search and recommendation are two essential features of any e-commerce website for finding and purchasing a specific product. Visual search is a promising and quick method in comparison to a text-based search method. Hence, the objective of this research is to propose a conceptual framework for developing a visual search and recommendation system for grocery products using Ensemble Learning with CNN models. Traditional Deep Learning and Ensemble Learning techniques were implemented with a publicly available and a self-made data set containing 3174 and 3162 images respectively. Various combinations of the suitable models found from research findings were used to find the best-fitted model for both the search and recommendation functionalities. All the models were evaluated using suitable performance metrics and the Ensemble Learning approach performed better. The best results for visual searching are obtained by combining VGG16 and MobileNet, with an accuracy of 99.8% for classification, and in the case of product recommendation, the combination of MobileNet and ResNet50 performs better than other techniques.
Volume 12, Issue 3, pp. 877-897
Citations: 0
A Survey of Artificial Intelligence for Industrial Detection
Annals of Data Science Pub Date : 2024-05-23 DOI: 10.1007/s40745-024-00545-0
Jun Li, YiFei Hai, SongJia Yin
Abstract: In the past decade, deep learning has greatly increased the complexity of industrial production intelligence by virtue of its powerful learning capability. At the same time, it has also brought security challenges to the field of industrial production information networks, mainly in two aspects: production safety and network information security. The former is mainly focused on ensuring the safety of personnel behavior in the production environment, including two different categories: detection of dangerous targets and identification of dangerous behaviors. The latter focuses on the safety of industrial information systems, especially networks. In recent years, deep learning-based detection techniques have made great strides in addressing these dual problems. Therefore, this paper presents an exhaustive study on the development of deep learning-based detection methods for industrial production safety analysis and information network security problem detection. The paper presents a comprehensive taxonomy for classifying production environments and production network information, classifying and clustering prevalent industrial security challenges, with a special emphasis on the role of deep learning in insecure behavior identification and information security risk detection. We provide an in-depth analysis of the advantages, limitations, and suitable application scenarios of these two approaches. In addition, the paper provides insights into contemporary challenges and future trends in this field and concludes with a discussion of prospects for future research.
Volume 12, Issue 2, pp. 799-827
Citations: 0
Combining Nonlinear Features of EEG and MRI to Diagnose Alzheimer’s Disease
Annals of Data Science Pub Date : 2024-05-21 DOI: 10.1007/s40745-024-00533-4
Elias Mazrooei Rad, Mahdi Azarnoosh, Majid Ghoshuni, Mohammad Mahdi Khalilzadeh
Abstract: In this article, a new method for the diagnosis of Alzheimer’s disease in the mild stage is presented, based on combining the characteristics of the EEG signal and MRI images. The brain signal is recorded in four modes (eyes closed, eyes open, reminder, and stimulation) from three channels, Pz, Cz, and Fz, of 90 participants in three groups: healthy subjects, mild Alzheimer’s disease (AD) patients, and severe AD patients. In addition, MRI images are taken with at least 3 Tesla and a thickness of 3 mm so that the senile plaques and neurofibrillary tangles can be examined. Proper image segmentation, mask, and sharp filters are used for preprocessing. Then proper features of the brain signals are extracted according to the nonlinear and chaotic nature of the brain, such as the Lyapunov exponent, correlation dimension, and entropy. Results: These features are combined with properties of the brain MRI images, including Medial Temporal Lobe Atrophy (MTA), Cerebral Spinal Fluid (CSF), Gray Matter (GM), Index Asymmetry (IA) and White Matter (WM), to diagnose the disease. Then two classifiers, the support vector machine and the Elman neural network, are used with the optimal combined features extracted by analysis of variance. Results showed that among the three brain signal channels and among the four modes of evaluation, the accuracy of the Pz channel and the excitation mode was higher than the others. Conclusions: Finally, because of the nonlinear properties studied and the nonlinear dynamics of the EEG signal, the Elman neural network is used. However, it is important to note that, by analyzing medical images, we can determine the most effective channel for recording brain signals. 3D segmentation of MRI images further helps researchers diagnose Alzheimer’s disease and obtain important information. The accuracy of the results in the Elman neural network with the combination of brain signal features and medical images is 94.4%, and in the case without combining the signal and image features, the accuracy of the results is 92.2%. The use of nonlinear classifiers is more appropriate than other classification methods due to the nonlinear dynamics of the brain signal. The accuracy of the results in the support vector machine with an RBF kernel with the combination of brain signal features and medical images is 75.5%, and in the case without combining the signal and image features, the accuracy of the results is 76.8%.
Volume 12, Issue 1, pp. 95-116
Citations: 0
Spatial Data Analysis for Robust Classification of Network Topology Through Synthetic Combinatorics
Annals of Data Science Pub Date : 2024-05-20 DOI: 10.1007/s40745-024-00523-6
Samrat Hore, Stabak Roy, Malabika Boruah, Saptarshi Mitra
Abstract: The measurement of network topology through various spatial topological indices like Alpha, Beta and Gamma is widely used for spatial data analysis. However, explaining the classification of the network topology of a city based on the Alpha, Beta and Gamma indices is not conclusive, as the results of the individual indices differ. To address an efficient classification of network topology, a Modified Synthetic Indicator (MSI) has been proposed and critically assessed against existing synthetic indicators based on the Composite Weighted Connectivity Index (CWCI), the linear combination of the Alpha, Beta and Gamma indices. Application of the proposed MSI in micro-level (ward level) classification of network topology, i.e., road network connectivity, has been verified in Agartala City and calibrates the efficiency of CWCI over the Alpha, Beta and Gamma indices. The study reveals that the proposed CWCI is more robust than any individual graph-theoretic measure.
Volume 11, Issue 4, pp. 1341-1359
Citations: 0
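For readers unfamiliar with the Alpha, Beta and Gamma indices named in the abstract above, the following is a minimal sketch of their textbook definitions for a planar network with `vertices` nodes and `edges` links. The formulas are the standard graph-theoretic connectivity measures, not taken from the paper, and the `weighted_index` helper with its weights is purely hypothetical: the abstract only states that the CWCI is a linear combination of the three indices and does not give the coefficients.

```python
def connectivity_indices(vertices: int, edges: int, components: int = 1) -> dict:
    """Standard planar-network connectivity indices (assumed definitions).

    alpha: observed circuits divided by the maximum possible circuits
    beta:  edges per vertex
    gamma: observed edges divided by the maximum possible edges
    """
    if vertices < 3:
        raise ValueError("the planar formulas need at least 3 vertices")
    cyclomatic = edges - vertices + components  # number of independent circuits
    return {
        "alpha": cyclomatic / (2 * vertices - 5),
        "beta": edges / vertices,
        "gamma": edges / (3 * (vertices - 2)),
    }


def weighted_index(indices: dict, weights: dict) -> float:
    """Hypothetical linear combination of the three indices.

    The weights are placeholders chosen by the caller, not the paper's values.
    """
    return sum(weights[name] * indices[name] for name in ("alpha", "beta", "gamma"))
```

For example, `connectivity_indices(5, 6)` describes a connected network of 5 junctions and 6 road segments. Because Beta is not bounded by 1 while Alpha and Gamma are, any real composite index would need to handle scaling, which this sketch deliberately leaves out.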
Evaluating the Performance of Machine Learning Algorithm for Classification of Safer Sexual Negotiation among Married Women in Bangladesh
Annals of Data Science Pub Date : 2024-05-20 DOI: 10.1007/s40745-024-00535-2
Md. Mizanur Rahman, Deluar J. Moloy, Mashfiqul Huq Chowdhury, Arzo Ahmed, Taksina Kabir
Abstract: Safer sexual practice is essential for improving women’s reproductive and sexual health outcomes. The goal of this study is to identify the contributing factors influencing safer sexual negotiations (SSN) through the application of machine learning algorithms. The algorithms include logistic regression (LR), random forest, Naïve Bayes, linear discriminant analysis, classification and regression trees, support vector machines (SVM), and K-nearest neighbors. This study utilized data from the 2017-18 Bangladesh Demographic and Health Survey, encompassing 19,457 married women within the ages of 15–49 years. The analysis reveals that the SVM algorithm achieved the highest classification accuracy (99.66%), along with high sensitivity (99.98%) and the lowest specificity. Conversely, the LR model produced the highest area under the curve statistics (0.6699), indicating good performance in distinguishing SSN among married women. The outcome illustrated that women’s autonomy, engagement with financial institutions, educational attainment, and their partner’s education play a significant role in SSN with their partners. The findings highlight the significance of empowering women, enhancing reproductive health awareness, and improving socio-economic conditions and education to encourage SSN. The government needs to consider all these risk factors to promote greater SSN for preventing sexually transmitted diseases among women in Bangladesh.
Volume 12, Issue 2, pp. 721-737
Citations: 0
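The abstract above reports very high accuracy and sensitivity together with low specificity. One common way such a pattern arises is class imbalance, and the sketch below shows the arithmetic from a confusion matrix. The counts are made up for illustration and are not the study's data; the function name is hypothetical.

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # share of actual positives found
        "specificity": tn / (tn + fp),      # share of actual negatives found
    }


# Hypothetical, heavily imbalanced sample: 9952 positives and only 48 negatives.
example = classification_metrics(tp=9950, fn=2, fp=40, tn=8)
```

With these invented counts the classifier labels almost everything as positive, so accuracy and sensitivity stay above 99% while specificity falls below 20%. This is why a threshold-independent measure such as the area under the ROC curve is often reported alongside accuracy.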
Unified Image Harmonization with Region Augmented Attention Normalization
Annals of Data Science Pub Date : 2024-05-11 DOI: 10.1007/s40745-024-00531-6
Junjie Hou, Yuqi Zhang, Duo Su
Abstract: The image harmonization task endeavors to adjust foreground information within an image synthesis process to achieve visual consistency by leveraging background information. In academic research, this task conventionally involves the utilization of simple synthesized images and matching masks as inputs. However, obtaining precise masks for image harmonization in practical applications poses a significant challenge, thereby creating a notable disparity between research findings and real-world applicability. To mitigate this disparity, we propose a redefinition of the image harmonization task as “Unified Image Harmonization,” where the input comprises only a single image, thereby enhancing its applicability in real-world scenarios. To address this challenge, we have developed a novel framework. Within this framework, we initially employ inharmonious region localization to detect the mask, which is subsequently utilized for harmonization tasks. The pivotal aspect of the harmonization process lies in normalization, which is accountable for information transfer. Nonetheless, the current background-to-foreground information transfer and guidance mechanisms are limited by single-layer guidance, thereby constraining their effectiveness. To overcome this limitation, we introduce Region Augmented Attention Normalization (RA2N), which enhances the attention mechanism for foreground feature alignment, consequently leading to improved alignment and transfer capabilities. Through qualitative and quantitative comparisons on the iHarmony4 dataset, our model exhibits exceptional performance not only in unified image harmonization but also in conventional image harmonization tasks.
Volume 11, Issue 5, pp. 1865-1886
Citations: 0