Big Data Research: Latest Articles, Page 10

Study on the Temporal and Spatial Evolution Characteristics of Chinese Public's Cognition and Attitude to “Double Reduction” Policy Based on Big Data

IF 3.3, Zone 3, Computer Science

Big Data Research, Volume 34, Article 100411. Pub Date: 2023-09-11. DOI: 10.1016/j.bdr.2023.100411

Jiahui Liu, Wei Liu, Chun Yan, Xinhong Liu

Abstract: The “double reduction” policy is a policy innovation of China's comprehensive education reform aimed at building a high-quality education system. The public's cognition of and attitude toward it are of great significance to its actual implementation. A total of 98,396 texts related to “double reduction”, collected from Sina Weibo by web crawler, are investigated to explore the public's cognition of and attitude towards the policy as well as their spatio-temporal evolution characteristics. Guided by life cycle theory, the evolution of the public's attitude is studied by sentiment analysis based on the ERINE algorithm and DUTIR. Topics are selected using TF-IDF and LDA models to trace the spatio-temporal evolution of public cognition and analyze group differences. The results are as follows. The evolution of public concern about the “double reduction” policy is phased, and the periods of peak attention are closely related to time nodes such as policy release, the new school term, and holidays. There are temporal and spatial differences in the evolution of public attitudes between different stages and groups. Although the public holds a relatively negative attitude, it is gradually easing as more information about the policy becomes available. Topics of public concern vary across periods, and different groups show different emotional attitudes and distinctive evolution characteristics of cognitive themes. Compared with other age groups, teenagers pay more attention to topics related to their studies and life. The government's official micro-blogs not only shoulder the responsibility of publicizing relevant policies but also pay close attention to their implementation around the country. Influential groups hold a relatively firm attitude and stable emotions and can often steer public opinion. Regional attention to the policy is positively correlated with the level of local economic development. The research results can help government departments learn about the public's cognition and attitude towards the “double reduction” policy to provide decision-making support, and serve as an important basis for solving existing contradictions and promoting the effective implementation of policies.

Citations: 0
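The abstract above names TF-IDF as one of the tools used for topic selection. As a rough illustration only, the sketch below computes one common TF-IDF variant (term frequency times log(N / df)) over a toy tokenized corpus. The function names, the toy documents, and the choice of variant are assumptions made here, not details taken from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / len(doc), idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(doc_weights, k=2):
    """Return the k highest-weighted terms of one document (ties broken alphabetically)."""
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus standing in for tokenized posts.
docs = [
    ["policy", "homework", "homework", "school"],
    ["policy", "tutoring", "fees"],
    ["policy", "holiday", "school"],
]
weights = tf_idf(docs)
```

A term that occurs in every document, such as "policy" here, receives weight zero under this variant, which is why it drops out of the per-document top terms.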

An Improved CycleGAN for Data Augmentation in Person Re-Identification

IF 3.3, Zone 3, Computer Science

Big Data Research, Volume 34, Article 100409. Pub Date: 2023-09-09. DOI: 10.1016/j.bdr.2023.100409

Zhenzhen Yang, Jing Shao, Yongpeng Yang

Abstract: Person re-identification (ReID), which aims to retrieve persons of interest across multiple non-overlapping cameras, has attracted increasing attention. Matching the same person across different camera styles has always been an enormous challenge. In existing work, cross-camera style images generated by the cycle-consistent generative adversarial network (CycleGAN) only transfer the camera resolution and ambient lighting. The generated images also contain considerable redundancy and inappropriate pictures. Although the added data helps prevent over-fitting, it also introduces significant noise, so accuracy is not significantly improved. In this paper, an improved CycleGAN is proposed to generate images for better data augmentation. Pedestrian posture is transferred at the same time as image style. This not only increases the diversity of pedestrian posture but also reduces the domain gap caused by the style change between cameras. In addition, through the multi-pseudo regularized label (MpRL), the generated images are assigned virtual labels dynamically during training. In extensive experimental evaluations, the method achieves very high identification accuracy on the Market-1501, DukeMTMC-reID, and CUHK03-NP datasets. On the three datasets, mAP is 96.20%, 93.72%, and 86.65%, and rank-1 accuracy is 98.27%, 95.37%, and 90.71%, respectively. The experimental results show the superiority of the proposed method.

Citations: 0

Classifier-Based Nonuniform Time Slicing Method for Local Community Evolution Analysis

IF 3.3, Zone 3, Computer Science

Big Data Research, Volume 34, Article 100408. Pub Date: 2023-09-09. DOI: 10.1016/j.bdr.2023.100408

Xiangyu Luo, Tian Wang, Gang Xin, Yan Lu, Ke Yan, Ying Liu

Abstract: With the rapid expansion of the scale of dynamic networks, local community evolution analysis attracts much attention because of its efficiency and accuracy. It concentrates on one community of particular interest rather than considering all communities together. A fundamental problem is how to divide time into slices so that a dynamic network is represented as a sequence of snapshots that accurately capture the evolutionary events of the community of interest. Existing time slicing methods lead to inaccurate evolution analysis results because they usually rely on a linear strategy, while community evolution is a nonlinear process. This paper investigates the problem and proposes a classifier-based time slicing method for local community evolution analysis. First, a classifier is trained to judge whether a community in a given network snapshot can be identified as the continuation of the community defined by a given node subset. The features for classification include internal cohesion degree and external coupling degree. Second, a time slicing method is proposed based on the trained classifier. As the network evolves, the method continuously uses the classifier to predict whether there is a community in the newest network that can be identified as the continuation of the community of interest. Whenever the answer is negative, an evolutionary event is presumed to have occurred and a new time slice is generated. Experimental results show that, compared with existing time slicing methods, the proposed method achieves a higher recognition rate for a given redundancy ratio.

Citations: 0
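The slicing loop described in the abstract above (score each new snapshot, open a new slice when the classifier answers "no") can be sketched as follows. This is a loose illustration under stated assumptions: the two feature definitions (internal edge density and share of boundary edges), the threshold rule standing in for the trained classifier, and the fixed node subset are all hypothetical choices made here, since the abstract does not define them.

```python
def cohesion_and_coupling(edges, community):
    """Hypothetical features for an undirected edge list.

    cohesion: internal edges / possible internal edges.
    coupling: boundary edges / all edges touching the community.
    """
    internal = sum(1 for u, v in edges if u in community and v in community)
    boundary = sum(1 for u, v in edges if (u in community) != (v in community))
    n = len(community)
    possible = n * (n - 1) / 2
    cohesion = internal / possible if possible else 0.0
    touching = internal + boundary
    coupling = boundary / touching if touching else 0.0
    return cohesion, coupling


def continues(features, min_cohesion=0.5, max_coupling=0.5):
    """Stand-in for the trained classifier: a fixed threshold rule (assumption)."""
    cohesion, coupling = features
    return cohesion >= min_cohesion and coupling <= max_coupling


def slice_boundaries(snapshots, community):
    """Return the snapshot indices at which a new time slice would start."""
    boundaries = []
    for t, edges in enumerate(snapshots):
        if not continues(cohesion_and_coupling(edges, community)):
            boundaries.append(t)
    return boundaries


community = {"a", "b", "c"}
snapshots = [
    [("a", "b"), ("b", "c"), ("a", "c")],              # dense, no outside edges
    [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")],  # one outside edge
    [("a", "b"), ("a", "d"), ("b", "e"), ("c", "f")],  # mostly outside edges
]
boundaries = slice_boundaries(snapshots, community)
```

In this toy run only the third snapshot fails the rule, so a single slice boundary is produced at index 2. A real implementation would replace `continues` with a classifier trained on labeled snapshots.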

A Multi-View Filter for Relation-Free Knowledge Graph Completion

IF 3.3, Zone 3, Computer Science

Big Data Research, Volume 33, Article 100397. Pub Date: 2023-08-28. DOI: 10.1016/j.bdr.2023.100397

Juan Li, Wen Zhang, Hongtao Yu

Abstract: As knowledge graphs are often incomplete, knowledge graph completion methods have been widely proposed to infer missing facts by predicting the missing element of a triple given the other two elements. However, the assumption that the two given elements are correlated is strong. In this paper, we therefore investigate relation-free knowledge graph completion, which predicts relation-tail (r-t) pairs given a head entity. Considering the large number of candidate r-t pairs, previous work proposed filtering r-t pairs by entity type before ranking them, which fails when entity types are missing or insufficient. To tackle this limitation, we propose a relation-free knowledge graph completion method that can cope with knowledge graphs without additional ontological information such as entity types. Specifically, we propose a multi-view filter, comprising two intra-view modules and an inter-view module, to filter r-t pairs. For the intra-view modules, we construct head-relation and tail-relation graphs based on triples. Two graph neural networks are trained on these two graphs, respectively, to capture the correlations between head entities and relations and between tail entities and relations. The inter-view module is learned to bridge the embeddings of entities that appear in the two graphs. For ranking, existing knowledge graph embedding models are applied to score and rank the filtered candidate r-t pairs. Experimental results show the efficiency of our method in preserving higher-quality candidate r-t pairs for knowledge graphs, resulting in better relation-free knowledge graph completion.

Citations: 1

Meta-Learning Based Dynamic Adaptive Relation Learning for Few-Shot Knowledge Graph Completion

IF 3.3, Zone 3, Computer Science

Big Data Research, Volume 33, Article 100394. Pub Date: 2023-08-28. DOI: 10.1016/j.bdr.2023.100394

Linqin Cai, Lingjun Wang, Rongdi Yuan, Tingjie Lai

Abstract: As artificial intelligence gradually enters the stage of cognitive intelligence, knowledge graphs (KGs) play an increasingly important role in many natural language processing tasks. Due to the prevalence of long-tail relations in KGs, few-shot knowledge graph completion (KGC) for link prediction of long-tail relations has become a hot research topic. Current few-shot KGC methods mainly focus on the static representation of surrounding entities to explore the potential semantic features of entities, while ignoring the dynamic properties among entities and the special influence of the long-tail relation on link prediction. In this paper, a new meta-learning based dynamic adaptive relation learning model (DARL) is proposed for few-shot KGC. To obtain better semantic information for the meta knowledge, the proposed DARL model applies a dynamic neighbor encoder to incorporate neighbor relations into entity embeddings. In addition, DARL builds an attention-based fusion strategy for different attributes of the same relation to further enhance the relation-meta learning ability. We evaluate DARL on two public benchmark datasets, NELL-One and WIKI-One, for link prediction. Extensive experimental results indicate that DARL outperforms state-of-the-art models, with average relative improvements of about 23.37% in MRR and 32.46% in Hits@1 on NELL-One.

Citations: 1
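The abstract above reports relative improvements in MRR and Hits@1. For readers unfamiliar with these link-prediction metrics, the sketch below computes them from a list of ranks, where each rank is the 1-based position of the correct entity among the scored candidates. The helper names and the example ranks are illustrative assumptions and are unrelated to the paper's data.

```python
def mrr(ranks):
    """Mean reciprocal rank over 1-based ranks of the correct answers."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, e.g. 0.2 means 20% better."""
    return (new - baseline) / baseline


# Hypothetical ranks of the true tail entity for four test queries.
ranks = [1, 2, 4, 1]
example_mrr = mrr(ranks)
example_hits1 = hits_at_k(ranks, 1)
```

Note that a relative improvement is measured against the baseline's score, so it is larger than the absolute difference in the metric whenever the baseline is below 1.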

Task-Oriented Collaborative Graph Embedding Using Explicit High-Order Proximity for Recommendation

IF 3.3, Zone 3, Computer Science

Big Data Research, Volume 33, Article 100382. Pub Date: 2023-08-28. DOI: 10.1016/j.bdr.2023.100382

Mintae Kim, Wooju Kim

Abstract: A recommender or recommendation system is a subclass of information filtering systems that seeks to predict the “rating” or “preference” that a user would assign to an item. Although many collaborative filtering (CF) approaches based on neural matrix factorization (NMF) have been successful, significant scope for improvement in recommendation systems remains. The primary challenge in recommender systems is to extract high-quality user–item interaction information from sparse data. However, most studies have focused on additional review text or metadata instead of fully using high-order relationships between users and items. In this paper, we propose a novel model, the Cross Neighborhood Attention Network (CNAN), that addresses this problem by designing high-order neighborhood selection and neighborhood attention networks to learn user–item interactions efficiently. CNAN performs rating prediction using only user–item interaction data (from the user–item ratings matrix), without additional information such as review text or metadata. We evaluated the effectiveness of the proposed model by performing experiments on five datasets with review text and three datasets with metadata. CNAN demonstrated a performance improvement of up to 7.59% over the model using review text and up to 1.99% over the model using metadata. Experimental results show that CNAN achieves better recommendation performance through higher-order neighborhood information integration with neighborhood selection and attention. The results show that our model delivers higher prediction performance via efficient structural improvement without using additional information.

Citations: 0

Botnet DGA Domain Name Classification Using Transformer Network with Hybrid Embedding

IF 3.3, Zone 3, Computer Science

Big Data Research, Volume 33, Article 100395. Pub Date: 2023-08-28. DOI: 10.1016/j.bdr.2023.100395

Ling Ding, Peng Du, Haiwei Hou, Jian Zhang, Di Jin, Shifei Ding

Abstract: One of the severest threats to cyber security is the botnet, which typically uses domain names generated by Domain Generation Algorithms (DGAs) to communicate with its Command and Control (C&C) infrastructure. DGA detection and classification play an important role in helping cyber security researchers detect botnet C&C servers. However, many existing DGA detection models focus on a single-scale word embedding method, and very few are specially designed to extract more effective features for DGA detection from multi-scale word embeddings. To address these problems, we first propose a hybrid word embedding method that combines character-level and bigram-level embeddings to make full use of the information in domain names. We then design a deep neural network with the hybrid embedding method to distinguish DGA domains from known legitimate domains. Finally, we evaluate our hybrid embedding method and the proposed model on the ONIST dataset and compare them with several state-of-the-art DGA classification methods.

Citations: 0
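The hybrid embedding in the abstract above starts from two views of a domain name: single characters and character bigrams. The sketch below shows only that tokenization step and a simple id encoding that an embedding layer could consume. The vocabulary scheme, the padding convention, and all function names are assumptions for illustration; how the paper combines the two embeddings is not stated in the abstract and is not modeled here.

```python
def char_tokens(domain):
    """Character-level view: one token per character."""
    return list(domain)


def bigram_tokens(domain):
    """Bigram-level view: overlapping pairs of adjacent characters."""
    return [domain[i:i + 2] for i in range(len(domain) - 1)]


def build_vocab(token_lists):
    """Map each distinct token to an integer id; 0 and 1 are reserved."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, length):
    """Convert tokens to ids, truncating or right-padding to a fixed length."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokens][:length]
    return ids + [vocab["<pad>"]] * (length - len(ids))


# Hypothetical training domains: one ordinary name, one random-looking name.
domains = ["abc.com", "xkqzjv.net"]
char_vocab = build_vocab(char_tokens(d) for d in domains)
bigram_vocab = build_vocab(bigram_tokens(d) for d in domains)
char_ids = encode(char_tokens("abc.com"), char_vocab, 10)
bigram_ids = encode(bigram_tokens("abc.com"), bigram_vocab, 10)
```

Each id sequence would then index its own embedding table, giving the model one representation per scale of the same domain name.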

A Big Data Framework to Address Building Sum Insured Misestimation

IF 3.3, Zone 3, Computer Science

Big Data Research. Pub Date: 2023-08-28. DOI: 10.1016/j.bdr.2023.100396

Callum Roberts, Adrian Gepp, James Todd

Citations: 0

Parallel Framework for Memory-Efficient Computation of Image Descriptors for Megapixel Images

IF 3.3, Zone 3, Computer Science

Big Data Research, Volume 33, Article 100398. Pub Date: 2023-08-28. DOI: 10.1016/j.bdr.2023.100398

Amr M. Abdeltif, Khalid M. Hosny, Mohamed M. Darwish, Ahmad Salah, Kenli Li

Abstract: Image moments are image descriptors widely used in image processing, pattern recognition, computer vision, and multimedia security applications. In the era of big data, the computation of image moments creates a huge memory demand, especially for large moment orders and/or high-resolution images (i.e., megapixel images). State-of-the-art moment computation methods successfully accelerate image moment computation for digital images with a resolution below 1K × 1K pixels. For digital images of higher resolution, image moment computation is problematic. Researchers have used GPU-based parallel processing to overcome this problem. In practice, the parallel computation of image moments on GPUs runs into the non-extendable memory problem, which is the main challenge. This paper proposes a recurrence-based method for computing the Polar Complex Exponent Transform (PCET) moments of fractional orders. The proposed method exploits the symmetry of the image kernel to reduce kernel computation: once a kernel value is computed in one quadrant, the three corresponding values in the remaining three quadrants can be computed trivially. Moreover, the method uses recurrence equations to compute kernels, so the memory otherwise required to store pre-computed kernels is saved. Finally, we implemented the proposed method on a GPU parallel architecture. The proposed method overcomes the memory limit by saving the kernel's memory. The experiments show that the proposed parallel-friendly and memory-efficient method is superior to state-of-the-art moment computation methods in memory consumption and runtime. The proposed method computes the PCET moment of order 50 for an image of 2K × 2K pixels in 3.5 seconds, while the state-of-the-art comparison method needs 7.0 seconds to process the same image; the memory requirements of the proposed method and the comparison method were 67.0 MB and 3.4 GB, respectively. The comparison method could not compute the image moment for any image with a resolution higher than 2K × 2K pixels. In contrast, the proposed method managed to compute the image moment for images of up to 16K × 16K pixels.

Citations: 1

A Large Comparison of Normalization Methods on Time Series

IF 3.3, Zone 3, Computer Science

Big Data Research, Volume 34, Article 100407. Pub Date: 2023-08-22. DOI: 10.1016/j.bdr.2023.100407

Felipe Tomazelli Lima, Vinicius M.A. Souza

Abstract: Normalization is a mandatory preprocessing step in time series problems to guarantee similarity comparisons that are invariant to unexpected distortions in amplitude and offset. Such distortions are common in most time series data. A typical example is gait recognition from motion data collected on subjects of varying body height and width. To rescale the data to the same range of values, the vast majority of researchers consider z-normalization the default method for any application domain, data, or task. This choice is made without a search process of the kind used to set the parameters of an algorithm, and without experimental evidence in the literature covering a variety of scenarios to support it. To address this gap, we evaluate the impact of different normalization methods on time series data. Our analysis is based on an extensive experimental comparison on classification problems involving 10 normalization methods, 3 state-of-the-art classifiers, and 38 benchmark datasets. We consider the classification task due to the simplicity of the experimental settings and its well-defined metrics. However, our findings can be extrapolated to other time series mining tasks, such as forecasting or clustering. Based on our results, we suggest evaluating maximum absolute scaling as an alternative to z-normalization. Besides being time efficient, this alternative shows promising results for similarity-based methods using Euclidean distance. For deep learning, mean normalization could be considered.

Citations: 1
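Three of the normalization methods named in the abstract above can be written in a few lines each. The sketch below uses common textbook definitions, which are an assumption here since the abstract does not give formulas: z-normalization as (x - mean) / std with the population standard deviation, maximum absolute scaling as x / max|x|, and mean normalization as (x - mean) / (max - min). Constant series are mapped to zeros to avoid division by zero, which is also a choice made for this example.

```python
import math


def z_normalize(series):
    """Subtract the mean and divide by the population standard deviation."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    return [(x - mean) / std for x in series] if std else [0.0] * len(series)


def max_abs_scale(series):
    """Divide by the largest absolute value, giving values in [-1, 1]."""
    peak = max(abs(x) for x in series)
    return [x / peak for x in series] if peak else [0.0] * len(series)


def mean_normalize(series):
    """Subtract the mean and divide by the range (max - min)."""
    mean = sum(series) / len(series)
    spread = max(series) - min(series)
    return [(x - mean) / spread for x in series] if spread else [0.0] * len(series)


# Hypothetical toy series.
series = [2.0, 4.0, 6.0, 8.0]
z = z_normalize(series)
scaled = max_abs_scale(series)
centered = mean_normalize(series)
```

Maximum absolute scaling needs a single pass to find the peak and no mean or variance, which is consistent with the abstract's remark that it is time efficient. Unlike z-normalization, it does not remove the offset of the series.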