Information Processing & Management: Latest Articles

FairColor: An efficient algorithm for the Balanced and Fair Reviewer Assignment Problem
IF 7.4, Zone 1, Management
Information Processing & Management, Pub Date: 2024-08-22, DOI: 10.1016/j.ipm.2024.103865
Khadra Bouanane, Abdeldjaouad Nusayr Medakene, Abdellah Benbelghit, Samir Brahim Belhaouari
Volume 61, Issue 6, Article 103865
Abstract: As the volume of submitted papers continues to rise, ensuring a fair and accurate assignment of manuscripts to reviewers has become increasingly critical for academic conference organizers. Given the paper-reviewer similarity scores, this study introduces the Balanced and Fair Reviewer Assignment Problem (BFRAP), which aims to maximize the overall similarity score (efficiency) and the minimum paper score (fairness) subject to coverage, load balance, and fairness constraints. Addressing the challenges posed by these constraints, we conduct a theoretical investigation into the threshold conditions for the problem’s feasibility and optimality. To facilitate this investigation, we establish a connection between BFRAP, defined over m reviewers, and the Equitable m-Coloring Problem. Building on this theoretical foundation, we propose FairColor, an algorithm designed to retrieve fair and efficient assignments. We compare FairColor to Fairflow and FairIR, two state-of-the-art algorithms designed to find fair assignments under similar constraints. Empirical experiments were conducted on four real and two synthetic datasets involving (paper, reviewer) matching scores ranging from (100,100) to (10124,5880). Results demonstrate that FairColor is able to find efficient and fair assignments quickly compared to Fairflow and FairIR. Notably, in the largest instance involving 10,124 manuscripts and 5680 reviewers, FairColor retrieves fair and efficient assignments in just 67.64 s. This contrasts starkly with both other methods, which require significantly longer computation times (45 min for Fairflow and 3 h 24 min for FairIR), even on more powerful machines. These results underscore FairColor as a promising alternative to current state-of-the-art assignment techniques.
Citations: 0
An adaptive approach to noisy annotations in scientific information extraction
IF 7.4, Zone 1, Management
Information Processing & Management, Pub Date: 2024-08-12, DOI: 10.1016/j.ipm.2024.103857
Necva Bölücü, Maciej Rybinski, Xiang Dai, Stephen Wan
Volume 61, Issue 6, Article 103857
PDF: https://www.sciencedirect.com/science/article/pii/S0306457324002164/pdfft?md5=fff788405d49af01c42a5d5a7a592f76&pid=1-s2.0-S0306457324002164-main.pdf
Abstract: Despite recent advances in large language models (LLMs), the best effectiveness in information extraction (IE) is still achieved by fine-tuned models, hence the need for manually annotated datasets to train them. However, collecting human annotations for IE, especially for scientific IE, where expert annotators are often required, is expensive and time-consuming. Another issue widely discussed in the IE community is noisy annotations. Mislabelled training samples can hamper the effectiveness of trained models. In this paper, we propose a solution to alleviate problems originating from the high cost and difficulty of the annotation process. Our method distinguishes clean training samples from noisy samples and then employs weighted weakly supervised learning (WWSL) to leverage noisy annotations. Evaluation of Named Entity Recognition (NER) and Relation Classification (RC) tasks in Scientific IE demonstrates the substantial impact of detecting clean samples. Experimental results highlight that our method, utilising clean and noisy samples with WWSL, outperforms the baseline RoBERTa on NER (+4.28, +4.59, +29.27, and +5.21 gain for the ADE, SciERC, STEM-ECR, and WLPC datasets, respectively) and the RC (+6.09 and +4.39 gain for the SciERC and WLPC datasets, respectively) tasks. Comprehensive analyses of our method reveal its advantages over state-of-the-art denoising baseline models in scientific NER. Moreover, the framework is general enough to be adapted to different NLP tasks or domains, which means it could be useful in the broader NLP community.
Citations: 0
Robust and resource-efficient table-based fact verification through multi-aspect adversarial contrastive learning
IF 7.4, Zone 1, Management
Information Processing & Management, Pub Date: 2024-08-12, DOI: 10.1016/j.ipm.2024.103853
Ruiheng Liu, Yu Zhang, Bailong Yang, Qi Shi, Luogeng Tian
Volume 61, Issue 6, Article 103853
Abstract: Table-based fact verification focuses on determining the truthfulness of statements by cross-referencing data in tables. This task is challenging due to the complex interactions inherent in table structures. To address this challenge, existing methods have devised a range of specialized models. Although these models demonstrate good performance, they still exhibit limited sensitivity to essential variations in information relevant to reasoning within both statements and tables, thus learning spurious patterns and leading to potentially unreliable outcomes. In this work, we propose a novel approach named Multi-Aspect Adversarial Contrastive Learning (Macol), aimed at enhancing the accuracy and robustness of table-based fact verification systems under the premise of resource efficiency. Specifically, we first extract pivotal logical reasoning clues to construct positive and adversarial negative instances for contrastive learning. We then propose a new training paradigm that introduces a contrastive learning objective, encouraging the model to recognize noise invariance between original and positive instances while also distinguishing logical differences between original and negative instances. Extensive experimental results on three widely studied datasets TABFACT, INFOTABS and SEM-TAB-FACTS demonstrate that Macol achieves state-of-the-art performance on benchmarks across various backbone architectures, with accuracy improvements reaching up to 5.4%. Furthermore, Macol exhibits significant advantages in robustness and low-data resource scenarios, with improvements of up to 8.2% and 9.1%. It is worth noting that our method achieves comparable or even superior performance while being more resource-efficient compared to approaches that employ specific additional pre-training or utilize large language models (LLMs).
Citations: 0
Dynamic Feature Focusing Network for small object detection
IF 7.4, Zone 1, Management
Information Processing & Management, Pub Date: 2024-08-12, DOI: 10.1016/j.ipm.2024.103858
Rudong Jing, Wei Zhang, Yuzhuo Li, Wenlin Li, Yanyan Liu
Volume 61, Issue 6, Article 103858
Abstract: Deep learning has driven research in object detection and achieved strong results. Despite these significant advancements, small object detection still struggles with low recognition rates and inaccurate positioning, primarily attributable to the miniature size of the objects. The location deviation of small objects induces severe feature misalignment, and the disequilibrium between classification and regression tasks hinders accurate recognition. To address these issues, we propose a Dynamic Feature Focusing Network (DFFN), which contains two crucial modules: the Visual Perception Enhancement Module (VPEM) and the Task Association Module (TAM). Drawing upon deformable convolution and the attention mechanism, the VPEM concentrates on sparse key features and perceives the misalignment via positional offset. We aggregate multi-level features at identical spatial locations via a layer average operation to learn a more discriminative representation. Incorporating class alignment and bounding box alignment parts, the TAM promotes classification ability, refines bounding box regression, and facilitates the joint learning of classification and localization. We conduct diverse experiments, and the proposed method considerably enhances small object detection performance on four benchmark datasets: MS COCO, VisDrone, VOC, and TinyPerson. On COCO, our method improves mAP and APs by 3.4 and 2.2, respectively. Compared to other classic detection models, DFFN exhibits a high level of competitiveness in precision.
Citations: 0
OBCTeacher: Resisting labeled data scarcity in oracle bone character detection by semi-supervised learning
IF 7.4, Zone 1, Management
Information Processing & Management, Pub Date: 2024-08-07, DOI: 10.1016/j.ipm.2024.103864
Xiuan Wan, Zhengchen Li, Dandan Liang, Shouyong Pan, Yuchun Fang
Volume 61, Issue 6, Article 103864
Abstract: Oracle bone characters (OBCs) are ancient ideographs for divination and memorization, as well as first-hand evidence of ancient Chinese culture. The detection of OBCs is the premise of advanced studies and was mainly done by authoritative experts in the past. Deep learning techniques have great potential to facilitate OBC detection, but the high annotation cost of OBCs makes labeled data scarce, hindering their application. This paper proposes a novel OBC detection framework called OBCTeacher, based on semi-supervised learning (SSL), to resist labeled data scarcity. We first construct a large-scale OBC detection dataset. Through investigation, we find that spatial mismatching and class imbalance problems lead to decreased positive anchors and biased predictions, affecting the quality of pseudo labels and the performance of OBC detection. To mitigate the spatial mismatching problem, we introduce a geometric-priori-based anchor assignment strategy and a heatmap polishing procedure to increase positive anchors and improve the quality of pseudo labels. As for the class imbalance problem, we propose a re-weighting method based on estimated class information and a contrastive anchor loss to achieve prioritized learning on different OBC classes and better class boundaries. We evaluate our method in two settings: using only a small portion of labeled data while treating the remaining data as unlabeled, and using all labeled data with extra unlabeled data. The results demonstrate the effectiveness of our method compared with other state-of-the-art methods, with superior performance and an average improvement of 11.97 in AP50:95 over the supervised-only baseline. In addition, using only 20% of the labeled data, our method achieves performance comparable to the fully-supervised baseline using 100% of the labeled data, demonstrating that our method significantly reduces the dependence on labeled data for OBC detection.
Citations: 0
Get by how much you pay: A novel data pricing scheme for data trading
IF 7.4, Zone 1, Management
Information Processing & Management, Pub Date: 2024-08-07, DOI: 10.1016/j.ipm.2024.103849
Yu Lu, Jingyu Wang, Lixin Liu, Hanqing Yang
Volume 61, Issue 6, Article 103849
Abstract: As a crucial step in promoting data sharing, data trading can stimulate the development of the data economy. However, the current data trading market primarily focuses on satisfying data owners' interests, overlooking the demands of data requesters. Ignoring the demands of data requesters may lead to a loss of market competitiveness, customer loss, and missed business opportunities while damaging reputation and innovation capabilities. Therefore, in this paper, we introduce a novel pricing mechanism named Get By How Much You Pay (GHMP) based on compressed sensing technology and game theory to address pricing problems according to data requesters' demands. This scheme employs a dictionary matrix as the sparse basis matrix in compressed sensing. The quality of this matrix directly affects the precision with which the requester can reconstruct the data. If the requester requires higher-precision data, the corresponding payment will also increase accordingly so as to realize the pricing method based on the requester's demands. A game pricing method is proposed to address the final pricing and purchasing issues between the data requester and the data owner by utilizing an authorized smart contract as an intermediary. As an entity participating in the game, the smart contract can only receive a higher transaction fee if it successfully assists the data requester and data owner in completing the pricing. Therefore, it strives to establish more reasonable prices for both parties during the trading process to obtain profits. The experimental results demonstrate that this game-based approach assists the data requester and owner in achieving optimal data pricing, thereby satisfying the maximization of interests for both parties.
Citations: 0
The interaction of inter-organizational diversity and team size, and the scientific impact of papers
IF 7.4, Zone 1, Management
Information Processing & Management, Pub Date: 2024-08-06, DOI: 10.1016/j.ipm.2024.103851
Hyoung Sun Yoo, Ye Lim Jung, June Young Lee, Chul Lee
Volume 61, Issue 6, Article 103851
PDF: https://www.sciencedirect.com/science/article/pii/S0306457324002103/pdfft?md5=24632fc550985135ce9f8be93795f4b2&pid=1-s2.0-S0306457324002103-main.pdf
Abstract: Large teams are known to be more likely to publish highly cited papers, while small teams are known to be better at publishing highly disruptive papers. However, there is a lack of adequate theoretical understanding of the mechanisms by which scientific collaboration among researchers is related to the scientific impact of their papers. We investigated the mechanisms more closely by focusing on the interaction of inter-organizational diversity and team size in the process of team formation and knowledge dissemination. We analyzed 12,010,102 Web of Science papers and examined how inter-organizational diversity is associated with the relationship of team size with disruption and citations. As a result, we found that not only small teams, but also large teams with great inter-organizational diversity were able to disrupt science and technology effectively. We also found that large teams with greater inter-organizational diversity were more likely to produce highly cited papers. Our findings are robust and consistently observed regardless of publication year, team size, the number of references, and the degree of multidisciplinarity. These results have significant implications for researchers in selecting collaborators to achieve greater impact and for improving the qualitative efficiency of public research investments.
Citations: 0
MOOCs video recommendation using low-rank and sparse matrix factorization with inter-entity relations and intra-entity affinity information
IF 7.4, Zone 1, Management
Information Processing & Management, Pub Date: 2024-08-06, DOI: 10.1016/j.ipm.2024.103861
Yunmei Gao
Volume 61, Issue 6, Article 103861
PDF: https://www.sciencedirect.com/science/article/pii/S0306457324002206/pdfft?md5=308b736cfd63725fb5781fb48c9b85f3&pid=1-s2.0-S0306457324002206-main.pdf
Abstract:
Purpose: The serious information overload problem of MOOCs videos decreases the learning efficiency of the students and the utilization rate of the videos. Two problems deserve attention in matrix factorization (MF)-based video learning resource recommender systems: these methods suffer from the sparsity of the user-item rating matrix, and side information about users and items is seldom used to guide the learning procedure of the MF.
Method: To address those two problems, we propose a new MOOCs video resource recommender, LSMFERLI, based on Low-rank and Sparse Matrix Factorization (LSMF) with the guidance of the inter-Entity Relations and intra-entity Latent Information of the students and videos. Firstly, we construct the inter-entity relation matrices and intra-entity latent preference matrix for the students. Secondly, we construct the inter-entity relation matrices and intra-entity affinity matrix for the videos. Lastly, with the guidance of the inter-entity relation and intra-entity affinity matrices of the students and videos, the student-video rating matrix is factorized into a low-rank matrix and a sparse matrix by the alternative iteration optimization scheme.
Conclusions: Experimental results on the MOOCcube dataset indicate that LSMFERLI outperforms 7 state-of-the-art methods, with the HR@K and NDCG@K (K = 5, 10, 15) indicators increasing by an average of 20.6% and 21.0%, respectively.
Citations: 0
A framework for predicting scientific disruption based on graph signal processing
IF 7.4, Zone 1, Management
Information Processing & Management, Pub Date: 2024-08-05, DOI: 10.1016/j.ipm.2024.103863
Houqiang Yu, Yian Liang
Volume 61, Issue 6, Article 103863
Abstract: Identifying scientific disruption is consistently recognized as challenging, and predicting it is more so. We suggest that better predictions are hindered by the inability to integrate multidimensional information and the limited scalability of existing methods. This paper develops a framework based on graph signal processing (GSP) to predict scientific disruption, achieving an average AUC of about 80% on benchmark datasets, surpassing the performance of prior methods by 13.6% on average. The framework is unified, adaptable to any type of information, and scalable, with the potential for further enhancements using technologies from GSP. The intuition of this framework is: scientific disruption is characterized by leading to dramatic changes in scientific evolution, which is recognized as a complex system represented by a graph, and GSP is a technique that specializes in analyzing data on graph structures; thus, we argue that GSP is well-suited for modeling scientific evolution and predicting disruption. Based on this proposed framework, we proceed with disruption predictions. The content, context, and (citation) structure information are each defined as graph signals. The total variations of these graph signals, which measure the evolutionary amplitude, are the main predictors. To illustrate the unity and scalability of our framework, altmetrics data (online mentions of the paper), which was seldom considered previously, is defined as a graph signal, and another indicator, the dispersion entropy of the graph signal (measuring chaos of scientific evolution), is also used for prediction. Our framework also provides the advantage of interpretability for a better understanding of scientific disruption. The analysis indicates that scientific disruption not only results in dramatic changes in the knowledge content, but also in context (e.g., journals and authors), and will lead to chaos in subsequent evolution. Finally, several practical future directions for disruption predictions based on the framework are proposed.
Citations: 0
Evolutions of semantic consistency in research topic via contextualized word embedding
IF 7.4, Zone 1, Management
Information Processing & Management, Pub Date: 2024-08-03, DOI: 10.1016/j.ipm.2024.103859
Shengzhi Huang, Wei Lu, Qikai Cheng, Zhuoran Luo, Yong Huang
Volume 61, Issue 6, Article 103859
Abstract: Topic evolution has been studied extensively in the field of the science of science. This study first analyzes topic evolution patterns from topics’ semantic consistency in the semantic vector space, and explores their possible causes. Specifically, we extract papers in the computer science field from Microsoft Academic Graph as our dataset. We propose a novel method for encoding a topic with numerous Contextualized Word Embeddings (CWE), in which the title and abstract fields of papers studying the topic are taken as its context. Subsequently, we employ three geometric metrics to analyze topics’ semantic consistency over time, from which the influence of the anisotropy of CWE is excluded. The K-Means clustering algorithm is employed to identify four general evolution patterns of semantic consistency, that is, semantic consistency increases (IM), decreases (DM), increases first and then decreases (Inverted U-shape), and decreases first and then increases (U-shape). We also find that research methods tend to show DM and U-shape, but research questions tend to be IM and Inverted U-shape. Finally, we use regression analysis to explore whether and, if so, how a series of key features of a topic affect its semantic consistency. Importantly, the semantic consistency of a topic varies inversely with the semantic similarity between the topic and other topics. Overall, this study sheds light on the evolution law of topics, and helps researchers to understand these patterns from a geometric perspective.
Citations: 0