Rudong Jing, Wei Zhang, Yuzhuo Li, Wenlin Li, Yanyan Liu
{"title":"Dynamic Feature Focusing Network for small object detection","authors":"Rudong Jing , Wei Zhang , Yuzhuo Li , Wenlin Li , Yanyan Liu","doi":"10.1016/j.ipm.2024.103858","DOIUrl":"10.1016/j.ipm.2024.103858","url":null,"abstract":"<div><p>Deep learning has driven research in object detection and achieved impressive results. Despite these advances, small object detection still struggles with low recognition rates and inaccurate localization, primarily because of the objects' miniature size. The location deviation of small objects induces severe feature misalignment, and the imbalance between classification and regression tasks hinders accurate recognition. To address these issues, we propose a Dynamic Feature Focusing Network (DFFN), which contains two crucial modules: a Visual Perception Enhancement Module (VPEM) and a Task Association Module (TAM). Drawing upon deformable convolution and the attention mechanism, the VPEM concentrates on sparse key features and perceives the misalignment via positional offsets. We aggregate multi-level features at identical spatial locations via a layer-average operation to learn a more discriminative representation. Incorporating class alignment and bounding box alignment parts, the TAM promotes classification ability, refines bounding box regression, and facilitates the joint learning of classification and localization. We conduct diverse experiments, and the proposed method considerably enhances small object detection performance on four benchmark datasets: MS COCO, VisDrone, VOC, and TinyPerson. On COCO, our method improves mAP and AP<em>s</em> by 3.4 and 2.2, respectively. 
Compared to other classic detection models, DFFN exhibits a high level of competitiveness in precision.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141979793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiuan Wan, Zhengchen Li, Dandan Liang, Shouyong Pan, Yuchun Fang
{"title":"OBCTeacher: Resisting labeled data scarcity in oracle bone character detection by semi-supervised learning","authors":"Xiuan Wan , Zhengchen Li , Dandan Liang , Shouyong Pan , Yuchun Fang","doi":"10.1016/j.ipm.2024.103864","DOIUrl":"10.1016/j.ipm.2024.103864","url":null,"abstract":"<div><p>Oracle bone characters (OBCs) are ancient ideographs for divination and memorization, as well as first-hand evidence of ancient Chinese culture. The detection of OBCs is the premise of advanced studies and was mainly done by authoritative experts in the past. Deep learning techniques have great potential to facilitate OBC detection, but the high annotation cost of OBCs leads to a scarcity of labeled data, hindering their application. This paper proposes a novel OBC detection framework called OBCTeacher based on semi-supervised learning (SSL) to resist labeled data scarcity. We first construct a large-scale OBC detection dataset. Through investigation, we find that spatial mismatching and class imbalance problems lead to fewer positive anchors and biased predictions, affecting the quality of pseudo labels and the performance of OBC detection. To mitigate the spatial mismatching problem, we introduce a geometric-priori-based anchor assignment strategy and a heatmap polishing procedure to increase positive anchors and improve the quality of pseudo labels. As for the class imbalance problem, we propose a re-weighting method based on estimated class information and a contrastive anchor loss to achieve prioritized learning on different OBC classes and better class boundaries. We evaluate our method in two settings: using only a small portion of the labeled data while treating the remaining data as unlabeled, and using all labeled data together with extra unlabeled data. 
The results show that our method outperforms other state-of-the-art methods, improving <span><math><mrow><mi>A</mi><msub><mrow><mi>P</mi></mrow><mrow><mn>50</mn><mo>:</mo><mn>95</mn></mrow></msub></mrow></math></span> by an average of 11.97 over the supervised-only baseline. In addition, using only 20% of the labeled data, our method achieves performance comparable to the fully-supervised baseline trained on 100% of the labeled data, demonstrating that it significantly reduces the dependence on labeled data for OBC detection.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Get by how much you pay: A novel data pricing scheme for data trading","authors":"Yu Lu , Jingyu Wang , Lixin Liu , Hanqing Yang","doi":"10.1016/j.ipm.2024.103849","DOIUrl":"10.1016/j.ipm.2024.103849","url":null,"abstract":"<div><p>As a crucial step in promoting data sharing, data trading can stimulate the development of the data economy. However, the current data trading market primarily focuses on satisfying data owners' interests, overlooking the demands of data requesters. Ignoring the demands of data requesters may lead to a loss of market competitiveness, customer loss, and missed business opportunities while damaging reputation and innovation capabilities. Therefore, in this paper, we introduce a novel pricing mechanism named Get By How Much You Pay (GHMP) based on compressed sensing technology and game theory to address pricing problems according to data requesters' demands. This scheme employs a dictionary matrix as the sparse basis matrix in compressed sensing. The quality of this matrix directly affects the precision with which the requester can reconstruct the data. If the requester requires higher-precision data, the corresponding payment will also increase accordingly so as to realize the pricing method based on the requester's demands. A game pricing method is proposed to address the final pricing and purchasing issues between the data requester and the data owner by utilizing an authorized smart contract as an intermediary. As an entity participating in the game, the smart contract can only receive a higher transaction fee if it successfully assists the data requester and data owner in completing the pricing. Therefore, it strives to establish more reasonable prices for both parties during the trading process to obtain profits. 
The experimental results demonstrate that this game-based approach assists the data requester and owner in achieving optimal data pricing, thereby satisfying the maximization of interests for both parties.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyoung Sun Yoo, Ye Lim Jung, June Young Lee, Chul Lee
{"title":"The interaction of inter-organizational diversity and team size, and the scientific impact of papers","authors":"Hyoung Sun Yoo , Ye Lim Jung , June Young Lee , Chul Lee","doi":"10.1016/j.ipm.2024.103851","DOIUrl":"10.1016/j.ipm.2024.103851","url":null,"abstract":"<div><p>Large teams are known to be more likely to publish highly cited papers, while small teams are known to be better at publishing highly disruptive papers. However, there is a lack of adequate theoretical understanding of the mechanisms by which scientific collaboration among researchers is related to the scientific impact of their papers. We investigated the mechanisms more closely by focusing on the interaction of inter-organizational diversity and team size in the process of team formation and knowledge dissemination. We analyzed 12,010,102 Web of Science papers and examined how inter-organizational diversity is associated with the relationship of team size with disruption and citations. As a result, we found that not only small teams, but also large teams with great inter-organizational diversity were able to disrupt science and technology effectively. We also found that large teams with greater inter-organizational diversity were more likely to produce highly cited papers. Our findings are robust and consistently observed regardless of publication year, team size, the number of references, and the degree of multidisciplinarity. 
These results have significant implications for researchers in selecting collaborators to achieve greater impact and for improving the qualitative efficiency of public research investments.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0306457324002103/pdfft?md5=24632fc550985135ce9f8be93795f4b2&pid=1-s2.0-S0306457324002103-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MOOCs video recommendation using low-rank and sparse matrix factorization with inter-entity relations and intra-entity affinity information","authors":"Yunmei Gao","doi":"10.1016/j.ipm.2024.103861","DOIUrl":"10.1016/j.ipm.2024.103861","url":null,"abstract":"<div><h3>Purpose</h3><p>The serious information overload problem of MOOCs videos decreases the learning efficiency of students and the utilization rate of the videos. Matrix factorization (MF)-based video learning resource recommender systems have two problems worthy of attention: they suffer from the sparsity of the user-item rating matrix, and side information about users and items is seldom used to guide the learning procedure of the MF.</p></div><div><h3>Method</h3><p>To address these two problems, we propose a new MOOCs video resource recommender, LSMFERLI, based on Low-rank and Sparse Matrix Factorization (LSMF) with the guidance of the inter-Entity Relations and intra-entity Latent Information of the students and videos. Firstly, we construct the inter-entity relation matrices and intra-entity latent preference matrix for the students. Secondly, we construct the inter-entity relation matrices and intra-entity affinity matrix for the videos. 
Lastly, with the guidance of the inter-entity relation and intra-entity affinity matrices of the students and videos, the student-video rating matrix is factorized into a low-rank matrix and a sparse matrix by an alternating iteration optimization scheme.</p></div><div><h3>Conclusions</h3><p>Experimental results on the MOOCcube dataset indicate that LSMFERLI outperforms 7 state-of-the-art methods, with the HR@<em>K</em> and NDCG@<em>K</em> (<em>K</em> = 5, 10, 15) indicators increasing by an average of 20.6 % and 21.0 %, respectively.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0306457324002206/pdfft?md5=308b736cfd63725fb5781fb48c9b85f3&pid=1-s2.0-S0306457324002206-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A framework for predicting scientific disruption based on graph signal processing","authors":"Houqiang Yu, Yian Liang","doi":"10.1016/j.ipm.2024.103863","DOIUrl":"10.1016/j.ipm.2024.103863","url":null,"abstract":"<div><p>Identifying scientific disruption is widely recognized as challenging, and predicting it is even more so. We suggest that better predictions are hindered by the inability to integrate multidimensional information and the limited scalability of existing methods. This paper develops a framework based on graph signal processing (GSP) to predict scientific disruption, achieving an average AUC of about 80 % on benchmark datasets, surpassing the performance of prior methods by 13.6 % on average. The framework is unified, adaptable to any type of information, and scalable, with the potential for further enhancements using technologies from GSP. The intuition behind this framework is as follows: scientific disruption is characterized by dramatic changes in scientific evolution, which is recognized as a complex system represented by a graph, and GSP is a technique that specializes in analyzing data on graph structures; thus, we argue that GSP is well-suited for modeling scientific evolution and predicting disruption. Based on this proposed framework, we proceed with disruption predictions. The content, context, and (citation) structure information are each defined as graph signals. The total variations of these graph signals, which measure the evolutionary amplitude, are the main predictors. To illustrate the unity and scalability of our framework, altmetrics data (online mentions of the paper), which have seldom been considered previously, are defined as a graph signal, and another indicator, the dispersion entropy of the graph signal (measuring the chaos of scientific evolution), is also used for prediction. Our framework also offers interpretability, supporting a better understanding of scientific disruption. 
The analysis indicates that scientific disruption results in dramatic changes not only in knowledge content but also in context (e.g., journals and authors), and leads to chaos in subsequent evolution. Finally, several practical future directions for disruption predictions based on the framework are proposed.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shengzhi Huang, Wei Lu, Qikai Cheng, Zhuoran Luo, Yong Huang
{"title":"Evolutions of semantic consistency in research topic via contextualized word embedding","authors":"Shengzhi Huang , Wei Lu , Qikai Cheng , Zhuoran Luo , Yong Huang","doi":"10.1016/j.ipm.2024.103859","DOIUrl":"10.1016/j.ipm.2024.103859","url":null,"abstract":"<div><p>Topic evolution has been studied extensively in the field of the science of science. This study first analyzes topic evolution patterns from the perspective of topics’ semantic consistency in the semantic vector space, and explores their possible causes. Specifically, we extract papers in the computer science field from Microsoft Academic Graph as our dataset. We propose a novel method for encoding a topic with numerous Contextualized Word Embeddings (CWE), in which the title and abstract fields of papers studying the topic are taken as its context. Subsequently, we employ three geometric metrics to analyze topics’ semantic consistency over time, from which the influence of the anisotropy of CWE is excluded. The K-Means clustering algorithm is employed to identify four general evolution patterns of semantic consistency: semantic consistency increases (IM), decreases (DM), increases first and then decreases (Inverted U-shape), and decreases first and then increases (U-shape). We also find that research methods tend to show DM and U-shape patterns, whereas research questions tend to show IM and Inverted U-shape patterns. Finally, we use regression analysis to explore whether and, if so, how a series of key features of a topic affect its semantic consistency. Importantly, the semantic consistency of a topic varies inversely with the semantic similarity between the topic and other topics. 
Overall, this study sheds light on the evolution law of topics, and helps researchers to understand these patterns from a geometric perspective.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rank aggregation with limited information based on link prediction","authors":"Guanghui Li , Yu Xiao , Jun Wu","doi":"10.1016/j.ipm.2024.103860","DOIUrl":"10.1016/j.ipm.2024.103860","url":null,"abstract":"<div><p>Rank aggregation is a vital tool in facilitating decision-making processes that consider multiple criteria or attributes. In many applications, however, the available ranked lists are limited and often partial for various reasons. This scarcity of ranking information presents a significant challenge to rank aggregation effectiveness. To address this problem of rank aggregation with limited information, in this study, on the basis of a networked representation of ranking information, we employ link prediction to mine potential ranking information. The approach aims to optimize the aggregation process and maximize aggregation effectiveness using the limited information available. Experimental results indicate that our proposed approach can significantly enhance the aggregation effectiveness of existing rank aggregation methods, such as Borda’s method, the competition graph method, and the Markov chain method. Our work provides a new way to solve the rank aggregation problem with limited information and develops a new research paradigm for future rank aggregation studies from the perspective of network science.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-stakeholder recommendation system through deep learning-based preference evaluation and aggregation model with multi-view information embedding","authors":"Rahul Shrivastava, Dilip Singh Sisodia, Naresh Kumar Nagwani","doi":"10.1016/j.ipm.2024.103862","DOIUrl":"10.1016/j.ipm.2024.103862","url":null,"abstract":"<div><p>Learning the preferences of consumers, providers, and system stakeholders is a challenging problem in the Multi-Stakeholder Recommendation System (MSRS). Existing MSRS methods lack the ability to generate equitable recommendations and investigate implicit relationships between stakeholders and items. Hence, this study addresses this issue by proposing a multi-stakeholder preference learning-based recommendation model that exploits information from multiple views to evaluate stakeholders' preferences. The proposed model learns consumer preferences using users' ratings and reviews of an item, and provider preferences using provider utility and provider-item interaction. Furthermore, the proposed model learns the system-level preference of promoting long-tail items through the probabilistic evaluation of stakeholders' interest in popular and unpopular items. Finally, this study develops a multi-stakeholder, multi-view deep neural network model to aggregate stakeholders' preferences and deliver equitable recommendations. This work uses the benchmark MovieLens (ML) 25M, ML-100K, ML-1M, and TripAdvisor datasets to validate the proposed model and compare its performance with baseline methods using standard evaluation metrics for each stakeholder. On the precision metrics, the proposed model attains minimum improvements of 7.91%, 18.24%, 10.72%, and 20.12% on the ML-25M, ML-100K, ML-1M, and TripAdvisor datasets, respectively. On the exposure, hit, and reach metrics, the model exhibits substantial minimum improvements of 19.12%, 14.73%, 5.37%, and 28.46% on the ML-25M, ML-100K, ML-1M, and TripAdvisor datasets, respectively. 
Finally, the proposed model excels in promoting long-tail items and enhancing the cumulative utility gain of the stakeholders, surpassing the baseline methods.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141951239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qingtao Pan, Haosen Wang, Jun Tang, Zhaolin Lv, Zining Wang, Xian Wu, Yirun Ruan, Tianyuan Yv, Mingrui Lao
{"title":"EIOA: A computing expectation-based influence evaluation method in weighted hypergraphs","authors":"Qingtao Pan , Haosen Wang , Jun Tang, Zhaolin Lv, Zining Wang, Xian Wu, Yirun Ruan, Tianyuan Yv, Mingrui Lao","doi":"10.1016/j.ipm.2024.103856","DOIUrl":"10.1016/j.ipm.2024.103856","url":null,"abstract":"<div><p>Influence maximization (IM) is a key issue in network science. However, previous research on IM has mainly explored binary interaction relationships in ordinary graphs, with little consideration for the more practical higher-order interactions in hypergraphs, especially weighted hypergraphs. Therefore, this study focuses on solving the IM problem in weighted hypergraphs. Firstly, we adopt a novel and more reasonable dissemination model, namely adaptive dissemination (AD), and incorporate it into weighted hypergraphs. Next, a computing expectation-based influence evaluation method is proposed to accurately obtain the expected influence in one-hop area (EIOA) of the seed node set. Meanwhile, three search algorithms are designed using the EIOA to effectively select the initial seed set. Then, multi-level experiments are conducted to compare the proposed algorithms with six other advanced algorithms on eight real-world weighted hypergraph datasets. The experimental results are visually analyzed, and two nonparametric tests are applied to verify the significant advantages of the proposed algorithms. 
Finally, the impact of different factors such as seed set correlation, model parameter setting, and weight attribute on dissemination is explored, and the efficiency and robustness of these algorithms are further validated.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}