{"title":"A hybrid feature fusion deep learning framework for multi-source medical image analysis","authors":"Qiang Cao , Xian Cheng","doi":"10.1016/j.ipm.2024.103934","DOIUrl":"10.1016/j.ipm.2024.103934","url":null,"abstract":"<div><div>Despite the widespread adoption of deep learning to enhance image classification, significant obstacles remain. First, multisource data with diverse sizes and formats pose a great challenge for most current deep learning models. Second, the lack of manually labeled data for model training limits the application of deep learning. Third, the widely used CNN-based methods show limitations in extracting global features and yield poor performance on image topology. To address these issues, we propose a Hybrid Feature Fusion Deep Learning (HFFDL) framework for image classification. This framework consists of an automated image segmentation module, a two-stream backbone module, and a classification module. The automatic image segmentation module utilizes the U-Net model and transfer learning to detect regions of interest (ROIs) in multisource images; the two-stream backbone module integrates the Swin Transformer architecture with the Inception CNN, with the aim of simultaneously extracting local and global features for efficient representation learning. We evaluate the performance of the HFFDL framework on two publicly available image datasets: one for identifying COVID-19 through X-ray scans of the chest (30,386 images), and another for multiclass skin cancer screening using dermoscopy images (25,331 images). The HFFDL framework outperformed many cutting-edge models, achieving AUC scores of 0.9835 and 0.8789, respectively. 
Furthermore, a practical application study conducted in a hospital, identifying viable embryos using medical images, revealed the HFFDL framework outperformed embryologists.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142535923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examining communication network behaviors, structure and dynamics in an organizational hierarchy: A social network analysis approach","authors":"Tao Wen , Yu-wang Chen , Tahir Abbas Syed , Darminder Ghataoura","doi":"10.1016/j.ipm.2024.103927","DOIUrl":"10.1016/j.ipm.2024.103927","url":null,"abstract":"<div><div>Effectively understanding and enhancing communication flows among employees within an organizational hierarchy is crucial for optimizing operational and decision-making efficiency. To fill this significant gap in research, we propose a systematic and comprehensive social network analysis approach, coupled with a newly formulated communication vector and matrix, to examine communication behaviors and dynamics in an organizational hierarchy. We use the Enron email dataset, consisting of 619,499 emails, as an illustrative example to bridge the micro-macro divide of organizational communication research. A series of centrality measures are employed to evaluate the influential ability of individual employees, revealing descending influential ability and changing behaviors according to hierarchy. We also uncover that employees tend to communicate within the same functional teams through the identification of community structure and the proposed communication matrix. Furthermore, the emergent dynamics of organizational communication during a crisis are examined through a time-segmented dataset, showcasing the progressive absence of the legal team, the responsibility of top management, and the presence of hierarchy. 
By considering both individual and organizational perspectives, our work provides a systematic and data-driven approach to understanding how the organizational communication network emerges dynamically from individual communication behaviors within the hierarchy, which has the potential to enhance operational and decision-making efficiency within organizations.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142535922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantifying the degree of scientific innovation breakthrough: Considering knowledge trajectory change and impact","authors":"Lin Runhui , Li Yalin , Ji Ze , Xie Qiqi , Chen Xiaoyu","doi":"10.1016/j.ipm.2024.103933","DOIUrl":"10.1016/j.ipm.2024.103933","url":null,"abstract":"<div><div>Scientific breakthroughs have the potential to reshape the trajectory of knowledge flow and significantly impact later research. The aim of this study is to introduce the Degree of Innovation Breakthrough (DIB) metric to more accurately quantify the extent of scientific breakthroughs. The DIB metric takes into account changes in the trajectory of knowledge flow, as well as the depth and width of impact, and it modifies the traditional assumption of equal citation contributions by assigning weighted citation counts. The effectiveness of the DIB metric is assessed using ROC curves and AUC metrics, demonstrating its ability to differentiate between high and low scientific breakthroughs with high sensitivity and minimal false positives. Based on ROC curves, this study proposes a method to calculate the threshold for high scientific breakthrough, reducing subjectivity. The effectiveness of the proposed method is demonstrated through a dataset consisting of 1108 award-winning computer science papers and 9832 matched control papers, showing that the DIB metric surpasses single-dimensional metrics. The study also performs a granular analysis of the innovation breakthrough degree of non-award-winning papers, categorizing them into four types based on originality and impact through 2D histogram visualization, and suggests tailored management strategies. Through the adoption of this refined classification strategy, the management of innovation practices can be optimized, ultimately fostering the enhancement of innovative research outcomes. 
The quantitative tools introduced in this paper offer guidance for researchers in the fields of science intelligence mining and science trend prediction.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142536025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Triple Sparse Denoising Discriminantive Least Squares Regression for image classification","authors":"Jinjin Zhang, Qimeng Fan, Dingan Wang, Pu Huang, Zhangjing Yang","doi":"10.1016/j.ipm.2024.103922","DOIUrl":"10.1016/j.ipm.2024.103922","url":null,"abstract":"<div><div>Discriminantive Least Squares Regression (DLSR) is an algorithm that employs the <span><math><mi>ɛ</mi></math></span>-dragging technique to enhance intra-class similarity. However, it overlooks that an increase in intra-class closeness may simultaneously lead to a decrease in the distance between similar but different classes. To address this issue, we propose a new approach called Triple Sparse Denoising Discriminantive Least Squares Regression (TSDDLSR), which combines three sparsity constraints: sparsity constraints between classes to amplify the growth of the distance between similar classes; sparsity constraints on relaxation matrices to capture more local structure; and sparsity constraints on noise matrices to minimize the effect of outliers. In addition, we position the matrix decomposition step in the label space strategically with the objective of enhancing denoising capabilities, safeguarding it from potential degradation, and preserving its underlying manifold structure. Our experiments evaluate the classification performance of the method on face recognition tasks (AR, CMU PIE, Extended Yale B, Georgia Tech, FERET datasets), biometric recognition tasks (PolyU Palmprint dataset), and object recognition tasks (COIL-20, ImageNet datasets). 
Meanwhile, the results show that TSDDLSR significantly improves classification performance compared to existing methods.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142536026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised feature selection using sparse manifold learning: Auto-encoder approach","authors":"Amir Moslemi , Mina Jamshidi","doi":"10.1016/j.ipm.2024.103923","DOIUrl":"10.1016/j.ipm.2024.103923","url":null,"abstract":"<div><div>Feature selection techniques are widely used as a preprocessing step for training machine learning algorithms, to circumvent the curse of dimensionality, overfitting, and computation time challenges. Projection-based methods are frequently employed in feature selection, leveraging the extraction of linear relationships among features. The absence of nonlinear information extraction among features is notable in this context. While auto-encoder based techniques have recently gained traction for feature selection, their focus remains primarily on the encoding phase, as it is through this phase that the selected features are derived. The subtle point is that the ability of an auto-encoder to obtain the most discriminative features is significantly affected by the decoding phase. To address these challenges, in this paper, we propose a novel auto-encoder based feature selection method that not only extracts nonlinear information among features but also regularizes the decoding phase to enhance the performance of the algorithm. In this study, we define a new auto-encoder model that keeps the topological information of the reconstructed data close to that of the input data. The geometric structure of the input data is preserved in the projected space using a Laplacian graph, and the geometry of the projected space is preserved in the reconstructed space using a suitable term (the abstract Laplacian graph of the reconstructed data) in the optimization problem. Keeping the abstract Laplacian graph of the reconstructed data close to the Laplacian graph of the input data affects the performance of feature selection, as we show experimentally. We also present an effective approach to solve the corresponding objective. 
Since this approach is mainly intended for clustering, we conducted experiments on ten benchmark datasets and assessed our proposed method using clustering accuracy and the normalized mutual information (NMI) metric. Our method showed considerable superiority over recent state-of-the-art techniques in terms of NMI and accuracy.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142446437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EvoPath: Evolutionary meta-path discovery with large language models for complex heterogeneous information networks","authors":"Shixuan Liu , Haoxiang Cheng , Yunfei Wang , Yue He , Changjun Fan , Zhong Liu","doi":"10.1016/j.ipm.2024.103920","DOIUrl":"10.1016/j.ipm.2024.103920","url":null,"abstract":"<div><div>Heterogeneous Information Networks (HINs) encapsulate diverse entity and relation types, with meta-paths providing essential meta-level semantics for knowledge reasoning, although their utility is constrained by discovery challenges. While Large Language Models (LLMs) offer new prospects for meta-path discovery due to their extensive knowledge encoding and efficiency, their adaptation faces challenges such as corpora bias, lexical discrepancies, and hallucination. This paper pioneers the mitigation of these challenges by presenting EvoPath, an innovative framework that leverages LLMs to efficiently identify high-quality meta-paths. EvoPath is carefully designed, with each component aimed at addressing issues that could lead to potential knowledge conflicts. With a minimal subset of HIN facts, EvoPath iteratively generates and evolves meta-paths by dynamically replaying meta-paths in the buffer with prioritization based on their scores. Comprehensive experiments on three large, complex HINs with hundreds of relations demonstrate that our framework, EvoPath, enables LLMs to generate high-quality meta-paths through effective prompting, confirming its superior performance in HIN reasoning tasks. 
Further ablation studies validate the effectiveness of each module within the framework.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142536024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A diachronic language model for long-time span classical Chinese","authors":"Yuting Wei, Meiling Li, Yangfu Zhu, Yuanxing Xu, Yuqing Li, Bin Wu","doi":"10.1016/j.ipm.2024.103925","DOIUrl":"10.1016/j.ipm.2024.103925","url":null,"abstract":"<div><div>Classical Chinese literature, with its long history spanning thousands of years, serves as an invaluable resource for historical and humanistic studies. Previous classical Chinese language models have achieved significant progress in semantic understanding. However, they largely neglected the dynamic evolution of language across different historical eras. In this paper, we introduce a novel diachronic pre-trained language model tailored for classical Chinese texts. This model utilizes a time-based transformer architecture that captures the continuous evolution of semantics over time. Moreover, it adeptly balances the contextual and temporal information, minimizing semantic ambiguities from excessive time-related inputs. A high-quality diachronic corpus for classical Chinese is developed for training. This corpus spans from the pre-Qin dynasty to the Qing dynasty and includes a diverse array of genres. We validate its effectiveness by enriching a well-known classical Chinese word sense disambiguation dataset with additional temporal annotations. The results demonstrate the state-of-the-art performance of our model in discerning classical Chinese word meanings across different historical periods. 
Our research helps linguists to rapidly grasp the extent of semantic changes across different periods from vast corpora.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142440957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"mm-FERP: An effective method for human personality prediction via mm-wave radar using facial sensing","authors":"Naveed Imran , Jian Zhang , Zheng Yang , Jehad Ali","doi":"10.1016/j.ipm.2024.103919","DOIUrl":"10.1016/j.ipm.2024.103919","url":null,"abstract":"<div><div>mm-FERP (millimeter wave Facial Expression Recognition for Personality) explores the use of mm-Wave radar technology, specifically the TI IWR1443, to assess personality traits based on the OCEAN model through facial expression analysis. This research uniquely combines psychological profiling with state-of-the-art technology to predict the OCEAN (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) personality traits by carefully analyzing facial muscle movements collected through mm-wave radar alongside detailed questionnaire analysis. Our advanced mm-FERP system employs mm-wave radar technology for the detection and analysis of facial expressions in a manner that is both non-intrusive and privacy-centric, handling the ethical and privacy concerns associated with traditional camera-based methods. Using a convolutional neural network (CNN), mm-FERP effectively analyzes the complex patterns in mm-wave signals. This approach enables the smooth transfer of model knowledge from extensive image-based (Scalograms) datasets to the detailed understanding of mm-wave radar signals, significantly enhancing the model’s predictive accuracy and efficiency in identifying personality traits via emotional behavior. Our in-depth evaluation reveals mm-FERP’s remarkable potential to predict personality traits through emotion recognition (Neutral, Smile, Angry, Sad, Amazed) with an impressive accuracy of 97% across distances up to 0.47 m. We experiment in a controlled environment with more than 50 participants from different age groups (18–35) including males and females of different continents to train our model on different facial symmetry. 
Each participant provided 50 samples, 10 for each expression, for a total of 2500 samples. We also collected from the same participants a self-assessment report of 64 questions related to psychological behavior, to validate personality by correlating it with radar signal features on question value weight (0.5–1.5). mm-FERP achieves an average score of 97.8% in precision, 97.2% in recall, and 97.2% in F1. These results show mm-FERP’s ability as an innovative approach for psychological behavioral analysis through mm-wave emotion recognition, improving user experience design, and paving the path for interactive technologies that are both personalized and psychologically insightful.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142440959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CGCN: Context graph convolutional network for few-shot temporal action localization","authors":"Shihui Zhang , Houlin Wang , Lei Wang , Xueqiang Han , Qing Tian","doi":"10.1016/j.ipm.2024.103926","DOIUrl":"10.1016/j.ipm.2024.103926","url":null,"abstract":"<div><div>Localizing human actions in videos has attracted extensive attention from industry and academia. Few-Shot Temporal Action Localization (FS-TAL) aims to detect human actions in untrimmed videos using a limited number of training samples. Existing FS-TAL methods usually ignore the semantic context between video snippets, making it difficult to detect actions during the query process. In this paper, we propose a novel FS-TAL method named Context Graph Convolutional Network (CGCN) which employs multi-scale graph convolution to aggregate semantic context between video snippets in addition to exploiting their temporal context. Specifically, CGCN constructs a graph for each scale of a video, where each video snippet is a node, and the relationships between the snippets are edges. There are three types of edges, namely sequence edges, intra-action edges, and inter-action edges. CGCN establishes sequence edges to enhance temporal expression. Intra-action edges utilize hyperbolic space to encapsulate context among video snippets within each action, while inter-action edges leverage Euclidean space to capture similar semantics between different actions. Through graph convolution on each scale, CGCN enables the acquisition of richer and context-aware video representations. Experiments demonstrate CGCN outperforms the second-best method by 4.5%/0.9% and 4.3%/0.9% mAP on the ActivityNet and THUMOS14 datasets in one-shot/five-shot scenarios, respectively, at [email protected]. 
The source code can be found at https://github.com/mugenggeng/CGCN.git.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142433121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IterSum: Iterative summarization based on document topological structure","authors":"Shuai Yu , Wei Gao , Yongbin Qin , Caiwei Yang , Ruizhang Huang , Yanping Chen , Chuan Lin","doi":"10.1016/j.ipm.2024.103918","DOIUrl":"10.1016/j.ipm.2024.103918","url":null,"abstract":"<div><div>Document structure plays a crucial role in understanding and analyzing document information. However, effectively encoding document structural features into the Transformer architecture faces significant challenges. This is primarily because different types of documents require the model to adopt varying structural encoding strategies, leading to a lack of a unified framework that can broadly adapt to different document types to leverage their structural properties. Despite the diversity of document types, sentences within a document are interconnected through semantic relationships, forming a topological semantic network. This topological structure is essential for integrating and summarizing information within the document. In this work, we introduce IterSum, a versatile text summarization framework applicable to various types of text. In IterSum, we utilize the document’s topological structure to divide the text into multiple blocks, first generating a summary for the initial block, then combining the current summary with the content of the next block to produce the subsequent summary, and continuing in this iterative manner until the final summary is generated. We validated our model on nine different types of public datasets, including news, knowledge bases, legal documents, and guidelines. Both quantitative and qualitative analyses were conducted, and the experimental results show that our model achieves state-of-the-art performance on all nine datasets measured by ROUGE scores. We also explored low-resource summarization, finding that even with only 10 or 100 samples in multiple datasets, top-notch results were obtained. 
Finally, we conducted human evaluations to further validate the superiority of our model.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142437742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}