Jiaming Ding;Anning Wang;Kenneth Guang-Lih Huang;Qiang Zhang;Shanlin Yang
Title: Improving Large-Scale Classification in Technology Management: Making Full Use of Label Information for Professional Technical Documents
Journal: IEEE Transactions on Engineering Management, vol. 71, pp. 15188-15208
Published: 2024-10-16
DOI: 10.1109/TEM.2024.3481439
URL: https://ieeexplore.ieee.org/document/10720519/
Impact factor: 4.6 (JCR Q1, Business)
Open access: no
Citations: 0
Abstract
Professional technical documents (PTDs) offer a wealth of information for R&D personnel and innovation management scholars. The recent growth in both the number of categories and the volume of PTDs has introduced new challenges for their automatic and accurate classification. Existing studies have focused on leveraging the semantic information of the documents themselves (i.e., titles and abstracts) for classification tasks, while the standard label hierarchy of classification systems and the rich semantic information carried by the labels have generally been ignored. In this paper, we propose a supervised learning-based classification model, designed to Make Full Use of Label Information (MFULI), for hierarchical multi-label PTD classification. First, we deploy a Label-aware Supervised Contrastive Learning Module (LSCLM), which introduces a definition of label set similarity with the aim of improving document representation. Then, we propose a Hierarchy-aware Label Embedding Attentive Module (HLEAM) that dynamically incorporates label hierarchy information into the classification model. We evaluate the proposed model on two public patent datasets, USPTO-1 and WIPO-alpha. Experimental results show that our model outperforms other state-of-the-art classification models. Furthermore, we perform a series of ablation studies and analyses to demonstrate the necessity of each component of our model. This paper provides important theoretical contributions and practical implications for innovation and technology management.
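The abstract does not spell out how label set similarity enters the contrastive objective, so the following is only a minimal sketch of the general idea, not the authors' LSCLM. It assumes Jaccard overlap as the label set similarity, cosine similarity between document embeddings, and a softmax-style contrastive loss in which each pair is weighted by its label overlap. The function names `jaccard`, `cosine`, and `label_aware_contrastive_loss`, the `temperature` parameter, and the toy labels are all hypothetical.

```python
import math


def jaccard(a, b):
    """Jaccard overlap of two label sets (an assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def label_aware_contrastive_loss(embeddings, label_sets, temperature=0.1):
    """Contrastive loss where pairs are weighted by label set overlap.

    For each anchor document, every other document in the batch acts as a
    soft positive whose weight is the Jaccard overlap of the two label sets.
    Anchors that share no label with any other document are skipped.
    """
    n = len(embeddings)
    total, counted = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        logits = {j: cosine(embeddings[i], embeddings[j]) / temperature
                  for j in others}
        # Numerically stable log-sum-exp over all other documents.
        peak = max(logits.values())
        log_denom = peak + math.log(
            sum(math.exp(value - peak) for value in logits.values()))
        weights = {j: jaccard(label_sets[i], label_sets[j]) for j in others}
        weight_sum = sum(weights.values())
        if weight_sum == 0.0:
            continue
        weighted_log_prob = sum(
            w * (logits[j] - log_denom) for j, w in weights.items())
        total += -weighted_log_prob / weight_sum
        counted += 1
    return total / counted if counted else 0.0


# Toy batch: documents 0 and 1 have nearby embeddings, document 2 is far away.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
aligned_labels = [{"G06F", "H04L"}, {"G06F"}, {"A61K"}]
misaligned_labels = [{"G06F"}, {"A61K"}, {"G06F"}]

aligned_loss = label_aware_contrastive_loss(toy_embeddings, aligned_labels)
misaligned_loss = label_aware_contrastive_loss(toy_embeddings, misaligned_labels)
```

In this toy batch the loss is small when documents with overlapping label sets already sit close together in embedding space and much larger when they do not, which is the pressure a label-aware contrastive term is meant to apply during training.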
Managerial Relevance Statement: This study helps advance the field of R&D, innovation, and technology management by introducing a novel supervised learning-based classification model for professional technical documents (PTDs). Our proposed approach, termed Making Full Use of Label Information (MFULI), is specifically designed for hierarchical multi-label PTD classification, addressing the challenges posed by the growing diversity and volume of PTDs. By integrating the Label-aware Supervised Contrastive Learning Module (LSCLM) and the Hierarchy-aware Label Embedding Attentive Module (HLEAM), MFULI significantly enhances document representation and classification accuracy. The experimental validation of the model on public patent datasets underscores its practical utility and its superiority over existing state-of-the-art models. For managers and practitioners in R&D, innovation, and technology management, the implications of this research are profound. Our study provides significant contributions to the fields of technology and innovation management, engineering management, and automated document classification, yielding both theoretical insights and practical implications. The model's ability to effectively categorize large-scale PTDs aids in streamlining knowledge management processes, enhancing decision-making, and fostering more efficient innovation strategies. In summary, this research offers a robust and innovative tool for managing PTDs, contributing to the more effective handling of critical information for innovation and technology management.
About the journal:
The journal covers the management of technical functions such as research, development, and engineering (RD&E) in industry, government, university, and other settings. Emphasis is on studies carried out within an organization to support decision making or policy formation for RD&E.