An approach for identifying complementary patents based on deep learning

IF 3.4 · Zone 2 (Management) · JCR Q2 (Computer Science, Interdisciplinary Applications)

Journal of Informetrics Pub Date : 2024-07-13 DOI:10.1016/j.joi.2024.101561

Jinzhu Zhang, Jialu Shi, Peiyu Zhang

Journal of Informetrics, Volume 18, Issue 3, Article 101561. Full text: https://www.sciencedirect.com/science/article/pii/S1751157724000749

Citations: 0

Abstract

Current studies on technology mining and analysis often focus on patent similarity, with relatively little research on patent complementarity. Specifically, the hierarchical relationships among patents are seldom used, and no standardized dataset of complementary patents has been established. In addition, it is necessary to use both the network structure features and the text content features of patents, and to find the most suitable representation learning method for each. Finally, the relationships among different dimensions of the feature representations are complex, so the contribution of each dimension must be learned with these interactions taken into account. This paper therefore first constructs a complementary patent dataset using the hierarchical relationships contained in IPC numbers. Second, we design three types of embedding methods for patent semantic representation: network embedding, text embedding, and fusion embedding. Third, we propose a deep learning framework enhanced by CBAM (Convolutional Block Attention Module) to handle the complex interactions between different dimensions of the patent representation. The results show that the proposed method, CompGCN combined with ESimCSE_Attention, performs best for complementary patent identification, with an F1 score of 95.76%. In addition, HeGAN and ESimCSE_Attention are the most suitable embedding methods for network structure and text content, respectively. These results validate the effectiveness of the proposed approach and offer useful guidance for method selection and for mining complex relationships.
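To make the dataset-construction step more concrete, the following is a minimal sketch of how the hierarchy encoded in an IPC symbol (section, class, subclass, main group, subgroup) can be read off and compared between two patents. The abstract does not state the rule the authors use to label a pair as complementary, so the `candidate_pairs` criterion below (pairs that share exactly `depth` hierarchy levels) is a hypothetical placeholder, as are the function names and the sample patent identifiers.

```python
import re
from itertools import combinations
from typing import NamedTuple


class IPC(NamedTuple):
    """The five hierarchy levels of one IPC symbol, coarse to fine."""
    section: str     # e.g. "G"
    ipc_class: str   # e.g. "G06"
    subclass: str    # e.g. "G06F"
    main_group: str  # e.g. "G06F 17"
    subgroup: str    # e.g. "G06F 17/30"


# Section letter A-H, two-digit class, subclass letter, main group "/" subgroup.
_IPC_RE = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_ipc(code: str) -> IPC:
    """Split an IPC symbol such as 'G06F 17/30' into its hierarchy levels."""
    match = _IPC_RE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a recognizable IPC symbol: {code!r}")
    sec, cls, sub, grp, sgrp = match.groups()
    subclass = f"{sec}{cls}{sub}"
    return IPC(
        section=sec,
        ipc_class=f"{sec}{cls}",
        subclass=subclass,
        main_group=f"{subclass} {grp}",
        subgroup=f"{subclass} {grp}/{sgrp}",
    )


def shared_depth(a: IPC, b: IPC) -> int:
    """Count how many levels, starting from the section, two symbols share."""
    depth = 0
    for level_a, level_b in zip(a, b):
        if level_a != level_b:
            break
        depth += 1
    return depth


def candidate_pairs(patents: dict[str, str], depth: int = 3) -> list[tuple[str, str]]:
    """Hypothetical labeling rule (an assumption, not the paper's definition):
    keep pairs whose IPC symbols share exactly `depth` hierarchy levels."""
    parsed = {pid: parse_ipc(code) for pid, code in patents.items()}
    return [
        (p, q)
        for p, q in combinations(sorted(parsed), 2)
        if shared_depth(parsed[p], parsed[q]) == depth
    ]


if __name__ == "__main__":
    # Made-up patent identifiers paired with real-format IPC symbols.
    sample = {"P1": "G06F 17/30", "P2": "G06F 16/90", "P3": "H04L 9/32"}
    print(candidate_pairs(sample, depth=3))
```

With `depth=3`, a pair qualifies when both patents fall in the same subclass but in different main groups. Changing `depth` moves the cut-off up or down the hierarchy, which is one way such a labeling rule could be tuned.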


Source journal

Journal of Informetrics (Social Sciences: Library and Information Sciences)

CiteScore: 6.40

Self-citation rate: 16.20%


About the journal: Journal of Informetrics (JOI) publishes rigorous, high-quality research on quantitative aspects of information science. The main focus of the journal is on topics in bibliometrics, scientometrics, webometrics, patentometrics, altmetrics, and research evaluation. Contributions that study informetric problems using methods from other quantitative fields, such as mathematics, statistics, computer science, economics and econometrics, and network science, are especially encouraged. JOI publishes both theoretical and empirical work. In general, case studies, for instance a bibliometric analysis focusing on a specific research field or a specific country, are not considered suitable for publication in JOI unless they contain innovative methodological elements.