Automated Single-Label Patent Classification using Ensemble Classifiers

2022 14th International Conference on Machine Learning and Computing (ICMLC), Pub Date: 2022-02-18, DOI: 10.1145/3529836.3529849

Eleni Kamateri, Vasileios Stamatis, K. Diamantaras, M. Salampasis


Citations: 4

Abstract



Many thousands of patent applications arrive at patent offices around the world every day. An important task when a patent application is submitted is to assign one or more classification codes from the complex, hierarchical patent classification schemes, so that the application can be routed to a patent examiner who is knowledgeable about the specific technical field. This task is typically undertaken by patent professionals; however, due to the large number of applications and the potential complexity of an invention, they are usually overwhelmed. There is therefore a need for this manual code-assignment task to be supported, or even fully automated, by classification systems that classify patent applications, ideally with an accuracy close to that of patent professionals. As in many other text analysis problems, this intellectually demanding task has been studied in recent years using word embeddings and deep learning techniques. In this paper, these research efforts are briefly reviewed and reproduced with similar deep learning techniques, using different feature representations, for automatic patent classification at the subclass level. In addition, an innovative method of ensemble classifiers trained on different parts of the patent document is proposed. To the best of our knowledge, this is the first time that an ensemble method has been proposed for the patent classification problem. Our first results are quite promising, showing that an ensemble architecture of classifiers significantly outperforms current state-of-the-art techniques that use the same classifiers as standalone solutions.
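To make the ensemble idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the per-section classifiers are combined, so weighted soft voting (averaging class probabilities) is an assumption here. The function names `soft_vote` and `predict_single_label`, the choice of sections (`title`, `abstract`, `claims`), and all probability values are hypothetical; in practice each distribution would come from a deep learning classifier trained on one part of the patent document.

```python
from typing import Dict, Mapping, Optional


def soft_vote(
    section_probs: Mapping[str, Mapping[str, float]],
    weights: Optional[Mapping[str, float]] = None,
) -> Dict[str, float]:
    """Combine per-section class probabilities by (weighted) averaging.

    Assumption: soft voting is used as the combination rule; the source
    abstract does not specify one.
    """
    if not section_probs:
        raise ValueError("need the output of at least one section classifier")
    if weights is None:
        # Default: every section classifier gets an equal say.
        weights = {section: 1.0 for section in section_probs}
    total_weight = sum(weights[section] for section in section_probs)

    combined: Dict[str, float] = {}
    for section, probs in section_probs.items():
        share = weights[section] / total_weight
        for label, p in probs.items():
            combined[label] = combined.get(label, 0.0) + share * p
    return combined


def predict_single_label(
    section_probs: Mapping[str, Mapping[str, float]],
    weights: Optional[Mapping[str, float]] = None,
) -> str:
    """Single-label decision: return the subclass with the highest combined score."""
    combined = soft_vote(section_probs, weights)
    return max(combined, key=combined.get)


# Hypothetical outputs of three classifiers, each trained on a different
# part of the same patent document. The labels are subclass-level codes
# used purely for illustration.
example = {
    "title": {"G06F": 0.5, "H04L": 0.4, "G06N": 0.1},
    "abstract": {"G06F": 0.3, "H04L": 0.3, "G06N": 0.4},
    "claims": {"G06F": 0.2, "H04L": 0.2, "G06N": 0.6},
}

if __name__ == "__main__":
    print(predict_single_label(example))
```

In this toy input the title classifier alone would pick `G06F`, while the averaged ensemble picks `G06N`, which illustrates how evidence from several document parts can override a single standalone classifier.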

