Guillaume Guarino, Ahmed Samet, Amir Nafi, D. Cavallucci
2021 IEEE International Conference on Data Mining (ICDM), published 2021-12-01. DOI: 10.1109/ICDM51629.2021.00126 (https://doi.org/10.1109/ICDM51629.2021.00126)
PaGAN: Generative Adversarial Network for Patent understanding
In recent years, deep learning methods, especially transformer-based architectures, have become very popular in Natural Language Processing (NLP). These models require a large volume of annotated data to perform well. Unfortunately, obtaining large amounts of high-quality labeled data is expensive and time-consuming. One promising method that has stood out for its performance when labeled data is scarce is semi-supervised learning with Generative Adversarial Networks (GANs). In this paper, we propose a new approach called PaGAN, which combines a document classifier and a sentence-level classifier inside a GAN for patent document understanding. The idea is to mine the patent's motivating problem (known as a contradiction in the TRIZ domain), which is fundamentally important for understanding the underlying invention and its originality. PaGAN is applied to and evaluated on a real-world dataset. Experiments show that PaGAN outperforms baseline approaches.
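The abstract does not state PaGAN's loss function, so the following is only a rough sketch of the general semi-supervised GAN idea it builds on, not the authors' implementation. It assumes one common formulation in which the discriminator outputs K real classes plus one extra "generated" class, and combines a supervised term on labeled samples with two unsupervised terms on unlabeled and generated samples. The function names, the K+1 layout, and the example logits are all assumptions made for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def discriminator_loss(labeled, unlabeled, generated):
    """Semi-supervised discriminator loss for a (K+1)-class discriminator.

    Assumed layout (hypothetical, not taken from the paper): each logits
    list has K+1 entries, and the LAST entry is the "generated" class.

    labeled:   list of (logits, class_index) pairs for annotated samples
    unlabeled: list of logits for real samples without annotation
    generated: list of logits for samples produced by the generator
    """
    # Supervised term: cross-entropy of the true class over all K+1 outputs.
    supervised = 0.0
    for logits, label in labeled:
        supervised -= logits[label] - logsumexp(logits)
    supervised /= len(labeled)

    # Unlabeled real samples: maximise the total probability of the K real
    # classes, i.e. minimise -log(1 - p_generated).
    unsup_real = 0.0
    for logits in unlabeled:
        unsup_real -= logsumexp(logits[:-1]) - logsumexp(logits)
    unsup_real /= len(unlabeled)

    # Generated samples: maximise the probability of the "generated" class.
    unsup_fake = 0.0
    for logits in generated:
        unsup_fake -= logits[-1] - logsumexp(logits)
    unsup_fake /= len(generated)

    return supervised + unsup_real + unsup_fake


# Example with K = 2 hypothetical classes plus the "generated" class.
uninformative = [0.0, 0.0, 0.0]
loss = discriminator_loss([(uninformative, 0)], [uninformative], [uninformative])
print(f"loss with uninformative logits: {loss:.4f}")
```

The point of the unsupervised terms is that unlabeled patents still contribute a training signal: the discriminator only has to decide that they are real, not which class they belong to, which is why this family of methods is attractive when annotation is expensive.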