A Multi-Aspect Neural Tensor Factorization Framework for Patent Litigation Prediction
Han Wu; Guanqi Zhu; Qi Liu; Hengshu Zhu; Hao Wang; Hongke Zhao; Chuanren Liu; Enhong Chen; Hui Xiong
IEEE Transactions on Big Data, volume 10 (1), pages 35-54. Published 2023-09-20. Journal Article.
DOI: 10.1109/TBDATA.2023.3313030
URL: https://ieeexplore.ieee.org/document/10257662/
Impact factor: 7.5. JCR: Q1 (Computer Science, Information Systems).
Citations: 0
Abstract
Patent litigation is an expensive and time-consuming legal process. To reduce costs, companies can proactively manage patents using predictive analysis to identify potential plaintiffs, defendants, and patents that may lead to litigation. However, progress in predicting patent litigation has been limited by the scarcity of lawsuits, the complexity of intentions, and the diversity of litigation characteristics. To this end, we group the major causes of patent litigation into multiple aspects: the complex relations among plaintiffs, defendants, and patents, as well as the diverse content information associated with them. Along this line, we propose a Multi-aspect Neural Tensor Factorization (MANTF) framework for patent litigation prediction. First, a Pair-wise Tensor Factorization (PTF) module captures the complex relations among plaintiffs, defendants, and patents inherent in a three-dimensional tensor, producing factorized latent vectors for companies and patents with pair-wise ranking estimators. Then, to better represent the patents and companies as an aid for PTF, we design a Patent Embedding Network (PEN) module and a Mask Company Embedding Network (MCEN) module to generate content-aware embeddings for them: PEN represents patents based on their meta, textual, and graphical features, and MCEN represents companies by integrating their intrinsic features and competitive relations. Next, to integrate the three modules, we place a Gaussian prior on the difference between the factorized representations and the content-aware embeddings, and train MANTF end to end. Finally, the trained model predicts patent litigation, i.e., the potentially litigated plaintiffs, defendants, and patents. We conduct extensive experiments on two real-world datasets, and the results show that MANTF not only helps predict potential patent litigation but also remains robust under various data-sparse settings.
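The abstract does not give MANTF's equations, so the following is only a minimal sketch of the general idea it describes: a pair-wise score over a (plaintiff, defendant, patent) triple, a pair-wise ranking loss, and a Gaussian prior tying a factorized vector to a content-aware embedding. The scoring form (a sum of three inner products), the BPR-style loss, and every function and parameter name here are illustrative assumptions, not the authors' implementation. The one standard fact used is that a zero-mean isotropic Gaussian prior on a difference vector contributes, up to a constant, a squared-distance penalty to the negative log-likelihood.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def pairwise_score(plaintiff, defendant, patent):
    """Assumed scoring form: sum of the three pair-wise inner products."""
    return (dot(plaintiff, defendant)
            + dot(plaintiff, patent)
            + dot(defendant, patent))


def ranking_loss(pos_score, neg_score):
    """BPR-style loss, -log(sigmoid(pos - neg)), computed stably."""
    diff = pos_score - neg_score
    if diff > 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def gaussian_prior_penalty(latent, content, precision=1.0):
    """Negative log of a zero-mean isotropic Gaussian on (latent - content),
    dropping the constant term: 0.5 * precision * squared distance."""
    return 0.5 * precision * sum((l - c) ** 2 for l, c in zip(latent, content))


def total_loss(plaintiff, defendant, pos_patent, neg_patent,
               pos_content, precision=1.0):
    """Ranking loss for one litigated vs. one non-litigated patent, plus the
    prior pulling the litigated patent's latent vector toward its
    content-aware embedding (hypothetical combination)."""
    pos = pairwise_score(plaintiff, defendant, pos_patent)
    neg = pairwise_score(plaintiff, defendant, neg_patent)
    return (ranking_loss(pos, neg)
            + gaussian_prior_penalty(pos_patent, pos_content, precision))
```

In this sketch, `precision` plays the role of the inverse variance of the prior: a larger value keeps the factorized vector closer to the content-aware embedding, which is one plausible way content features could help when observed lawsuits are sparse.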
About the journal:
The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.