Michail Salampasis, Eleni Kamateri, Vasileios Stamatis, Mihai Lupu, Allan Hanbury, Florina Piroi
World Patent Information, Volume 83, Article 102389. Published 2025-09-29. DOI: 10.1016/j.wpi.2025.102389
Impact factor: 1.9. JCR: Q2 (Information Science & Library Science).
https://www.sciencedirect.com/science/article/pii/S0172219025000560
Towards a new paradigm for patent experimentation: WPI+
We enhance the WPI patent research collection, which is publicly accessible and free of charge, to facilitate more comparable, transparent, and reproducible experiments. We do this through what we call “soft standardization”: advocating the adoption of consistent methods for using the test collection. We offer data statistics, predefined collection subsets, ground-truth data for additional tasks, and open-source tools for using the collection, all in a public GitHub repository. These resources not only relieve researchers of essential collection analysis tasks but also implicitly guide them toward sound methods for conducting experiments with the collection. Our initiative is primarily motivated by the goal of enhancing the comparability and reproducibility of patent research, which we pursue by developing a carefully designed resource that will be continuously expanded and maintained. Our work is also driven by the observation that highly integrated Information Retrieval experiment platforms for large-scale evaluation are not widely adopted by researchers. We provide examples of how the WPI+ resource/collection can be used for research on multiple patent-specific tasks, including prior-art search, patent classification, and summarization. Overall, our work shows that the traditional concept of a test collection (limited to just a corpus, topics, and relevance assessments) can be broadened to support more efficient and reliable scientific experimentation.
About the journal:
The aim of World Patent Information is to provide a worldwide forum for the exchange of information between people working professionally in the field of Industrial Property information and documentation and to promote the widest possible use of the associated literature. Regular features include: papers concerned with all aspects of Industrial Property information and documentation; new regulations pertinent to Industrial Property information and documentation; short reports on relevant meetings and conferences; bibliographies, together with book and literature reviews.