M. Drazic, D. Kukolj, Milana Vitas, M. Pokric, S. Manojlovic, Z. Tekic
Effectiveness of text processing in patent documents visualization
2013 IEEE 11th International Symposium on Intelligent Systems and Informatics (SISY), 2013-11-14
DOI: 10.1109/SISY.2013.6662588
Citations: 5
Abstract
This paper analyzes the effectiveness of a text processing algorithm applied to different document parts in a data set of patent documents. The algorithm is part of a software package used as a business intelligence tool. The tool assembles patent data from publicly available databases, collects and analyzes the patents' bibliographic parameters, and performs text mining. High-dimensional data contained in the patent documents are transformed into a lower-dimensional space (2D or 3D), clustered, and visualized. These features of the software tool made it possible to estimate the effectiveness of the text processing algorithm when run on different parts of a patent, such as the abstract, the claims, the international patent code description, and the detailed patent description.
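The abstract does not say which text representation or projection method the tool uses. As a rough illustration of the general idea only, the sketch below assumes TF-IDF weighting and a two-component PCA computed by power iteration, applied to one chosen part of each patent. The `patents` records, the `project_part` helper, and all other names are hypothetical, the texts are invented toy data, and the clustering and plotting steps are left out.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_matrix(texts):
    """Build a dense TF-IDF matrix: one row per text, one column per term."""
    docs = [tokenize(t) for t in texts]
    vocab = sorted({term for doc in docs for term in doc})
    doc_freq = Counter(term for doc in docs for term in set(doc))
    n = len(docs)
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty texts
        matrix.append([
            (counts[term] / length) * (math.log(n / doc_freq[term]) + 1.0)
            for term in vocab
        ])
    return matrix


def top_component(rows, rng, iterations=200):
    """Estimate the leading principal direction of centered rows by power iteration."""
    dim = len(rows[0])
    v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        # One multiplication by X^T X, done as X^T (X v).
        scores = [sum(x * w for x, w in zip(row, v)) for row in rows]
        v = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(sum(w * w for w in v))
        if norm == 0.0:
            break  # no variance left in the data
        v = [w / norm for w in v]
    return v


def project_2d(matrix, seed=0):
    """Project the rows onto their first two principal components."""
    rng = random.Random(seed)
    dim = len(matrix[0])
    means = [sum(row[j] for row in matrix) / len(matrix) for j in range(dim)]
    residual = [[x - m for x, m in zip(row, means)] for row in matrix]
    coords = [[] for _ in matrix]
    for _ in range(2):
        v = top_component(residual, rng)
        scores = [sum(x * w for x, w in zip(row, v)) for row in residual]
        for point, s in zip(coords, scores):
            point.append(s)
        # Deflate: remove the component just found before looking for the next.
        residual = [[x - s * w for x, w in zip(row, v)]
                    for row, s in zip(residual, scores)]
    return coords


def project_part(patents, part):
    """Run the pipeline on a single document part, e.g. 'abstract' or 'claims'."""
    return project_2d(tfidf_matrix([p[part] for p in patents]))


# Invented toy records: two engine patents and two antenna patents.
patents = [
    {"abstract": "engine fuel injection combustion chamber",
     "claims": "an engine comprising a fuel injection valve"},
    {"abstract": "engine fuel injection combustion piston",
     "claims": "an engine comprising a piston and a fuel pump"},
    {"abstract": "antenna wireless signal receiver circuit",
     "claims": "an antenna comprising a wireless receiver circuit"},
    {"abstract": "antenna wireless signal receiver amplifier",
     "claims": "an antenna comprising a signal amplifier"},
]

for part in ("abstract", "claims"):
    print(part, project_part(patents, part))
```

Calling `project_part` once per document part gives one 2D layout per part, which is the kind of side-by-side comparison the abstract describes. How the authors actually score the effectiveness of each part is not stated here, so the sketch does not attempt it.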