Michał Karwatowski, M. Wielgosz, M. Pietroń, Mateusz Staruchowicz, K. Wiatr
Comparison of semantic vectors with reduced precision using the cosine similarity measure
2017 Intelligent Systems Conference (IntelliSys), 2017-09-01
DOI: 10.1109/INTELLISYS.2017.8324236
Citation count: 2
Abstract
This paper analyzes the impact of precision reduction on the performance of the cosine similarity measure in a document comparison task. Reducing the precision of semantic vectors allows a substantial improvement in computing performance at the expense of a negligible decline in comparison quality. Taking advantage of the lower number of bits requires a dedicated hardware platform. Consequently, we proposed an FPGA-based hardware solution and examined its performance. To validate the adopted precision reduction method, we also created a quality assessment setup. This allowed us to determine that vector precision can be decreased to 8 bits while still maintaining a 0.99 correlation between the regular and reduced-precision results. This holds for a wide range of data set sizes.
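The kind of quality assessment described in the abstract can be illustrated with a minimal sketch. It is not the authors' setup: the abstract does not specify the quantization scheme, the data, or the correlation measure, so the per-vector linear scaling in `quantize`, the synthetic sparse vectors, and the use of the Pearson coefficient are all assumptions made for illustration. The sketch computes cosine similarity for every document pair at full precision and again on 8-bit integer vectors, then correlates the two sets of results.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def quantize(vec, bits=8):
    """Map non-negative floats to integers in [0, 2**bits - 1].

    Assumed scheme: linear scaling by the vector's own maximum.
    The paper may use a different reduction method.
    """
    levels = (1 << bits) - 1
    peak = max(vec)
    if peak == 0.0:
        return [0] * len(vec)
    return [round(x / peak * levels) for x in vec]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def compare_precisions(num_docs=40, dim=200, bits=8, seed=0):
    """Correlate full-precision and reduced-precision pairwise similarities."""
    rng = random.Random(seed)
    # Synthetic sparse, non-negative vectors standing in for semantic vectors.
    docs = [
        [rng.random() if rng.random() < 0.2 else 0.0 for _ in range(dim)]
        for _ in range(num_docs)
    ]
    reduced_docs = [quantize(d, bits) for d in docs]

    full, reduced = [], []
    for i in range(num_docs):
        for j in range(i + 1, num_docs):
            full.append(cosine(docs[i], docs[j]))
            reduced.append(cosine(reduced_docs[i], reduced_docs[j]))
    return pearson(full, reduced)


if __name__ == "__main__":
    print(f"correlation at 8 bits: {compare_precisions(bits=8):.4f}")
```

Because cosine similarity is invariant to a positive rescaling of either vector, the per-vector scale factor does not need to be stored or undone; only the rounding error affects the result. On the reduced vectors the dot product and norms operate on small integers, which is the kind of arithmetic that dedicated hardware such as an FPGA can implement with narrow datapaths.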