Design of a language-independent parallel string matching unit for NLP

2003 IEEE International Workshop on Computer Architectures for Machine Perception. Pub Date: 2003-05-12. DOI: 10.1109/CAMP.2003.1598159

V. S. Murty, P. C. Reghu Raj, S. Raman


Citations: 2

Abstract



In natural language processing applications, string matching is the main time-consuming operation because the lexicon is large. String matching operations have minimal data dependence and are therefore well suited to parallelization. Dedicated string-matching hardware that uses memory interleaving and parallel processing can relieve the host CPU of this burden, making the system suitable for real-time applications. This paper reports the FPGA design of such a system with m parallel matching units. The time complexity of the proposed algorithm is O(log2 n), where n is the total number of lexical entries; this is achieved by a suitable choice of m. A special memory organization technique, which reduces storage space by nearly 70%, is used to store the lexical entries. The matching and storage techniques make the system language independent.
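As a rough software illustration of the idea in the abstract, the sketch below spreads a sorted lexicon over m interleaved banks and lets m matching units search their banks at the same time. It is not the paper's FPGA design: the bank layout (entry i goes to bank i mod m), the per-bank binary search, and the names `interleave`, `search_bank`, and `parallel_lookup` are all assumptions made for this example. Threads only model the concurrency of the hardware units and give no real speed-up in Python.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor


def interleave(sorted_lexicon, m):
    """Split a sorted lexicon into m banks: entry i goes to bank i % m.

    Each bank stays sorted because it is a strided slice of a sorted list.
    (Assumed layout, chosen for illustration.)
    """
    return [sorted_lexicon[i::m] for i in range(m)]


def search_bank(bank, query):
    """Binary search inside one bank; return the local index or -1."""
    pos = bisect_left(bank, query)
    return pos if pos < len(bank) and bank[pos] == query else -1


def parallel_lookup(banks, query):
    """Search all banks concurrently; return the global lexicon index or -1."""
    m = len(banks)
    # One worker per bank stands in for one hardware matching unit.
    with ThreadPoolExecutor(max_workers=m) as pool:
        hits = list(pool.map(search_bank, banks, [query] * m))
    for bank_id, local in enumerate(hits):
        if local >= 0:
            return local * m + bank_id  # invert the interleaving
    return -1


lexicon = sorted(["cat", "dog", "eel", "fox", "gnu", "hen", "owl", "yak"])
banks = interleave(lexicon, 4)
print(parallel_lookup(banks, "owl"))  # index of "owl" in the sorted lexicon
print(parallel_lookup(banks, "emu"))  # -1: not in the lexicon
```

Under these assumptions each unit performs about log2(n/m) comparisons on its own bank, and because every unit reads a different bank the units never contend for the same memory. How the paper actually chooses m and organizes its memory is not stated in the abstract.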

