Improving Watchlist Screening By Combining Evidence From Multiple Search Algorithms

2008 IEEE Conference on Technologies for Homeland Security. Pub Date: 2008-05-12. DOI: 10.1109/THS.2008.4534432

Keith J. Miller


Citations: 2

Abstract

In this paper, we describe a metasearch tool resulting from experiments in aggregating the results of different name matching algorithms on a knowledge-intensive multicultural name matching task. Three retrieval engines that match Romanized names were tested on a noisy and predominantly Arabic dataset. One is based on a generic string matching algorithm; another is designed specifically for Arabic names; and the third makes use of culturally-specific matching strategies for multiple cultures. We show that even a relatively naive method for aggregating results significantly increased effectiveness over each of the individual algorithms, nearly tripling the F-score of the worst-performing algorithm included in the aggregate and yielding a 6-point improvement in F-score over the single best-performing algorithm included.
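The abstract does not say which aggregation method was used, so the following is only an illustrative sketch of one naive way to combine the result lists of several name matching engines and to score the outcome with a balanced F-score. The voting rule, the function names `aggregate_by_vote` and `f_score`, the `min_votes` parameter, and all of the names in the toy data are assumptions made for this example; none of them come from the paper, and the numbers it produces say nothing about the paper's results.

```python
from collections import defaultdict


def aggregate_by_vote(result_sets, min_votes=1):
    """Return candidates reported by at least `min_votes` engines.

    Assumption: simple voting stands in for the paper's unspecified
    "relatively naive" aggregation method.
    """
    votes = defaultdict(int)
    for results in result_sets:
        for name in set(results):  # count each engine at most once per name
            votes[name] += 1
    return {name for name, count in votes.items() if count >= min_votes}


def f_score(retrieved, relevant):
    """Balanced F-score: harmonic mean of precision and recall."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(retrieved)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: outputs of three engines for one query name.
generic_hits = ["Mohammed Ali", "Mohamad Aly", "Mona Ali"]
arabic_hits = ["Mohammed Ali", "Muhammad Ali"]
multicultural_hits = ["Muhammad Ali", "Mohamad Aly", "Michael Alley"]
relevant = {"Mohammed Ali", "Muhammad Ali", "Mohamad Aly"}

# Keep only names that at least two of the three engines agree on.
combined = aggregate_by_vote(
    [generic_hits, arabic_hits, multicultural_hits], min_votes=2
)

for label, hits in [
    ("generic", generic_hits),
    ("arabic", arabic_hits),
    ("multicultural", multicultural_hits),
    ("combined", combined),
]:
    print(f"{label}: F = {f_score(hits, relevant):.2f}")
```

In this toy setup each engine misses one relevant variant or returns one false match, while the two-vote rule keeps exactly the names that engines agree on. Other simple schemes, such as taking the union of all results or summing normalized match scores, would fit the same interface.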

