The Gauss-Tree: Efficient Object Identification in Databases of Probabilistic Feature Vectors

22nd International Conference on Data Engineering (ICDE'06). Pub Date: 2006-04-03. DOI: 10.1109/ICDE.2006.159

C. Böhm, A. Pryakhin, Matthias Schubert


Citations: 124


The Gauss-Tree: Efficient Object Identification in Databases of Probabilistic Feature Vectors

In biometric database applications, the typical task is to identify individuals from features that are not exactly known. This inexactness stems from varying measuring techniques or environmental circumstances. Since these circumstances are not necessarily the same when the features of different individuals are determined, the exactness may vary strongly between individuals as well as between features. Similarity search on feature vectors can be used to identify individuals, but even adaptable distance measures cannot handle objects that each have an individual level of exactness. Therefore, we develop a comprehensive probabilistic theory in which uncertain observations are modeled by probabilistic feature vectors (pfv), i.e. feature vectors in which the conventional feature values are replaced by Gaussian probability distribution functions. Each feature value of each object is complemented by a variance value indicating its uncertainty. We define two types of identification queries: k-most-likely identification and threshold identification. For efficient query processing, we propose a novel index structure, the Gauss-tree. Our experimental evaluation demonstrates that pfv stored in a Gauss-tree significantly improve result quality compared to traditional feature vectors. Additionally, we show that the Gauss-tree significantly reduces query times compared to competing methods.
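
To make the two query types concrete, the following is a minimal Python sketch that evaluates them by a plain linear scan. It is not the Gauss-tree index and not the authors' code. The abstract does not give the formulas, so the sketch rests on labeled assumptions: features are independent, the density that a query and a stored object describe the same individual is a Gaussian whose variance is the sum of both variances, and all stored objects are equally likely a priori. The names `PFV`, `log_density`, `posteriors`, `k_most_likely` and `threshold_identification`, as well as the sample data, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class PFV:
    """Probabilistic feature vector: one mean and one variance per feature."""
    name: str
    means: Sequence[float]
    variances: Sequence[float]


def log_density(query: PFV, obj: PFV) -> float:
    """Log density that `query` and `obj` observe the same individual.

    Assumption (not from the abstract): features are independent and the
    difference of two Gaussian observations is Gaussian with summed variance.
    """
    total = 0.0
    for mq, vq, mo, vo in zip(query.means, query.variances,
                              obj.means, obj.variances):
        var = vq + vo
        total += -0.5 * math.log(2.0 * math.pi * var)
        total -= (mq - mo) ** 2 / (2.0 * var)
    return total


def posteriors(query: PFV, database: List[PFV]) -> List[Tuple[str, float]]:
    """Normalize densities over the database (uniform prior assumed)."""
    logs = [log_density(query, obj) for obj in database]
    peak = max(logs)  # subtract the maximum to avoid underflow in exp
    weights = [math.exp(value - peak) for value in logs]
    norm = sum(weights)
    return [(obj.name, w / norm) for obj, w in zip(database, weights)]


def k_most_likely(query: PFV, database: List[PFV], k: int) -> List[Tuple[str, float]]:
    """Return the k objects with the highest identification probability."""
    ranked = sorted(posteriors(query, database), key=lambda p: p[1], reverse=True)
    return ranked[:k]


def threshold_identification(query: PFV, database: List[PFV],
                             threshold: float) -> List[Tuple[str, float]]:
    """Return every object whose identification probability reaches the threshold."""
    return [p for p in posteriors(query, database) if p[1] >= threshold]


# Hypothetical sample data: "bob" was measured far less precisely than the others.
db = [
    PFV("alice", [1.0, 2.0], [0.01, 0.01]),
    PFV("bob", [1.2, 2.1], [1.0, 1.0]),
    PFV("carol", [5.0, 5.0], [0.01, 0.01]),
]
query = PFV("q", [1.3, 2.2], [0.01, 0.01])

if __name__ == "__main__":
    print(k_most_likely(query, db, 2))
    print(threshold_identification(query, db, 0.5))
```

The linear scan touches every stored object for every query, which is the cost an index structure such as the Gauss-tree is meant to avoid. In this toy data the ranking also depends on the per-object variances, not only on the distance between the means, which is the behavior the abstract says adaptable distance measures cannot provide.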
