Knut Magne Risvik, Trishul M. Chilimbi, Henry Tan, Karthik Kalyanaraman, Chris Anderson
Maguro, a system for indexing and searching over very large text collections
Proceedings of the sixth ACM international conference on Web search and data mining, 2013-02-04
DOI: 10.1145/2433396.2433486 (https://doi.org/10.1145/2433396.2433486)
Citations: 26
Abstract
Maguro is a system for efficiently searching very large collections of text content, of up to 1 trillion documents, at low cost. Search engines span content ranging from the very dynamic and heavily metadata-augmented to the tail content of the web. This long-tail distribution of content calls for different trade-offs in the design space to achieve good efficiency across the entire index range. Maguro is designed for the long tail: content with less dynamics and less metadata, served with very good cost efficiency. Maguro is part of the serving stack in Bing and allows us to scale the index significantly better.