arXiv - CS - Information Retrieval: Latest Papers

Refining Wikidata Taxonomy using Large Language Models
arXiv - CS - Information Retrieval Pub Date : 2024-09-06 DOI: arxiv-2409.04056
Yiwen Peng (IP Paris), Thomas Bonald (IP Paris), Mehwish Alam (IP Paris)
Abstract: Due to its collaborative nature, Wikidata is known to have a complex taxonomy, with recurrent issues like the ambiguity between instances and classes, the inaccuracy of some taxonomic paths, the presence of cycles, and the high level of redundancy across classes. Manual efforts to clean up this taxonomy are time-consuming and prone to errors or subjective decisions. We present WiKC, a new version of Wikidata taxonomy cleaned automatically using a combination of Large Language Models (LLMs) and graph mining techniques. Operations on the taxonomy, such as cutting links or merging classes, are performed with the help of zero-shot prompting on an open-source LLM. The quality of the refined taxonomy is evaluated from both intrinsic and extrinsic perspectives, on a task of entity typing for the latter, showing the practical interest of WiKC.
Citations: 0
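One of the issues named in the WiKC abstract, the presence of cycles in the class hierarchy, can be illustrated with a small graph-mining sketch. This is not the WiKC pipeline: it is a plain depth-first search over a hypothetical `subclass_of` mapping (the class names are invented for illustration) that reports one cycle if any exists.

```python
from typing import Dict, List, Optional

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished


def find_cycle(subclass_of: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of classes (first == last), or None."""
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for parent in subclass_of.get(node, []):
            state = color.get(parent, WHITE)
            if state == GRAY:
                # The parent is already on the current path: a cycle is closed.
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(subclass_of):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical toy hierarchy: "organism" wrongly points back to "animal".
toy = {
    "dog": ["mammal"],
    "mammal": ["animal"],
    "animal": ["organism"],
    "organism": ["animal"],
}
print(find_cycle(toy))
```

Deciding which link of a detected cycle to cut is the part the abstract assigns to zero-shot LLM prompting; the sketch only locates the cycle.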
How Fair is Your Diffusion Recommender Model?
arXiv - CS - Information Retrieval Pub Date : 2024-09-06 DOI: arxiv-2409.04339
Daniele Malitesta, Giacomo Medda, Erasmo Purificato, Ludovico Boratto, Fragkiskos D. Malliaros, Mirko Marras, Ernesto William De Luca
Abstract: Diffusion-based recommender systems have recently proven to outperform traditional generative recommendation approaches, such as variational autoencoders and generative adversarial networks. Nevertheless, the machine learning literature has raised several concerns regarding the possibility that diffusion models, while learning the distribution of data samples, may inadvertently carry information bias and lead to unfair outcomes. In light of this aspect, and considering the relevance that fairness has held in recommendations over the last few decades, we conduct one of the first fairness investigations in the literature on DiffRec, a pioneering approach in diffusion-based recommendation. First, we propose an experimental setting involving DiffRec (and its variant L-DiffRec) along with nine state-of-the-art recommendation models, two popular recommendation datasets from the fairness-aware literature, and six metrics accounting for accuracy and consumer/provider fairness. Then, we perform a twofold analysis, one assessing models' performance under accuracy and recommendation fairness separately, and the other identifying if and to what extent such metrics can strike a performance trade-off. Experimental results from both studies confirm the initial unfairness warnings but pave the way for how to address them in future research directions.
Citations: 0
GraphEx: A Graph-based Extraction Method for Advertiser Keyphrase Recommendation
arXiv - CS - Information Retrieval Pub Date : 2024-09-05 DOI: arxiv-2409.03140
Ashirbad Mishra, Soumik Dey, Marshall Wu, Jinyu Zhao, He Yu, Kaichen Ni, Binbin Li, Kamesh Madduri
Abstract: Online sellers and advertisers are recommended keyphrases for their listed products, which they bid on to enhance their sales. One popular paradigm that generates such recommendations is Extreme Multi-Label Classification (XMC), which involves tagging/mapping keyphrases to items. We outline the limitations of using traditional item-query based tagging or mapping techniques for keyphrase recommendations on E-Commerce platforms. We introduce GraphEx, an innovative graph-based approach that recommends keyphrases to sellers using extraction of token permutations from item titles. Additionally, we demonstrate that relying on traditional metrics such as precision/recall can be misleading in practical applications, thereby necessitating a combination of metrics to evaluate performance in real-world scenarios. These metrics are designed to assess the relevance of keyphrases to items and the potential for buyer outreach. GraphEx outperforms production models at eBay, achieving the objectives mentioned above. It supports near real-time inferencing in resource-constrained production environments and scales effectively for billions of items.
Citations: 0
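The phrase "extraction of token permutations from item titles" in the GraphEx abstract can be made concrete with a brute-force sketch. This is not the GraphEx algorithm, whose graph construction the abstract does not describe; the function name, the `known_queries` set, and the `max_len` cap are all assumptions made for illustration.

```python
from itertools import permutations
from typing import List, Set


def candidate_keyphrases(title: str, known_queries: Set[str], max_len: int = 3) -> List[str]:
    """Return permutations of title tokens that match a known buyer query."""
    # De-duplicate tokens while preserving their order in the title.
    tokens = list(dict.fromkeys(title.lower().split()))
    found = []
    for n in range(1, max_len + 1):
        for perm in permutations(tokens, n):
            phrase = " ".join(perm)
            if phrase in known_queries:
                found.append(phrase)
    return found


# Hypothetical item title and query log.
queries = {"iphone case", "case iphone 13", "samsung case"}
print(candidate_keyphrases("Apple iPhone 13 case", queries))
```

Enumerating permutations grows factorially with title length, which is why the sketch caps `max_len`; a production system serving billions of items would need a far more compact structure than this loop.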
HGAMN: Heterogeneous Graph Attention Matching Network for Multilingual POI Retrieval at Baidu Maps
arXiv - CS - Information Retrieval Pub Date : 2024-09-05 DOI: arxiv-2409.03504
Jizhou Huang, Haifeng Wang, Yibo Sun, Miao Fan, Zhengjie Huang, Chunyuan Yuan, Yawen Li
Abstract: The increasing interest in international travel has raised the demand for retrieving points of interest (POIs) in multiple languages. This matters even more when finding local venues such as restaurants and scenic spots in unfamiliar languages while traveling abroad. Multilingual POI retrieval, enabling users to find desired POIs in a demanded language using queries in numerous languages, has become an indispensable feature of today's global map applications such as Baidu Maps. This task is non-trivial because of two key challenges: (1) visiting sparsity and (2) multilingual query-POI matching. To this end, we propose a Heterogeneous Graph Attention Matching Network (HGAMN) to concurrently address both challenges. Specifically, we construct a heterogeneous graph that contains two types of nodes, POI nodes and query nodes, using the search logs of Baidu Maps. To alleviate challenge #1, we construct edges between different POI nodes to link the low-frequency POIs with the high-frequency ones, which enables the transfer of knowledge from the latter to the former. To mitigate challenge #2, we construct edges between POI and query nodes based on the co-occurrences between queries and POIs, where queries in different languages and formulations can be aggregated for individual POIs. Moreover, we develop an attention-based network to jointly learn node representations of the heterogeneous graph and further design a cross-attention module to fuse the representations of both types of nodes for query-POI relevance scoring. Extensive experiments conducted on large-scale real-world datasets from Baidu Maps demonstrate the superiority and effectiveness of HGAMN. In addition, HGAMN has already been deployed in production at Baidu Maps, where it serves hundreds of millions of requests every day.
Citations: 0
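The query-POI edges described in the HGAMN abstract are built from co-occurrences in search logs. A minimal sketch of that one step follows, assuming a log of `(query, poi_id)` pairs; the log format, the `min_count` threshold, and the identifiers are hypothetical, and the attention network itself is out of scope here.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def build_query_poi_edges(
    logs: Iterable[Tuple[str, str]], min_count: int = 1
) -> Dict[str, Dict[str, int]]:
    """Map each POI to the queries that co-occurred with it, with edge weights."""
    counts = Counter(logs)  # (query, poi_id) -> number of co-occurrences
    neighbors: Dict[str, Dict[str, int]] = defaultdict(dict)
    for (query, poi), count in counts.items():
        if count >= min_count:
            neighbors[poi][query] = count
    return dict(neighbors)


# Hypothetical log: queries in three languages all lead to the same POI.
logs = [
    ("eiffel tower", "poi_1"),
    ("tour eiffel", "poi_1"),
    ("埃菲尔铁塔", "poi_1"),
    ("eiffel tower", "poi_1"),
    ("louvre", "poi_2"),
]
print(build_query_poi_edges(logs))
```

Aggregating queries per POI in this way is what lets formulations in different languages end up as neighbors of the same node.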
Federated Prototype-based Contrastive Learning for Privacy-Preserving Cross-domain Recommendation
arXiv - CS - Information Retrieval Pub Date : 2024-09-05 DOI: arxiv-2409.03294
Li Wang, Quangui Zhang, Lei Sang, Qiang Wu, Min Xu
Abstract: Cross-domain recommendation (CDR) aims to improve recommendation accuracy in sparse domains by transferring knowledge from data-rich domains. However, existing CDR methods often assume the availability of user-item interaction data across domains, overlooking user privacy concerns. Furthermore, these methods suffer from performance degradation in scenarios with sparse overlapping users, as they typically depend on a large number of fully shared users for effective knowledge transfer. To address these challenges, we propose a Federated Prototype-based Contrastive Learning (CL) method for Privacy-Preserving CDR, named FedPCL-CDR. This approach utilizes non-overlapping user information and prototypes to improve multi-domain performance while protecting user privacy. FedPCL-CDR comprises two modules: local domain (client) learning and global server aggregation. In the local domain, FedPCL-CDR clusters all user data to learn representative prototypes, effectively utilizing non-overlapping user information and addressing the sparse overlapping user issue. It then facilitates knowledge transfer by employing both local and global prototypes returned from the server in a CL manner. Simultaneously, the global server aggregates representative prototypes from local domains to learn both local and global prototypes. The combination of prototypes and federated learning (FL) ensures that sensitive user data remains decentralized, with only prototypes being shared across domains, thereby protecting user privacy. Extensive experiments on four CDR tasks using two real-world datasets demonstrate that FedPCL-CDR outperforms the state-of-the-art baselines.
Citations: 0
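The client/server split in the FedPCL-CDR abstract can be sketched as follows: each domain reduces its user embeddings to cluster centroids (prototypes), and only those prototypes leave the client. The abstract does not state the aggregation rule, so the size-weighted average below is an assumption, as are the function names and the pre-computed cluster labels.

```python
from typing import Dict, List, Sequence, Tuple

Prototype = Tuple[List[float], int]  # (centroid, number of users in the cluster)


def local_prototypes(
    embeddings: Sequence[Sequence[float]], labels: Sequence[int]
) -> Dict[int, Prototype]:
    """Client side: average the embeddings of each cluster into a prototype."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {lab: ([s / counts[lab] for s in acc], counts[lab]) for lab, acc in sums.items()}


def global_prototype(client_protos: Sequence[Dict[int, Prototype]]) -> List[float]:
    """Server side (assumed rule): size-weighted mean of all client prototypes."""
    acc: List[float] = []
    total = 0
    for protos in client_protos:
        for centroid, size in protos.values():
            if not acc:
                acc = [0.0] * len(centroid)
            for i, x in enumerate(centroid):
                acc[i] += size * x
            total += size
    if total == 0:
        raise ValueError("no prototypes received")
    return [a / total for a in acc]


# Hypothetical two-domain example; raw embeddings never reach the server.
domain_a = local_prototypes([[0.0, 0.0], [2.0, 2.0], [10.0, 10.0]], [0, 0, 1])
domain_b = local_prototypes([[4.0, 4.0]], [0])
print(global_prototype([domain_a, domain_b]))
```

The privacy argument in the abstract rests on this separation: the server sees centroids and counts, not individual user interactions.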
iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models
arXiv - CS - Information Retrieval Pub Date : 2024-09-05 DOI: arxiv-2409.03284
Yassir Lairgi, Ludovic Moncla, Rémy Cazabet, Khalid Benabdeslem, Pierre Cléau
Abstract: Most available data is unstructured, making it challenging to access valuable information. Automatically building Knowledge Graphs (KGs) is crucial for structuring data and making it accessible, allowing users to search for information effectively. KGs also facilitate insights, inference, and reasoning. Traditional NLP methods, such as named entity recognition and relation extraction, are key in information retrieval but face limitations, including the use of predefined entity types and the need for supervised learning. Current research leverages large language models' capabilities, such as zero- or few-shot learning. However, unresolved and semantically duplicated entities and relations still pose challenges, leading to inconsistent graphs and requiring extensive post-processing. Additionally, most approaches are topic-dependent. In this paper, we propose iText2KG, a method for incremental, topic-independent KG construction without post-processing. This plug-and-play, zero-shot method is applicable across a wide range of KG construction scenarios and comprises four modules: Document Distiller, Incremental Entity Extractor, Incremental Relation Extractor, and Graph Integrator and Visualization. Our method demonstrates superior performance compared to baseline methods across three scenarios: converting scientific papers to graphs, websites to graphs, and CVs to graphs.
Citations: 0
RAG based Question-Answering for Contextual Response Prediction System
arXiv - CS - Information Retrieval Pub Date : 2024-09-05 DOI: arxiv-2409.03708
Sriram Veturi, Saurabh Vaichal, Nafis Irtiza Tripto, Reshma Lal Jagadheesh, Nian Yan
Abstract: Large Language Models (LLMs) have shown versatility in various Natural Language Processing (NLP) tasks, including their potential as effective question-answering systems. However, to provide precise and relevant information in response to specific customer queries in industry settings, LLMs require access to a comprehensive knowledge base to avoid hallucinations. Retrieval Augmented Generation (RAG) emerges as a promising technique to address this challenge. Yet, developing an accurate question-answering framework for real-world applications using RAG entails several challenges: 1) data availability issues, 2) evaluating the quality of generated content, and 3) the costly nature of human evaluation. In this paper, we introduce an end-to-end framework that employs LLMs with RAG capabilities for industry use cases. Given a customer query, the proposed system retrieves relevant knowledge documents and leverages them, along with previous chat history, to generate response suggestions for customer service agents in the contact centers of a major retail company. Through comprehensive automated and human evaluations, we show that this solution outperforms the current BERT-based algorithms in accuracy and relevance. Our findings suggest that RAG-based LLMs can be an excellent support to human customer service representatives by lightening their workload.
Citations: 0
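The flow described in the RAG abstract (retrieve knowledge documents, then combine them with the chat history to prompt for a suggested response) can be sketched without any model call. The token-overlap retriever below is a stand-in, since the abstract does not name the retriever, and the prompt layout, function names, and sample documents are hypothetical.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercased word tokens with punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Rank documents by token overlap with the query and keep the top k matches."""
    q = tokens(query)
    scored = [(len(q & tokens(doc)), i) for i, doc in enumerate(documents)]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [documents[i] for score, i in scored[:k] if score > 0]


def build_prompt(
    query: str, history: List[Tuple[str, str]], documents: List[str], k: int = 2
) -> str:
    """Assemble retrieved documents, prior turns, and the new query into one prompt."""
    parts = ["Knowledge documents:"]
    parts += [f"- {doc}" for doc in retrieve(query, documents, k)]
    parts.append("Chat history:")
    parts += [f"{speaker}: {text}" for speaker, text in history]
    parts.append(f"Customer: {query}")
    parts.append("Suggested agent response:")
    return "\n".join(parts)


docs = [
    "Returns are accepted within 30 days with a receipt.",
    "Store hours are 9am to 9pm.",
    "Gift cards cannot be returned.",
]
history = [("Customer", "Hi"), ("Agent", "Hello, how can I help?")]
print(build_prompt("Can I return an item without a receipt?", history, docs))
```

The resulting string would be passed to whichever LLM the deployment uses; grounding the prompt in retrieved documents is the mechanism the abstract credits with reducing hallucinations.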
WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild
arXiv - CS - Information Retrieval Pub Date : 2024-09-05 DOI: arxiv-2409.03753
Yuntian Deng, Wenting Zhao, Jack Hessel, Xiang Ren, Claire Cardie, Yejin Choi
Abstract: The increasing availability of real-world conversation data offers exciting opportunities for researchers to study user-chatbot interactions. However, the sheer volume of this data makes manually examining individual conversations impractical. To overcome this challenge, we introduce WildVis, an interactive tool that enables fast, versatile, and large-scale conversation analysis. WildVis provides search and visualization capabilities in the text and embedding spaces based on a list of criteria. To manage million-scale datasets, we implemented optimizations including search index construction, embedding precomputation and compression, and caching to ensure responsive user interactions within seconds. We demonstrate WildVis's utility through three case studies: facilitating chatbot misuse research, visualizing and comparing topic distributions across datasets, and characterizing user-specific conversation patterns. WildVis is open-source and designed to be extendable, supporting additional datasets and customized search and visualization functionalities.
Citations: 0
RETAIN: Interactive Tool for Regression Testing Guided LLM Migration
arXiv - CS - Information Retrieval Pub Date : 2024-09-05 DOI: arxiv-2409.03928
Tanay Dixit, Daniel Lee, Sally Fang, Sai Sree Harsha, Anirudh Sureshan, Akash Maharaj, Yunyao Li
Abstract: Large Language Models (LLMs) are increasingly integrated into diverse applications. The rapid evolution of LLMs presents opportunities for developers to enhance applications continuously. However, this constant adaptation can also lead to performance regressions during model migrations. While several interactive tools have been proposed to streamline the complexity of prompt engineering, few address the specific requirements of regression testing for LLM Migrations. To bridge this gap, we introduce RETAIN (REgression Testing guided LLM migrAtIoN), a tool designed explicitly for regression testing in LLM Migrations. RETAIN comprises two key components: an interactive interface tailored to regression testing needs during LLM migrations, and an error discovery module that facilitates understanding of differences in model behaviors. The error discovery module generates textual descriptions of various errors or differences between model outputs, providing actionable insights for prompt refinement. Our automatic evaluation and empirical user studies demonstrate that RETAIN, when compared to manual evaluation, enabled participants to identify twice as many errors, facilitated experimentation with 75% more prompts, and achieved 12% higher metric scores in a given time frame.
Citations: 0
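Regression testing during a model migration, the problem RETAIN targets, reduces at its simplest to scoring the old and new models on the same cases and flagging the ones that got worse. The sketch below shows only that core comparison; it is not RETAIN's interface or error discovery module, and the case format, the `exact_match` metric, and the `tolerance` parameter are assumptions.

```python
from typing import Callable, Dict, List, Tuple


def exact_match(output: str, expected: str) -> float:
    """Toy metric: 1.0 when the output equals the expected answer, else 0.0."""
    return float(output.strip() == expected.strip())


def find_regressions(
    cases: List[Dict[str, str]],
    old_outputs: List[str],
    new_outputs: List[str],
    score: Callable[[str, str], float] = exact_match,
    tolerance: float = 0.0,
) -> List[Tuple[str, float, float]]:
    """Return (case id, old score, new score) for cases where the new model is worse."""
    regressions = []
    for case, old, new in zip(cases, old_outputs, new_outputs):
        old_score = score(old, case["expected"])
        new_score = score(new, case["expected"])
        if new_score < old_score - tolerance:
            regressions.append((case["id"], old_score, new_score))
    return regressions


# Hypothetical test cases and model outputs.
cases = [
    {"id": "t1", "expected": "4"},
    {"id": "t2", "expected": "Paris"},
    {"id": "t3", "expected": "blue"},
]
old = ["4", "Paris", "red"]
new = ["4", "Lyon", "blue"]
print(find_regressions(cases, old, new))
```

Case `t3` improves and is not reported, while `t2` is flagged; a tool like the one described would go further and explain in text how the flagged outputs differ.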
MOBIUS: Towards the Next Generation of Query-Ad Matching in Baidu's Sponsored Search
arXiv - CS - Information Retrieval Pub Date : 2024-09-05 DOI: arxiv-2409.03449
Miao Fan, Jiacheng Guo, Shuai Zhu, Shuo Miao, Mingming Sun, Ping Li
Abstract: Baidu runs the largest commercial web search engine in China, serving hundreds of millions of online users every day in response to a great variety of queries. In order to build a high-efficiency sponsored search engine, we used to adopt a three-layer funnel-shaped structure to screen and sort hundreds of ads from billions of ad candidates subject to the requirement of low response latency and the restraints of computing resources. Given a user query, the top matching layer is responsible for providing semantically relevant ad candidates to the next layer, while the ranking layer at the bottom is more concerned with business indicators (e.g., CPM, ROI, etc.) of those ads. The clear separation between the matching and ranking objectives results in a lower commercial return. The Mobius project has been established to address this serious issue. It is our first attempt to train the matching layer to consider CPM as an additional optimization objective besides the query-ad relevance, via directly predicting CTR (click-through rate) from billions of query-ad pairs. Specifically, this paper will elaborate on how we adopt active learning to overcome the insufficiency of click history at the matching layer when training our neural click networks offline, and how we use the SOTA ANN search technique for retrieving ads more efficiently (here "ANN" stands for approximate nearest neighbor search). We contribute the solutions to Mobius-V1 as the first version of our next generation query-ad matching system.
Citations: 0
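The idea in the MOBIUS abstract of letting the matching layer weigh a business objective alongside relevance can be illustrated with a toy matcher. This is a sketch under stated assumptions, not Baidu's system: it keeps ads whose cosine similarity to the query clears a hypothetical `min_relevance` threshold, then orders them by predicted CTR times bid as a proxy for CPM. It also scans every ad exhaustively, whereas the abstract relies on approximate nearest neighbor search for efficiency.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def match_ads(
    query_vec: Sequence[float],
    ads: List[Dict],
    min_relevance: float = 0.5,
    top_k: int = 2,
) -> List[str]:
    """Filter by relevance, then rank by expected value (predicted CTR x bid)."""
    kept = []
    for ad in ads:
        if cosine(query_vec, ad["vec"]) >= min_relevance:
            kept.append((ad["ctr"] * ad["bid"], ad["id"]))
    kept.sort(key=lambda t: (-t[0], t[1]))
    return [ad_id for _, ad_id in kept[:top_k]]


# Hypothetical ads: "c" has the highest expected value but is irrelevant.
ads = [
    {"id": "a", "vec": [1.0, 0.0], "ctr": 0.02, "bid": 1.0},
    {"id": "b", "vec": [0.9, 0.1], "ctr": 0.05, "bid": 1.0},
    {"id": "c", "vec": [0.0, 1.0], "ctr": 0.50, "bid": 5.0},
]
print(match_ads([1.0, 0.0], ads))
```

The relevance threshold is what keeps a high-value but off-topic ad such as `c` out of the results, which is the tension between matching and ranking objectives that the abstract describes.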