{"title":"构造偏向查询的摘要:人工和系统生成的片段的比较","authors":"L. L. Bando, Falk Scholer, A. Turpin","doi":"10.1145/1840784.1840813","DOIUrl":null,"url":null,"abstract":"Modern search engines display a summary for each ranked document that is returned in response to a query. These summaries typically include a snippet -- a collection of text fragments from the underlying document -- that has some relation to the query that is being answered.\n In this study we investigate how 10 humans construct snippets: participants first generate their own natural language snippet, and then separately extract a snippet by choosing text fragments, for four queries related to two documents. By mapping their generated snippets back to text fragments in the source document using eye tracking data, we observe that participants extract these same pieces of text around 73% of the time when creating their extractive snippets.\n In comparison, we notice that automated approaches for extracting snippets only use these same fragments 10% of the time. However, when the automated methods are evaluated using a position-independent bag-of-words approach, as typically used in the research literature for evaluating snippets, they are scored much more highly, seemingly extracting the \"correct\" text 24% of the time.\n In addition to demonstrating this large scope for improvement in snippet generation algorithms with our novel methodology, we also offer a series of observations on the behaviour of participants as they constructed their snippets.","PeriodicalId":413481,"journal":{"name":"International Conference on Information Interaction in Context","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":"{\"title\":\"Constructing query-biased summaries: a comparison of human and system generated snippets\",\"authors\":\"L. L. Bando, Falk Scholer, A. 
Turpin\",\"doi\":\"10.1145/1840784.1840813\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Modern search engines display a summary for each ranked document that is returned in response to a query. These summaries typically include a snippet -- a collection of text fragments from the underlying document -- that has some relation to the query that is being answered.\\n In this study we investigate how 10 humans construct snippets: participants first generate their own natural language snippet, and then separately extract a snippet by choosing text fragments, for four queries related to two documents. By mapping their generated snippets back to text fragments in the source document using eye tracking data, we observe that participants extract these same pieces of text around 73% of the time when creating their extractive snippets.\\n In comparison, we notice that automated approaches for extracting snippets only use these same fragments 10% of the time. However, when the automated methods are evaluated using a position-independent bag-of-words approach, as typically used in the research literature for evaluating snippets, they are scored much more highly, seemingly extracting the \\\"correct\\\" text 24% of the time.\\n In addition to demonstrating this large scope for improvement in snippet generation algorithms with our novel methodology, we also offer a series of observations on the behaviour of participants as they constructed their snippets.\",\"PeriodicalId\":413481,\"journal\":{\"name\":\"International Conference on Information Interaction in Context\",\"volume\":\"23 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-08-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"24\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Information Interaction in 
Context\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1840784.1840813\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Information Interaction in Context","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1840784.1840813","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Constructing query-biased summaries: a comparison of human and system generated snippets
Modern search engines display a summary for each ranked document that is returned in response to a query. These summaries typically include a snippet -- a collection of text fragments from the underlying document -- that has some relation to the query that is being answered.
In this study we investigate how 10 humans construct snippets: participants first generate their own natural language snippet, and then separately extract a snippet by choosing text fragments, for four queries related to two documents. By mapping their generated snippets back to text fragments in the source document using eye tracking data, we observe that participants extract these same pieces of text around 73% of the time when creating their extractive snippets.
In comparison, we notice that automated approaches for extracting snippets use these same fragments only 10% of the time. However, when the automated methods are evaluated using a position-independent bag-of-words approach, as typically used in the research literature for evaluating snippets, they are scored much more highly, seemingly extracting the "correct" text 24% of the time.
In addition to demonstrating this large scope for improvement in snippet generation algorithms with our novel methodology, we also offer a series of observations on the behaviour of participants as they constructed their snippets.
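The gap between the 10% and 24% figures above comes down to what the evaluation counts as a match: exact agreement on extracted fragments versus position-independent word overlap. The following is a minimal illustrative sketch of that distinction (the function names and the toy tokenization are our own, not the paper's actual metric):

```python
def fragment_match(system_fragments, human_fragments):
    """Fraction of system-chosen fragments that exactly match
    a fragment chosen by the human annotator."""
    if not system_fragments:
        return 0.0
    human_set = set(human_fragments)
    hits = sum(1 for f in system_fragments if f in human_set)
    return hits / len(system_fragments)

def bag_of_words_overlap(system_snippet, human_snippet):
    """Position-independent overlap: fraction of system tokens
    that also appear anywhere in the human snippet."""
    sys_tokens = system_snippet.lower().split()
    ref_tokens = set(human_snippet.lower().split())
    if not sys_tokens:
        return 0.0
    return sum(1 for t in sys_tokens if t in ref_tokens) / len(sys_tokens)

# Toy example: the system picks a different sentence that happens
# to share vocabulary with the human's choice.
human = ["the cat sat on the mat"]
system = ["a mat is where the cat sat"]
print(fragment_match(system, human))              # 0.0 -- no exact fragment match
print(bag_of_words_overlap(system[0], human[0]))  # ~0.57 -- high word overlap
```

A bag-of-words score rewards the second case despite the fragments being different, which is why it can make a snippet extractor look substantially better than a fragment-level comparison does.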