Improving document retrieval using special characteristics of lecture recording documents

Christoph Hermann
Proceedings of the 3rd Symposium on Information and Communication Technology, 2012-08-23. DOI: 10.1145/2350716.2350754 (https://doi.org/10.1145/2350716.2350754)

Abstract

With the increasing use of lecture recordings, content providers face the challenge of making electronic lecture materials both easily accessible and searchable. Powerful search engines are therefore needed that let users easily retrieve documents that fulfill their information needs. While text search has been researched extensively, the special characteristics of lecture recording documents have so far received little attention. Lecture recording documents differ from text documents such as papers, scripts, or web pages because they usually contain listings and enumerations rather than running text. Additionally, they contain time-based data such as an audio or video stream of the lecturer as well as handwritten annotations. Analyzing these additional data streams improves the search process. Our novel approach to analyzing the annotations of lecture recording documents improves document relevance estimation when searching lecture materials. Hence, technology has been developed to make the contents and special properties of lecture recordings accessible and searchable. We describe key issues encountered during these developments and present experimental results for our search engine, which takes the special characteristics of lecture recording documents into account during the indexing process. Searching our archive of over 15,000 files takes only a few milliseconds, which enables us to offer a search-as-you-type user interface, query auto-completion, and visual browsing.
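The abstract does not say how annotations enter the relevance estimate, so the following is only an illustrative sketch under stated assumptions, not the author's method. It assumes that terms a lecturer annotated by hand (for example, underlined) are weighted more heavily at indexing time, and that search-as-you-type is served by prefix matching against the index. The names `ANNOTATION_BOOST`, `build_index`, and `search_prefix`, as well as the slide data, are hypothetical.

```python
from collections import defaultdict

# Hypothetical weight: how much more an annotated term counts than a plain one.
ANNOTATION_BOOST = 2.0


def build_index(slides):
    """Map each term to {slide_id: weight}; annotated terms get extra weight."""
    index = defaultdict(dict)
    for slide_id, slide in slides.items():
        annotated = {t.lower() for t in slide.get("annotated", [])}
        for term in slide["text"].lower().split():
            weight = ANNOTATION_BOOST if term in annotated else 1.0
            index[term][slide_id] = index[term].get(slide_id, 0.0) + weight
    return index


def search_prefix(index, prefix):
    """Rank slides by the summed weight of all terms starting with `prefix`."""
    prefix = prefix.lower()
    scores = defaultdict(float)
    for term, postings in index.items():
        if term.startswith(prefix):
            for slide_id, weight in postings.items():
                scores[slide_id] += weight
    # Highest score first; ties broken by slide id for a stable order.
    return sorted(scores, key=lambda s: (-scores[s], s))


# Made-up example data: two slides, one with a handwritten annotation.
slides = {
    "s1": {"text": "binary search trees", "annotated": ["trees"]},
    "s2": {"text": "search trees and heaps", "annotated": []},
}
index = build_index(slides)
ranking = search_prefix(index, "tre")
```

In this toy data both slides contain "trees", but the slide where the term was annotated ranks first for the partial query "tre". A linear scan over all terms is used for brevity; a sorted term list or a trie would be the usual choice for prefix lookups at scale.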