SSCS '10: Latest Publications

The ACLD: speech-based just-in-time retrieval of meeting transcripts, documents and websites
Andrei Popescu-Belis, J. Kilgour, Alexandre Nanchen, P. Poller
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878111
Abstract: The Automatic Content Linking Device (ACLD) is a just-in-time retrieval system that monitors an ongoing conversation or monologue and enriches it with potentially related documents, including transcripts of past meetings, from local repositories or from the Internet. The linked content is displayed in real time to the participants in the conversation, or to users watching a recorded conversation or talk. The system can be demonstrated in both settings, using real-time automatic speech recognition (ASR) or replaying offline ASR, via a flexible user interface that displays results and provides access to the content of past meetings and documents.
Citations: 2
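The just-in-time retrieval loop the ACLD abstract describes (monitor the speech stream, derive a query, fetch related documents) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation; the `JustInTimeLinker` class, window size, and stopword list are invented for the example:

```python
from collections import Counter, deque

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "we", "is", "in", "that", "it"}

class JustInTimeLinker:
    """Sketch of a just-in-time retrieval front end: keep a sliding window
    of recently recognized words and periodically turn the most salient
    content words into a query for a document repository."""

    def __init__(self, window=30, query_len=3):
        self.window = deque(maxlen=window)  # most recent ASR words
        self.query_len = query_len

    def feed(self, words):
        """Append newly recognized words (e.g. one ASR hypothesis)."""
        self.window.extend(w.lower() for w in words)

    def query(self):
        """Most frequent non-stopwords in the window, as query terms."""
        counts = Counter(w for w in self.window if w not in STOPWORDS)
        return [w for w, _ in counts.most_common(self.query_len)]

linker = JustInTimeLinker()
linker.feed("we need to check the budget and the budget report".split())
terms = linker.query()  # "budget" dominates the recent window
```

The resulting terms would then be sent to a search back end (local repository or web) and the hits displayed alongside the conversation.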
Automatic indexing of speech segments with spontaneity levels on large audio database
Richard Dufour, Y. Estève, P. Deléglise
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878110
Abstract: Spontaneous speech detection in a large audio database can be useful for several applications. For example, processing spontaneous speech is one of the many challenges that automatic speech recognition (ASR) systems have to deal with, and spontaneous speech detection can also serve as an informative descriptor for information retrieval.
The main evidence characterizing spontaneous speech is disfluencies (filled pauses, repetitions, repairs and false starts), and many studies have focused on detecting and correcting them. In this study we define spontaneous speech as unprepared speech, in opposition to prepared speech, where utterances contain well-formed sentences close to those found in written documents. Disfluencies are of course very good indicators of unprepared speech, but they are not the only ones: ungrammaticality and language register are also important, as are prosodic patterns.
This paper proposes a set of acoustic and linguistic features that can be used for characterizing and detecting spontaneous speech segments in large audio databases, and proposes a method to extract and exploit these features in order to index audio documents with three speech spontaneity levels.
Citations: 2
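The disfluency cues the abstract lists (filled pauses, immediate repetitions) lend themselves to simple rate features. The sketch below, with an invented filled-pause lexicon and invented thresholds (not taken from the paper), shows how such features could map a transcript segment to one of three spontaneity levels:

```python
FILLED_PAUSES = {"uh", "um", "er", "hmm"}  # toy lexicon, illustrative only

def disfluency_features(tokens):
    """Rates of two disfluency cues over a tokenized segment:
    filled pauses and immediate word repetitions."""
    n = max(len(tokens), 1)
    filled = sum(1 for t in tokens if t in FILLED_PAUSES)
    repeats = sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)
    return {"filled_pause_rate": filled / n, "repetition_rate": repeats / n}

def spontaneity_level(feats, low=0.02, high=0.10):
    """Map a combined disfluency score to three levels; the thresholds
    are purely illustrative."""
    score = feats["filled_pause_rate"] + feats["repetition_rate"]
    if score < low:
        return "prepared"
    if score < high:
        return "semi-spontaneous"
    return "spontaneous"

casual = disfluency_features("so um I I think we uh we should start".split())
formal = disfluency_features("the committee approved the annual budget today".split())
```

A real indexer would add the acoustic and prosodic features the paper mentions and learn the decision boundaries rather than hand-setting them.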
Towards methods for efficient access to spoken content in the ami corpus
G. Jones, Maria Eskevich, Ágnes Gyarmati
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878108
Abstract: Increasing amounts of informal spoken content are being collected. This material does not have clearly defined document forms, either in structure or in topical content, e.g. recordings of meetings, lectures and personal data sources. Automated search of this content poses challenges beyond the retrieval of defined documents, including the definition of search items and the location of relevant content within them. While most existing work on speech search has focused on clearly defined document units, in this paper we describe our initial investigation into search of meeting content using the AMI meeting collection. Manual and automatic transcripts of meetings are first automatically segmented into topical units. A known-item search task is then performed using presentation slides from the meetings as search queries to locate relevant sections of the meetings. Query slides were selected corresponding to well-recognised and poorly-recognised spoken content, along with randomly selected slides.
Experimental results show that relevant items can be located with reasonable accuracy using a standard information retrieval approach, and that there is a clear relationship between automatic transcription accuracy and retrieval effectiveness.
Citations: 1
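A "standard information retrieval approach" over topic segments, as in the abstract, can be illustrated with a toy TF-IDF scorer. The segments and query below are invented (loosely evoking the AMI remote-control design meetings), and the function names are this sketch's, not the authors':

```python
import math
from collections import Counter

def build_idf(segments):
    """Inverse document frequency over tokenized topic segments."""
    n = len(segments)
    df = Counter()
    for seg in segments:
        df.update(set(seg))  # document frequency: one count per segment
    return {t: math.log(n / df[t]) for t in df}

def tfidf_score(query, segment, idf):
    """Sum of TF-IDF weights of query terms occurring in the segment."""
    tf = Counter(segment)
    return sum(tf[t] * idf.get(t, 0.0) for t in set(query))

segments = [
    "budget slide discussion cost estimate".split(),
    "remote control design prototype button layout".split(),
    "usability evaluation user test feedback".split(),
]
idf = build_idf(segments)
query = "design prototype of the remote".split()  # stands in for slide text
best = max(range(len(segments)),
           key=lambda i: tfidf_score(query, segments[i], idf))
```

In the paper's known-item setup, `best` pointing at the segment the slide was actually presented in counts as a successful search.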
World wide telecom web search: invited talk abstract
Nitendra Rajput
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878103
Abstract: Searching spoken content has been of interest to the speech community for varied reasons in the past. Call centers are interested in searching for information in calls where an agent may not have provided an appropriate answer to the customer. Video search mostly depends on analysis of the accompanying audio. In such scenarios, the query interface is still a visual interaction, and multiple results can be presented to the user with effective browsing and presentation controls. We challenge ourselves to enable searching of user-generated audio content through an audio-only query-result interface. This talk presents the motivation for such a search and the challenges it poses in terms of data, interfaces and users. The hope is that the audience will be able to identify sub-problems in this large space of World Wide Telecom Web (WWTW) search.
Citations: 0
Large multimedia archive for world languages
P. Wittenburg, Paul Trilsbeek, Przemek Lenkiewicz
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878113
Abstract: In this paper, we describe the core pillars of a large archive of language material recorded worldwide, partly concerning languages that are highly endangered. The basis for the documentation of these languages is audio/video recordings, which are then annotated at several linguistic layers. The digital age has completely changed the requirements of long-term preservation, and we discuss how the archive has met these new challenges. An extensive data-replication solution has been worked out to guarantee bit-stream preservation. Thanks to an immediate conversion of incoming data to standards-based formats and to checks at upload time, lifecycle management of all 50 terabytes of data is greatly simplified. A suitable metadata framework, which not only allows users to describe and discover resources but also lets them organize those resources, enables very efficient management of this volume of material. Finally, the Language Archiving Technology software suite allows users to create, manipulate, access and enrich all archived resources, provided they have access permissions.
Citations: 4
Novel methods for query selection and query combination in query-by-example spoken term detection
Javier Tejedor, Igor Szöke, M. Fapšo
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878106
Abstract: Query-by-example (QbE) spoken term detection (STD) is necessary in low-resource scenarios where training material is hardly available and word-based speech recognition systems cannot be employed. We present two novel contributions to QbE STD: the first introduces several criteria for selecting the optimal example used as the query throughout the search system; the second presents a novel feature-level example combination that constructs a more robust query for the search. Experiments on within-language and cross-lingual QbE STD setups show a significant improvement when the query is selected according to an optimal criterion rather than randomly, and a significant improvement when several examples are combined to build the input query compared with using the single best example. They also show performance comparable to that of a state-of-the-art acoustic keyword spotting system.
Citations: 21
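QbE STD systems typically match a spoken query example against the search audio with dynamic time warping (DTW) over frame-level features. The sketch below illustrates DTW plus a naive feature-level combination (frame-wise averaging of equal-length examples; a real system, including the one in this paper, would align the examples first). All data and names are invented for the example:

```python
def dtw(a, b, dist):
    """Dynamic time warping cost between two feature-frame sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(a[i - 1], b[j - 1])
            # best of insertion, deletion, and match moves
            acc[i][j] = c + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]

def combine_examples(examples):
    """Naive feature-level query combination: frame-wise mean of
    equal-length examples."""
    k = len(examples)
    return [tuple(sum(frame[d] for frame in frames) / k
                  for d in range(len(frames[0])))
            for frames in zip(*examples)]

euclid = lambda x, y: sum((p - q) ** 2 for p, q in zip(x, y)) ** 0.5
q1 = [(0.1, 0.9), (0.8, 0.2)]  # two 2-dim frames from one query example
q2 = [(0.0, 1.0), (0.9, 0.1)]  # a second example of the same term
combined = combine_examples([q1, q2])
target = [(0.05, 0.95), (0.85, 0.15)]  # a candidate region in the search audio
cost = dtw(combined, target, euclid)   # lower cost = better match
```

Averaging pulls the combined query toward the shared structure of the examples, which is the intuition behind combining several examples rather than trusting a single one.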
Spoken news queries over the world wide web
Sebastian Stüker, Michael Heck, Katja Renner, A. Waibel
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878115
Abstract: In this paper we present our work in extending the View4You system developed at the Interactive Systems Laboratories (ISL). The View4You system lets the user retrieve automatically found news clips from recorded German broadcast news via natural spoken queries. While modular in design, the architecture has so far required the components to run at least in a common file space. By utilizing Flash technology we turned this single-machine setup into a distributed one that gives access to our news database over the World Wide Web. The client side of our architecture requires only a web browser with the Flash extension in order to record and send the speech of the queries to the servers and to display the retrieved news clips. Our future work will focus on turning the monolingual German system into a multilingual system that provides cross-lingual access and retrieval in multiple languages.
Citations: 1
The ambient spotlight: queryless desktop search from meeting speech
J. Kilgour, J. Carletta, S. Renals
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878112
Abstract: It has recently become possible to record any small meeting using a laptop equipped with a plug-and-play USB microphone array. We show the potential for such recordings in a personal aid that allows project managers to record their meetings and, when reviewing them afterwards through a standard calendar interface, to find relevant documents on their computer. This interface is intended to supplement or replace the textual searches that managers typically perform. The prototype, which relies on meeting speech recognition and topic segmentation, formulates and runs desktop search queries in order to present its results.
Citations: 13
Speaker role recognition to help spontaneous conversational speech detection
Benjamin Bigot, I. Ferrané, J. Pinquier, R. André-Obrecht
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878104
Abstract: In the context of audio indexing, we present our recent contributions to the field of speaker role recognition, especially as applied to conversational speech.
We assume that clues about roles such as Anchor, Journalist or Other exist in temporal, acoustic and prosodic features extracted from the results of speaker segmentation and from the audio files. In this paper, investigations are carried out on the EPAC corpus, which mainly contains conversational documents. First, an automatic clustering approach is used to validate the proposed features and the role definitions. In a second study we propose a hierarchical supervised classification system, and investigate the use of dimensionality-reduction methods as well as feature selection. This system correctly classifies 92% of speaker roles.
Citations: 20
A parallel meeting diarist
G. Friedland, J. Chong, Adam L. Janin
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878114
Abstract: This article presents an application for browsing meeting recordings by speaker, keyword, and pre-defined acoustic events (e.g., laughter), which we call the Meeting Diarist. The goal of the system is to enable browsing of the content with rich metadata in a graphical user interface (GUI) shortly after the end of the meeting, even when the application runs on a contemporary laptop. We therefore developed novel parallel methods for speaker diarization and speech recognition that are optimized to run on multicore and manycore architectures. This paper presents the application and the underlying parallel speaker diarization and speech recognition realizations.
Citations: 6