International Conference on Natural Language and Speech Processing: Latest Papers

Supervised Acoustic Embeddings And Their Transferability Across Languages
International Conference on Natural Language and Speech Processing Pub Date : 2023-01-03 DOI: 10.48550/arXiv.2301.01020
Sreepratha Ram, Hanan Aldarmaki
Abstract: In speech recognition, it is essential to model the phonetic content of the input signal while discarding irrelevant factors such as speaker variations and noise, which is challenging in low-resource settings. Self-supervised pre-training has been proposed as a way to improve both supervised and unsupervised speech recognition, including frame-level feature representations and Acoustic Word Embeddings (AWE) for variable-length segments. However, self-supervised models alone cannot learn perfect separation of the linguistic content as they are trained to optimize indirect objectives. In this work, we experiment with different pre-trained self-supervised features as input to AWE models and show that they work best within a supervised framework. Models trained on English can be transferred to other languages with no adaptation and outperform self-supervised models trained solely on the target languages.
Citations: 2
Arguments to Key Points Mapping with Prompt-based Learning
International Conference on Natural Language and Speech Processing Pub Date : 2022-11-28 DOI: 10.48550/arXiv.2211.14995
Ahnaf Mozib Samin, Behrooz Nikandish, Jingyan Chen
Abstract: Handling and digesting a huge amount of information in an efficient manner has been a long-term demand in modern society. Some solutions to map key points (short textual summaries capturing essential information and filtering redundancies) to a large number of arguments/opinions have been provided recently (Bar-Haim et al., 2020). To complement the full picture of the argument-to-keypoint mapping task, we mainly propose two approaches in this paper. The first approach is to incorporate prompt engineering for fine-tuning the pre-trained language models (PLMs). The second approach utilizes prompt-based learning in PLMs to generate intermediary texts, which are then combined with the original argument-keypoint pairs and fed as inputs to a classifier, thereby mapping them. Furthermore, we extend the experiments to cross/in-domain to conduct an in-depth analysis. In our evaluation, we find that i) using prompt engineering in a more direct way (Approach 1) can yield promising results and improve the performance; ii) Approach 2 performs considerably worse than Approach 1 due to the negation issue of the PLM.
Citations: 0
Semantic Similarity-Based Clustering of Findings From Security Testing Tools
International Conference on Natural Language and Speech Processing Pub Date : 2022-11-20 DOI: 10.48550/arXiv.2211.11057
Phillip Schneider, Markus Voggenreiter, Abdullah Gulraiz, F. Matthes
Abstract: Over the last years, software development in domains with high security demands transitioned from traditional methodologies to uniting modern approaches from software development and operations (DevOps). Key principles of DevOps gained more importance and are now applied to security aspects of software development, resulting in the automation of security-enhancing activities. In particular, it is common practice to use automated security testing tools that generate reports after inspecting a software artifact from multiple perspectives. However, this raises the challenge of generating duplicate security findings. To identify these duplicate findings manually, a security expert has to invest resources like time, effort, and knowledge. A partial automation of this process could reduce the analysis effort, encourage DevOps principles, and diminish the chance of human error. In this study, we investigated the potential of applying Natural Language Processing for clustering semantically similar security findings to support the identification of problem-specific duplicate findings. Towards this goal, we developed a web application for annotating and assessing security testing tool reports and published a human-annotated corpus of clustered security findings. In addition, we performed a comparison of different semantic similarity techniques for automatically grouping security findings. Finally, we assess the resulting clusters using both quantitative and qualitative evaluation methods.
Citations: 0
Scaling Native Language Identification with Transformer Adapters
International Conference on Natural Language and Speech Processing Pub Date : 2022-11-18 DOI: 10.48550/arXiv.2211.10117
Ahmet Uluslu, G. Schneider
Abstract: Native language identification (NLI) is the task of automatically identifying the native language (L1) of an individual based on their language production in a learned language. It is useful for a variety of purposes including marketing, security and educational applications. NLI is usually framed as a multi-label classification task, where numerous designed features are combined to achieve state-of-the-art results. Recently, a deep generative approach based on transformer decoders (GPT-2) outperformed its counterparts and achieved the best results on the NLI benchmark datasets. We investigate this approach to determine the practical implications compared to traditional state-of-the-art NLI systems. We introduce transformer adapters to address memory limitations and improve training/inference speed to scale NLI applications for production.
Citations: 2
Efficient Task-Oriented Dialogue Systems with Response Selection as an Auxiliary Task
International Conference on Natural Language and Speech Processing Pub Date : 2022-08-15 DOI: 10.48550/arXiv.2208.07097
Radostin Cholakov, T. Kolev
Abstract: The adoption of pre-trained language models in task-oriented dialogue systems has resulted in significant enhancements of their text generation abilities. However, these architectures are slow to use because of the large number of trainable parameters and can sometimes fail to generate diverse responses. To address these limitations, we propose two models with auxiliary tasks for response selection - (1) distinguishing distractors from ground truth responses and (2) distinguishing synthetic responses from ground truth labels. They achieve state-of-the-art results on the MultiWOZ 2.1 dataset with combined scores of 107.5 and 108.3 and outperform a baseline with three times more parameters. We publish reproducible code and checkpoints and discuss the effects of applying auxiliary tasks to T5-based architectures.
Citations: 3
Constructing the Corpus of Chinese Textual ‘Run-on’ Sentences (CCTRS): Discourse Corpus Benchmark with Multi-layer Annotations
International Conference on Natural Language and Speech Processing Pub Date : 2021-11-12 DOI: 10.31234/osf.io/jua9g
Kun Sun, Rong Wang
Abstract: Chinese is a discourse-oriented language. “Run-on” sentences (liushui ju) are a typical and prevalent form of discourse in Chinese. These sentences show the capacity of the Chinese language for organizing loose structures into an effective and coherent discourse. Despite their widespread use in Chinese, previous studies have only explored “run-on” sentences by using small-scale examples. In order to carry out a quantitative investigation of “run-on” sentences, we need to establish a corpus. The present study selects 500 “run-on” sentences and annotates them on the levels of discourse, syntax and semantics. We mainly adopt PDTB (Penn Discourse Treebank) styles in the discourse annotations but we also borrow some features from RST (rhetorical structure theory). We find that the distribution of the frequency of discourse relations in the data extracted from this corpus follows the power law. The preliminary results reveal that semantic leaps in “run-on” sentences are closely related to the use of the topic chain and the animacy and the span of discourse relations. This corpus can thus aid in carrying out further computational and cognitive studies of Chinese discourse.
Citations: 0
Evaluation of topic segmentation algorithms on Arabic texts
International Conference on Natural Language and Speech Processing Pub Date : 1900-01-01 DOI: 10.1109/ICNLSP.2018.8374389
Fayçal Nouar, H. Belhadef
Abstract: In this paper, we are interested in the topic segmentation of Arabic texts. For this aim, we evaluate two lexical cohesion-based algorithms, MinCutSeg and BayesSeg, using the Pk and WindowDiff metrics. To assess how well each algorithm works, each was applied to three datasets with longer texts from two different domains: transcribed multi-party conversations and written texts. After adaptation to the Arabic language, the test results show significant differences in performance depending on the types of documents.
Citations: 1