Beyond Adjacency Pairs: Hierarchical Clustering of Long Sequences for Human-Machine Dialogues

M. Maitreyee
Proceedings of the First Workshop on Computational Approaches to Discourse
DOI: 10.18653/v1/2020.codi-1.2 (https://doi.org/10.18653/v1/2020.codi-1.2)
Citations: 1

Abstract

This work proposes a framework to predict sequences in dialogues, using turn-based syntactic features and dialogue control functions. Syntactic features were extracted using dependency parsing, while dialogue control functions were manually labelled. These features were transformed using tf-idf and word embeddings; feature selection was done using Principal Component Analysis (PCA). We ran experiments on six combinations of features to predict sequences with Hierarchical Agglomerative Clustering. An analysis of the clustering results indicates that using word embeddings and syntactic features significantly improved the results.
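
To make the clustering step concrete, the following is a minimal sketch, not the author's implementation. It weights per-turn features with tf-idf and groups the turns with hierarchical agglomerative clustering. The toy turns, the dependency labels, the `dcf=` tags, the smoothed idf formula, the cosine distance, and the average linkage are all assumptions chosen for illustration; the abstract does not specify them. The PCA and word-embedding steps are omitted to keep the example dependency-free.

```python
import math
from collections import Counter

# Hypothetical toy data: each turn is a bag of dependency-relation labels plus
# one made-up dialogue control function tag (prefixed "dcf="). Not from the paper.
TURNS = [
    ["nsubj", "root", "dobj", "dcf=question"],
    ["nsubj", "root", "dobj", "punct", "dcf=question"],
    ["root", "advmod", "dcf=feedback"],
    ["root", "advmod", "intj", "dcf=feedback"],
]


def tfidf(docs):
    """Return one sparse tf-idf vector (a dict) per document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed idf (an assumption) so terms present in every turn keep a weight.
        vectors.append({t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors


def cosine_distance(u, v):
    """1 - cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return 1.0 - dot / norm if norm else 1.0


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of turn indices."""
    clusters = [[i] for i in range(len(vectors))]
    dist = [[cosine_distance(u, v) for v in vectors] for u in vectors]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge the second into the first.
        a, b = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    print(agglomerative(tfidf(TURNS), n_clusters=2))
```

On this toy input the two question-like turns end up in one cluster and the two feedback-like turns in the other. With real data, a library such as scikit-learn would be the more usual choice for the tf-idf, PCA, and clustering stages.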