AI Open: Latest Articles

Semantic graph based topic modelling framework for multilingual fake news detection
AI Open, Vol. 4, Pages 33-41. Pub Date: 2023-01-01. DOI: 10.1016/j.aiopen.2023.08.004
Rami Mohawesh, Xiao Liu, Hilya Mudrika Arini, Yutao Wu, Hui Yin
Abstract: Fake news detection is one of the most alluring problems that has grabbed the interest of Machine Learning (ML) and Natural Language Processing (NLP) experts in recent years. The majority of existing studies on detecting fake news are written in English, restricting its application outside the English-speaking population. The lack of annotated corpora and technologies makes it difficult to identify false news in the scenario of low-resource languages, despite the growth in multilingual web content. Moreover, existing works cannot collect more semantic and contextual characteristics from documents in a particular multilingual text corpus. To bridge up these challenges and deal with the multilingual fake news detection challenge, we develop a new semantic graph attention-based representation learning framework to extract structural and semantic representations of texts. Our experiments on TALLIP fake news datasets showed that the classification performance had been significantly enhanced, ranging from 1% to 7% in terms of accuracy metric, and our proposed framework outperformed the state-of-the-art techniques for the multilingual fake news detection task.
Citations: 2
AdaDS: Adaptive data selection for accelerating pre-trained language model knowledge distillation
AI Open, Vol. 4, Pages 56-63. Pub Date: 2023-01-01. DOI: 10.1016/j.aiopen.2023.08.005
Qinhong Zhou, Peng Li, Yang Liu, Yuyang Guan, Qizhou Xing, Ming Chen, Maosong Sun, Yang Liu
Abstract: Knowledge distillation (KD) is a widely used method for transferring knowledge from large teacher models to computationally efficient student models. Unfortunately, the computational cost of KD becomes unaffordable as pre-trained language models (PLMs) grow larger. Computing KD loss on only part of the training set is a promising way to accelerate KD. However, existing works heuristically leverage only one static data selection strategy during the KD process, demonstrating inconsistent improvements across different distillation scenarios. In this work, we conduct a thorough study on various typical data selection strategies for KD, and show that this problem is due to the fact that the best data selection strategy is specific to various factors, including task, selected data size, and training stage. To automatically adapt to these factors, we propose a framework named AdaDS to learn to choose the data selection strategy adaptively during the KD process. Experimental results show that our proposed method is effective for various tasks and selected data sizes under both fine-tuning and pre-training stages, achieving comparable performance to DistilBERT with only 10% amount of queries to the teacher model.
Citations: 0
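The AdaDS abstract above mentions computing the KD loss on only part of the training set. As a rough illustration only, the sketch below pairs a temperature-softened KL distillation loss with one static selection strategy: keep the examples on which the student is most uncertain, so that the teacher would only need to be queried for those. This is not the AdaDS method, which learns to choose among strategies; the function names, the entropy criterion, and the temperature value are assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_by_student_entropy(student_logits, fraction):
    """Hypothetical static strategy: indices of the most uncertain examples.

    Only student outputs are used, so the teacher is queried afterwards
    for the selected subset alone.
    """
    k = max(1, int(len(student_logits) * fraction))
    scores = [entropy(softmax(row)) for row in student_logits]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Toy batch of student logits for four two-class examples.
student_batch = [[2.0, 0.1], [0.5, 0.4], [3.0, -1.0], [0.2, 0.3]]
selected = select_by_student_entropy(student_batch, fraction=0.5)
print(selected)  # [1, 3]: the two examples with near-uniform predictions
```

Choosing by student entropy is just one of many possible criteria; the abstract's point is that no single static choice works best across tasks, data sizes, and training stages.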
MONEY: Ensemble learning for stock price movement prediction via a convolutional network with adversarial hypergraph model
AI Open, Vol. 4, Pages 165-174. Pub Date: 2023-01-01. DOI: 10.1016/j.aiopen.2023.10.002
Zhongtian Sun, Anoushka Harit, Alexandra I. Cristea, Jingyun Wang, Pietro Lio
Abstract: Stock price prediction is challenging in financial investment, with the AI boom leading to increased interest from researchers. Despite these recent advances, many studies are limited to capturing the time series characteristics of price movement via recurrent neural networks (RNNs) but neglect other critical relevant factors, such as industry, shareholders, and news. On the other hand, graph neural networks have been applied to a broad range of tasks due to their superior performance in capturing complex relations among entities and representation learning. This paper investigates the effectiveness of using graph neural networks for stock price movement prediction. Inspired by a recent study, we capture the complex group-level information (co-movement of similar companies) via hypergraphs. Unlike other hypergraph studies, we also use a graph model to learn pairwise relations. Moreover, we are the first to demonstrate that this simple graph model should be applied before using RNNs, rather than later, as prior research suggested. In this paper, the long-term dependencies of similar companies can be learnt by the next RNNs, which augments their predictability. We also apply adversarial training to capture the stochastic nature of the financial market and enhance the generalisation of the proposed model. Hence, we contribute with a novel ensemble learning framework to predict stock price movement, named MONEY. It is comprised of (a) a Graph Convolution Network (GCN), representing pairwise industry and price information and (b) a hypergraph convolution network for group-oriented information transmission via hyperedges with adversarial training by adding perturbations on inputs before the last prediction layer. Real-world data experiments demonstrate that MONEY significantly outperforms, on average, the state-of-the-art methods and performs particularly well in the bear market.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651023000189/pdfft?md5=40081746293fa3fdc23c059ee4dd4684&pid=1-s2.0-S2666651023000189-main.pdf
Citations: 0
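The MONEY abstract above names a Graph Convolution Network (GCN) as the component that represents pairwise relations. For readers unfamiliar with the idea, here is a minimal sketch of one generic graph convolution layer using the widely known symmetric normalisation with self-loops, relu(D^-1/2 (A + I) D^-1/2 H W). It is not the authors' implementation; the helper names, the toy graph, and the weights are assumptions made for this example.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weight):
    """One generic graph convolution step: relu(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    aggregated = matmul(norm, features)       # mix neighbour features
    projected = matmul(aggregated, weight)    # linear projection
    return [[max(0.0, v) for v in row] for row in projected]  # ReLU


# Two hypothetical companies linked by one pairwise relation, one feature each.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
node_features = [[1.0], [3.0]]
layer_weight = [[1.0]]
print(gcn_layer(adjacency, node_features, layer_weight))  # [[2.0], [2.0]]
```

In the toy graph each node ends up with the average of its own feature and its neighbour's, which is the smoothing behaviour that lets related entities share information.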
Interactive active learning for fairness with partial group label
AI Open, Vol. 4, Pages 175-182. Pub Date: 2023-01-01. DOI: 10.1016/j.aiopen.2023.10.003
Zeyu Yang, Jizhi Zhang, Fuli Feng, Chongming Gao, Qifan Wang, Xiangnan He
Abstract: The rapid development of AI technologies has found numerous applications across various domains in human society. Ensuring fairness and preventing discrimination are critical considerations in the development of AI models. However, incomplete information often hinders the complete collection of sensitive attributes in real-world applications, primarily due to the high cost and potential privacy violations associated with such data collection. Label reconstruction through building another learner on sensitive attributes is a common approach to address this issue. However, existing methods focus solely on improving the prediction accuracy of the sensitive learner as a separate model, while ignoring the disparity between its accuracy and the fairness of the base model. To bridge this gap, this paper proposes an interactive learning framework that aims to optimize the sensitive learner while considering the fairness of the base learner. Furthermore, a new active sampling strategy is developed to select the most valuable data for the sensitive learner regarding the fairness of the base model. The effectiveness of our proposed method in improving model fairness is demonstrated through comprehensive evaluations conducted on various datasets and fairness criteria.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651023000190/pdfft?md5=8647172d4d8f417e44b8c64861c1afd4&pid=1-s2.0-S2666651023000190-main.pdf
Citations: 0
Information Retrieval meets Large Language Models: A strategic report from Chinese IR community
AI Open, Vol. 4, Pages 80-90. Pub Date: 2023-01-01. DOI: 10.1016/j.aiopen.2023.08.001
Qingyao Ai, Ting Bai, Zhao Cao, Yi Chang, Jiawei Chen, Zhumin Chen, Zhiyong Cheng, Shoubin Dong, Zhicheng Dou, Fuli Feng, Shen Gao, Jiafeng Guo, Xiangnan He, Yanyan Lan, Chenliang Li, Yiqun Liu, Ziyu Lyu, Weizhi Ma, Jun Ma, Zhaochun Ren, Xiaofei Zhu
Abstract: The research field of Information Retrieval (IR) has evolved significantly, expanding beyond traditional search to meet diverse user information needs. Recently, Large Language Models (LLMs) have demonstrated exceptional capabilities in text understanding, generation, and knowledge inference, opening up exciting avenues for IR research. LLMs not only facilitate generative retrieval but also offer improved solutions for user understanding, model evaluation, and user-system interactions. More importantly, the synergistic relationship among IR models, LLMs, and humans forms a new technical paradigm that is more powerful for information seeking. IR models provide real-time and relevant information, LLMs contribute internal knowledge, and humans play a central role of demanders and evaluators to the reliability of information services. Nevertheless, significant challenges exist, including computational costs, credibility concerns, domain-specific limitations, and ethical considerations. To thoroughly discuss the transformative impact of LLMs on IR research, the Chinese IR community conducted a strategic workshop in April 2023, yielding valuable insights. This paper provides a summary of the workshop’s outcomes, including the rethinking of IR’s core values, the mutual enhancement of LLMs and IR, the proposal of a novel IR technical paradigm, and open challenges.
Citations: 0
A unified network embedding algorithm for multi-type similarity measures
AI Open, Vol. 4, Pages 64-72. Pub Date: 2023-01-01. DOI: 10.1016/j.aiopen.2023.08.002
Rui Feng, Qi Ding, Weihao Qiu, Xiao Yang, Yang Yang, Chunping Wang
Abstract: Traditional network embedding aims to learn representations by capturing a predefined vertex-to-vertex similarity measure. However, in practice, there are different types of similarity measures (e.g., connectivity and structural similarity), which are appropriate for different downstream applications. Meanwhile, it is hard to select the “best” similarity measure that can mostly benefit the application, considering the required domain knowledge of both application scenario and network science. It sometimes requires to cooperate these similarity measures with each other for achieving better performance. Therefore, automatically integrate multiple types of similarity measures into a uniform network embedding framework is critical to obtain effective vertex representations for a downstream application. In this paper, we address the above problem in social networks, and propose a semi-supervised representation learning algorithm. The general idea of our approach is to impose social influence, which occurs when one’s opinions, emotions, or behaviors are affected by others in a social network. Particularly, we build the connection between a user’s representation vector and the probability of her being influenced by another user to have a particular label (e.g., fraud, personal interest, etc.). We conduct efficient experiments based on six real-world datasets and find a clear improvement of our approach comparing with several state-of-the-art baselines.
Citations: 0
Is Chinese Spelling Check ready? Understanding the correction behavior in real-world scenarios
AI Open, Vol. 4, Pages 183-192. Pub Date: 2023-01-01. DOI: 10.1016/j.aiopen.2023.10.004
Liner Yang, Xin Liu, Tianxin Liao, Zhenghao Liu, Mengyan Wang, Xuezhi Fang, Erhong Yang
Abstract: The task of Chinese Spelling Check (CSC) is crucial for identifying and rectifying spelling errors in Chinese texts. While prior work in this domain has predominantly relied on benchmarks such as SIGHAN for evaluating model performance, these benchmarks often exhibit an imbalanced distribution of spelling errors. They are typically constructed under idealized conditions, presuming the presence of only spelling errors in the input text. This assumption does not hold in real-world scenarios, where spell checkers frequently encounter a mix of spelling and grammatical errors, thereby presenting additional challenges. To address this gap and create a more realistic testing environment, we introduce a high-quality CSC evaluation benchmark named YACSC (Yet Another Chinese Spelling Check Dataset). YACSC is unique in that it includes annotations for both grammatical and spelling errors, rendering it a more reliable benchmark for CSC tasks. Furthermore, we propose a hierarchical network designed to integrate multidimensional information, leveraging semantic and phonetic aspects, as well as the structural forms of Chinese characters, to enhance the detection and correction of spelling errors. Through extensive experiments, we evaluate the limitations of existing CSC benchmarks and illustrate the application of our proposed system in real-world scenarios, particularly as a preliminary stage in writing assistant systems.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651023000207/pdfft?md5=74aa1bdba96c30d73a25c1dde4472205&pid=1-s2.0-S2666651023000207-main.pdf
Citations: 0
A survey on complex factual question answering
AI Open, Vol. 4, Pages 1-12. Pub Date: 2023-01-01. DOI: 10.1016/j.aiopen.2022.12.003
Lingxi Zhang, Jing Zhang, Xirui Ke, Haoyang Li, Xinmei Huang, Zhonghui Shao, Shulin Cao, Xin Lv
Abstract: Answering complex factual questions has drawn a lot of attention. Researchers leverage various data sources to support complex QA, such as unstructured texts, structured knowledge graphs and relational databases, semi-structured web tables, or even hybrid data sources. However, although the ideas behind these approaches show similarity to some extent, there is not yet a consistent strategy to deal with various data sources. In this survey, we carefully examine how complex factual question answering has evolved across various data sources. We list the similarities among these approaches and group them into the analysis–extend–reason framework, despite the various question types and data sources that they focus on. We also address future directions for difficult factual question answering as well as the relevant benchmarks.
Citations: 6
Graph-based methods for cervical cancer segmentation: Advancements, limitations, and future directions
AI Open, Vol. 4, Pages 42-55. Pub Date: 2023-01-01. DOI: 10.1016/j.aiopen.2023.08.006
Nazar Zaki, Wenjian Qin, Anusuya Krishnan
Abstract: Cervical cancer remains a significant health concern worldwide, where precise segmentation of cervical lesions is integral for effective diagnosis and treatment planning. This systematic review critically evaluates the application of graph-based methodologies for cervical cancer segmentation, identifying their potential, drawbacks, and avenues for future development. An exhaustive literature search across Scopus and PubMed databases resulted in 20 pertinent studies. These studies were assessed focusing on their implementation of graph-based techniques for cervical cancer segmentation, the utilized datasets, evaluation metrics, and reported precision levels. The review highlights the progressive strides made in the field, especially regarding the segmentation of intricate, non-convex regions and facilitating the detection and grading of cervical cancer using graph-based methodologies. Nonetheless, several constraints were evident, including a dearth of comparative performance analysis, reliance on high-resolution images, difficulties in specific boundary delineation, and the imperative for additional validation and diversified datasets. The review suggests future work to integrate advanced deep learning strategies for heightened accuracy, formulate hybrid methodologies to counteract existing limitations, and explore multi-modal fusion to boost segmentation precision. Emphasizing the explainability and interpretability of outcomes also stands paramount. Lastly, addressing critical challenges such as scarcity of annotated data, the need for real-time and interactive segmentation, and the segmentation of multiple objects or regions of interest remains a crucial frontier for future endeavors.
Citations: 1
Word sense induction with agglomerative clustering and mutual information maximization
AI Open, Vol. 4, Pages 193-201. Pub Date: 2023-01-01. DOI: 10.1016/j.aiopen.2023.12.001
Hadi Abdine, Moussa Kamal Eddine, Davide Buscaldi, Michalis Vazirgiannis
Abstract: Word sense induction (WSI) is a challenging problem in natural language processing that involves the unsupervised automatic detection of a word’s senses (i.e., meanings). Recent work achieves significant results on the WSI task by pre-training a language model that can exclusively disambiguate word senses. In contrast, others employ off-the-shelf pre-trained language models with additional strategies to induce senses. This paper proposes a novel unsupervised method based on hierarchical clustering and invariant information clustering (IIC). The IIC loss is used to train a small model to optimize the mutual information between two vector representations of a target word occurring in a pair of synthetic paraphrases. This model is later used in inference mode to extract a higher-quality vector representation to be used in the hierarchical clustering. We evaluate our method on two WSI tasks and in two distinct clustering configurations (fixed and dynamic number of clusters). We empirically show that our approach is at least on par with the state-of-the-art baselines, outperforming them in several configurations. The code and data to reproduce this work are available to the public.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666651023000232/pdfft?md5=a0553e94f2fab365fb751bcc0ddf8e6c&pid=1-s2.0-S2666651023000232-main.pdf
Citations: 0
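The word sense induction abstract above relies on an invariant information clustering (IIC) loss that maximises the mutual information between two representations of the same target word. As a hedged sketch of that objective in its commonly described form, the code below builds a joint distribution from paired soft cluster assignments, symmetrises it, and returns the mutual information; a training loss would be its negative. The function name, the `eps` threshold, and the toy inputs are assumptions made for this example, and the code is not taken from the paper's released implementation.

```python
import math


def iic_mutual_information(z, z_prime, eps=1e-12):
    """Mutual information between paired soft cluster assignments.

    z and z_prime are lists of probability vectors (one per example) for
    the two views of each example, e.g. a word in two paraphrases.
    """
    n = len(z)
    c = len(z[0])
    # Joint distribution over cluster pairs, averaged over the batch.
    joint = [[0.0] * c for _ in range(c)]
    for a, b in zip(z, z_prime):
        for i in range(c):
            for j in range(c):
                joint[i][j] += a[i] * b[j] / n
    # Symmetrise, since the order of the two views is arbitrary.
    joint = [[(joint[i][j] + joint[j][i]) / 2.0 for j in range(c)] for i in range(c)]
    row = [sum(joint[i]) for i in range(c)]
    col = [sum(joint[i][j] for i in range(c)) for j in range(c)]
    mi = 0.0
    for i in range(c):
        for j in range(c):
            p = joint[i][j]
            if p > eps:  # skip empty cells to avoid log(0)
                mi += p * math.log(p / (row[i] * col[j]))
    return mi


# Two views that always agree and use both clusters equally often.
agreeing = [[1.0, 0.0], [0.0, 1.0]]
print(iic_mutual_information(agreeing, agreeing))  # log(2), about 0.693

# Uninformative, uniform assignments carry no mutual information.
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(iic_mutual_information(uniform, uniform))  # 0.0
```

The two toy cases show why the objective is useful for clustering: it is highest when the two views agree and the clusters are used evenly, and it vanishes when assignments are uninformative.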