Latest Articles in AI Open

Joint span and token framework for few-shot named entity recognition
AI Open Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.08.009
Wenlong Fang, Yongbin Liu, Chunping Ouyang, Lin Ren, Jiale Li, Yaping Wan
Few-shot Named Entity Recognition (NER) is a challenging task that involves identifying new entity types using a limited number of labeled instances for training. Most current few-shot NER methods are span-based, attending chiefly to the boundary information of candidate-entity spans and to entity-level information. However, these methods often overlook token-level semantic information, which can limit their effectiveness. To address this issue, we propose a novel Joint Span and Token (JST) framework that integrates both the boundary information of an entity and the semantic information of each token that comprises it. The JST framework employs span features to extract the boundary features of an entity and token features to extract the semantic features of each token. Additionally, to reduce the negative impact of the Other class, we introduce a method that separates named entities from the Other class in semantic space, improving the distinction between them. We also use GPT for data augmentation on the support sentences, generating sentences similar to the originals; these augmented sentences increase sample diversity and the reliability of our model. Experimental results on the Few-NERD and SNIPS datasets demonstrate that our model outperforms existing methods.
(AI Open, vol. 4, pp. 111–119)
Citations: 0
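The abstract's joint scoring idea, combining an entity-level (span boundary) similarity with per-token semantic similarities, can be sketched as follows. The additive combination rule, the prototype representations, and all names here are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def joint_score(span_repr, token_reprs, class_proto_span, class_proto_token):
    """Score a candidate entity span for one class by adding an entity-level
    (span boundary) similarity and the mean token-level similarity."""
    span_sim = float(span_repr @ class_proto_span)
    token_sim = float(np.mean(token_reprs @ class_proto_token))
    return span_sim + token_sim

rng = np.random.default_rng(0)
span = rng.normal(size=8)           # boundary-based span representation
tokens = rng.normal(size=(3, 8))    # per-token representations of a 3-token span
score = joint_score(span, tokens, rng.normal(size=8), rng.normal(size=8))
print(round(score, 4))
```

In a few-shot setting the prototypes would typically be averaged from the support-set instances of each class; here they are random placeholders.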
Restricted orthogonal gradient projection for continual learning
AI Open Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.08.010
Zeyuan Yang, Zonghan Yang, Yichen Liu, Peng Li, Yang Liu
Continual learning aims to avoid catastrophic forgetting and to leverage learned experience effectively when mastering new knowledge. Existing gradient projection approaches impose hard constraints on the optimization space for new tasks to minimize interference, which simultaneously hinders forward knowledge transfer. To address this issue, recent methods reuse frozen parameters with a growing network, incurring high computational costs. It thus remains an open question whether forward knowledge transfer can be improved for gradient projection approaches using a fixed network architecture. In this work, we propose the Restricted Orthogonal Gradient prOjection (ROGO) framework. The basic idea is to adopt a restricted orthogonal constraint that allows parameters to be optimized in directions oblique to the whole frozen space, facilitating forward knowledge transfer while consolidating previous knowledge. Our framework requires neither data buffers nor extra parameters. Extensive experiments demonstrate the superiority of our framework over several strong baselines, and we provide theoretical guarantees for our relaxing strategy.
(AI Open, vol. 4, pp. 98–110)
Citations: 0
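The contrast between a strict orthogonal constraint and a relaxed one can be illustrated with plain linear algebra. The `relax` knob below is a simplified stand-in for the paper's restricted constraint, not ROGO's actual relaxation rule:

```python
import numpy as np

def restricted_projection(g, U, relax=0.0):
    """Project gradient g away from the frozen subspace spanned by the
    orthonormal columns of U. With relax=0 this is the strict orthogonal
    projection used by standard gradient projection methods; relax>0 lets
    part of the frozen-space component through, loosening the constraint."""
    frozen = U @ (U.T @ g)
    return g - (1.0 - relax) * frozen

U = np.array([[1.0], [0.0], [0.0]])            # frozen subspace: the x-axis of R^3
g = np.array([2.0, 1.0, -1.0])
print(restricted_projection(g, U))             # strictly orthogonal: (0, 1, -1)
print(restricted_projection(g, U, relax=0.5))  # relaxed, oblique to U: (1, 1, -1)
```

With `relax=0` the update cannot move in any direction old tasks rely on; a small positive value trades a little interference for forward transfer, which is the tension the abstract describes.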
Multi-grained hypergraph interest modeling for conversational recommendation
AI Open Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.10.001
Chenzhan Shang, Yupeng Hou, Wayne Xin Zhao, Yaliang Li, Jing Zhang
A conversational recommender system (CRS) interacts with users through multi-turn natural-language dialogues, aiming to provide high-quality recommendations for a user's instant information need. Although great efforts have been made to develop effective CRSs, most still focus on the contextual information of the current dialogue and therefore suffer from data scarcity. We consider leveraging historical dialogue data to enrich the limited context of the current dialogue session.
In this paper, we propose a novel multi-grained hypergraph interest modeling approach to capture user interest beneath intricate historical data from different perspectives. As the core idea, we employ hypergraphs to represent the complicated semantic relations underlying historical dialogues. We first model users' historical dialogue sessions as a session-based hypergraph, which captures coarse-grained, session-level relations. Second, to alleviate data scarcity, we use an external knowledge graph to construct a knowledge-based hypergraph that captures fine-grained, entity-level semantics. We then conduct multi-grained hypergraph convolution on the two kinds of hypergraphs and use the enhanced representations to build an interest-aware CRS. Extensive experiments on the ReDial and TG-ReDial benchmarks validate the effectiveness of our approach on both the recommendation and conversation tasks. Code is available at: https://github.com/RUCAIBox/MHIM.
(AI Open, vol. 4, pp. 154–164)
Citations: 0
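Hypergraph convolution, the core operation the abstract relies on, amounts to a two-step message passing over the incidence matrix: nodes to hyperedges, then hyperedges back to nodes. A minimal mean-pooling sketch (the paper's actual convolution likely uses learned weights and degree normalizations beyond this):

```python
import numpy as np

def hypergraph_conv(X, H):
    """One round of mean-pooling hypergraph message passing. H is the
    |V| x |E| incidence matrix (H[v, e] = 1 iff node v belongs to
    hyperedge e); X holds one feature row per node."""
    De = H.sum(axis=0)                     # hyperedge degrees
    Dv = H.sum(axis=1)                     # node degrees
    edge_feats = (H / De).T @ X            # mean of member-node features per edge
    return (H @ edge_feats) / Dv[:, None]  # mean over each node's incident edges

# 4 entities, 2 hyperedges: {0, 1, 2} (e.g. one dialogue session) and {2, 3}.
H = np.array([[1, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
X = np.eye(4)                              # one-hot node features
print(hypergraph_conv(X, H))
```

Because both steps are averages, node 2, which sits in both hyperedges, blends information from all four nodes, which is exactly how session-level and entity-level relations get mixed into a single representation.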
Improving task generalization via unified schema prompt
AI Open Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.08.011
Wanjun Zhong, Yifan Gao, Ning Ding, Zhiyuan Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan
Task generalization has been a long-standing challenge in Natural Language Processing (NLP). Recent research attempts to improve the task generalization ability of pre-trained language models by mapping NLP tasks into human-readable prompted forms. However, these approaches require laborious and inflexible manual prompt collection, and different prompts for the same downstream task can yield unstable performance. We propose Unified Schema Prompt, a flexible and extensible prompting method that automatically customizes learnable prompts for each task according to the task's input schema. It models the knowledge shared between tasks while preserving the characteristics of each task schema, thereby enhancing task generalization. The schema prompt uses the explicit data structure of each task to formulate prompts, requiring little human effort. To test the task generalization ability of schema prompts at scale, we conduct schema prompt-based multitask pre-training on a wide variety of general NLP tasks. The framework achieves strong zero-shot and few-shot generalization performance on 16 unseen downstream tasks from 8 task types (e.g., QA, NLI). Furthermore, comprehensive analyses demonstrate the effectiveness of each component of the schema prompt, its flexibility in task compositionality, and its ability to improve performance under full-data fine-tuning.
(AI Open, vol. 4, pp. 120–129)
Citations: 0
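The key idea, deriving the prompt layout mechanically from the task's input schema rather than hand-writing a template per task, can be sketched with plain string slots. The bracket notation and the slot ordering below are illustrative; the paper's schema prompts additionally contain learnable components that plain strings cannot show:

```python
def schema_prompt(task_type, **schema_fields):
    """Render one task instance into a prompt directly from its input
    schema: every schema key becomes a labeled slot, so no task-specific
    template has to be written by hand."""
    parts = [f"[{task_type}]"]
    parts += [f"[{key}] {value}" for key, value in schema_fields.items()]
    parts.append("[answer]")
    return " ".join(parts)

print(schema_prompt("nli",
                    premise="A man is playing a guitar on stage.",
                    hypothesis="Someone is making music."))
print(schema_prompt("qa",
                    context="AI Open is an open-access journal.",
                    question="What kind of journal is AI Open?"))
```

Two tasks with different schemas share the same rendering function, which is the "shared knowledge between tasks, while keeping the characteristics of different task schema" the abstract describes.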
Batch virtual adversarial training for graph convolutional networks
AI Open Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.08.007
Zhijie Deng, Yinpeng Dong, Jun Zhu
We present batch virtual adversarial training (BVAT), a novel regularization method for graph convolutional networks (GCNs). BVAT addresses the fact that GCNs do not ensure smoothness of the model's output distribution under local perturbations of the input node features. We propose two algorithms, sampling-based BVAT and optimization-based BVAT, which promote the output smoothness of GCN classifiers using virtual adversarial perturbations generated either for a subset of independent nodes or, via an elaborate optimization process, for all nodes. Extensive experiments on three citation network datasets (Cora, Citeseer, and Pubmed) and a knowledge graph dataset (Nell) validate the efficacy of the proposed method on semi-supervised node classification tasks.
(AI Open, vol. 4, pp. 73–79)
Citations: 63
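The virtual adversarial perturbation underlying this family of methods is the input direction that most changes the model's output distribution. A small numerical sketch, using a finite-difference gradient as a stand-in for VAT's backprop-based power iteration (the toy model and all names here are illustrative, not BVAT's graph-based procedure):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

def vat_direction(x, predict, xi=1e-3, fd=1e-5, seed=0):
    """Estimate the direction d that most increases
    KL(predict(x) || predict(x + xi*d)): start from a random unit vector,
    take one finite-difference gradient step of the KL, renormalize."""
    rng = np.random.default_rng(seed)
    d = rng.normal(size=x.shape)
    d /= np.linalg.norm(d)
    p, r = predict(x), xi * d
    grad = np.zeros_like(x)
    for i in range(x.size):
        hi, lo = r.copy(), r.copy()
        hi[i] += fd
        lo[i] -= fd
        grad[i] = (kl(p, predict(x + hi)) - kl(p, predict(x + lo))) / (2 * fd)
    return grad / np.linalg.norm(grad)

# Toy classifier whose logits depend only on the first input coordinate,
# so the most sensitive (adversarial) direction lies along that axis.
W = np.array([[3.0, 0.0], [-3.0, 0.0]])
d = vat_direction(np.array([0.1, 0.5]), lambda x: softmax(W @ x))
print(d)  # aligns with the first axis: (+-1, 0)
```

BVAT's contribution on top of this is where the perturbations are computed on a graph: either for a sampled subset of far-apart nodes or jointly for all nodes, so that perturbations of neighboring, receptive-field-sharing nodes do not interfere.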
Sarcasm detection using news headlines dataset
AI Open Pub Date : 2023-01-01 DOI: 10.1016/j.aiopen.2023.01.001
Rishabh Misra, Prahal Arora
Sarcasm has long been an elusive concept for humans, and owing to its interesting linguistic properties, sarcasm detection has gained traction in the Natural Language Processing (NLP) research community in recent years. Predicting sarcasm in text remains difficult for machines, however, and there are limited insights into what makes a sentence sarcastic. Past studies in sarcasm detection use either large-scale datasets collected with tag-based supervision or small manually annotated datasets. The former are noisy in both labels and language, whereas the latter, despite their high-quality labels, lack enough instances to train deep learning models reliably. To overcome these shortcomings, we introduce a high-quality and relatively large-scale dataset: a collection of news headlines from a sarcastic news website and a real news website. We describe the unique aspects of our dataset and compare its characteristics with other benchmark datasets in the sarcasm detection domain. Furthermore, we produce insights into what constitutes sarcasm in text using a hybrid neural network architecture. Since the dataset's first release in 2019, the NLP research community has relied extensively on our contributions to push the state of the art in sarcasm detection, and we dedicate a section to this impact. Lastly, we make both the dataset and the framework implementation publicly available to facilitate continued research in this domain.
(AI Open, vol. 4, pp. 13–18)
Citations: 7
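The dataset's public release is distributed as JSON lines with `headline`, `is_sarcastic`, and `article_link` fields; a minimal loading sketch follows. The two sample rows are invented for illustration, not taken from the dataset:

```python
import json

# Two illustrative records in the dataset's JSON-lines layout.
raw = """\
{"is_sarcastic": 1, "headline": "area man wins argument with himself", "article_link": "https://example.com/a"}
{"is_sarcastic": 0, "headline": "markets close higher after tech rally", "article_link": "https://example.com/b"}
"""

records = [json.loads(line) for line in raw.splitlines()]
labels = [r["is_sarcastic"] for r in records]
headlines = [r["headline"] for r in records]
print(sum(labels), "of", len(records), "headlines are sarcastic")
```

Because the sarcastic headlines come from a satirical outlet written by professionals, the labels are clean by construction, unlike tag-based supervision, where users self-label inconsistently.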
On the distribution alignment of propagation in graph neural networks
AI Open Pub Date : 2022-12-01 DOI: 10.1016/j.aiopen.2022.11.006
Qinkai Zheng, Xiao Xia, Kun Zhang, E. Kharlamov, Yuxiao Dong
(AI Open, pp. 218–228; no abstract available)
Citations: 0
BCA: Bilinear Convolutional Neural Networks and Attention Networks for legal question answering
AI Open Pub Date : 2022-11-01 DOI: 10.1016/j.aiopen.2022.11.002
Haiguang Zhang, Tongyue Zhang, Faxin Cao, Zhizheng Wang, Yuanyu Zhang, Yuanyuan Sun, Mark Anthony Vicente
(AI Open, pp. 172–181; no abstract available)
Citations: 1
HSSDA: Hierarchical relation aided Semi-Supervised Domain Adaptation
AI Open Pub Date : 2022-11-01 DOI: 10.1016/j.aiopen.2022.11.001
Xiechao Guo, R. Liu, Dandan Song
(AI Open, pp. 156–161; no abstract available)
Citations: 0
Optimized separable convolution: Yet another efficient convolution operator
AI Open Pub Date : 2022-10-01 DOI: 10.2139/ssrn.4245175
Tao Wei, Yonghong Tian, Yaowei Wang, Yun Liang, C. Chen
The convolution operation is the most critical component in the recent surge of deep learning research. A conventional 2D convolution needs O(C^2 K^2) parameters, where C is the channel size and K is the kernel size, and this parameter count has become costly as models have grown to meet the needs of demanding applications. Among the various implementations of convolution, separable convolution has proven more efficient at reducing model size: depthwise separable convolution reduces the complexity to O(C·(C + K^2)), while spatial separable convolution reduces it to O(C^2 K). However, these are ad hoc designs that cannot guarantee optimal separation in general. In this research, we propose a novel and principled operator, optimized separable convolution, which chooses the internal number of groups and the kernel sizes of a general separable convolution optimally, achieving a complexity of O(C^{3/2} K). When the restriction on the number of separated convolutions is lifted, an even lower complexity of O(C·log(CK^2)) can be achieved. Experimental results demonstrate that the proposed optimized separable convolution achieves improved accuracy-vs-parameters trade-offs over conventional, depthwise separable, and spatial separable convolutions.
(AI Open, pp. 162–171)
Citations: 2
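The complexities quoted in the abstract translate into concrete parameter counts. A quick worked comparison, up to the constant factors the O-notation hides, for a typical layer with C = 64 channels and a 3 x 3 kernel:

```python
def conv_param_counts(C, K):
    """Parameter counts, up to constant factors, for the convolution
    variants quoted in the abstract (C input/output channels, K x K kernel)."""
    return {
        "conventional":        C * C * K * K,       # O(C^2 K^2)
        "depthwise separable": C * (C + K * K),     # O(C (C + K^2))
        "spatial separable":   C * C * K,           # O(C^2 K)
        "optimized separable": round(C ** 1.5 * K), # O(C^{3/2} K)
    }

for name, n in conv_param_counts(C=64, K=3).items():
    print(f"{name:>19}: {n}")
```

At this size the optimized form (1536) is roughly 24x smaller than the conventional convolution (36864) and about 3x smaller than depthwise separable (4672), which illustrates why the choice of separation matters.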