AI Open | Pub Date: 2021-01-01 | DOI: 10.1016/j.aiopen.2021.06.001
WuDaoCorpora: A super large-scale Chinese corpora for pre-training language models
Sha Yuan, Hanyu Zhao, Zhengxiao Du, Ming Ding, Xiao Liu, Yukuo Cen, Xu Zou, Zhilin Yang, Jie Tang
Volume 2, Pages 65-68
Abstract: Using large-scale training data to build a pre-trained language model (PLM) with a larger volume of parameters can significantly improve downstream tasks. For example, OpenAI trained the GPT-3 model with 175 billion parameters on 570 GB of English training data, enabling downstream applications to be built with only a small number of samples. However, there is a lack of Chinese corpora to support large-scale PLMs. This paper introduces a super large-scale Chinese corpus, WuDaoCorpora, containing about 3 TB of training data and 1.08 trillion Chinese characters. We also release the base version of WuDaoCorpora, containing about 200 GB of training data and 72 billion Chinese characters. As a baseline, we train a Transformer-XL model with 3 billion parameters on the base version to evaluate the corpus's effect. The results show that models trained on this corpus achieve excellent performance on Chinese tasks. The data and model are available at https://data.wudaoai.cn and https://github.com/THUDM/Chinese-Transformer-XL, respectively.

AI Open | Pub Date: 2021-01-01 | DOI: 10.1016/j.aiopen.2021.05.003
Know what you don't need: Single-Shot Meta-Pruning for attention heads
Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Qun Liu, Maosong Sun
Volume 2, Pages 36-42
Abstract: Deep pre-trained Transformer models have achieved state-of-the-art results on a variety of natural language processing (NLP) tasks. Learning rich language knowledge with millions of parameters, these models are usually overparameterized and significantly increase the computational overhead in applications. A natural way to address this issue is model compression. In this work, we propose a method, called Single-Shot Meta-Pruning, to compress deep pre-trained Transformers before fine-tuning. Specifically, we focus on pruning unnecessary attention heads adaptively for different downstream tasks. To measure the informativeness of attention heads, we train our Single-Shot Meta-Pruner (SMP) with a meta-learning paradigm that aims to maintain the distribution of text representations after pruning. Compared with existing compression methods for pre-trained models, our method can reduce the overhead of both fine-tuning and inference. Experimental results show that our pruner can selectively prune 50% of attention heads with little impact on performance on downstream tasks, and can even provide better text representations. The source code is available at https://github.com/thunlp/SMP.

AI Open | Pub Date: 2021-01-01 | DOI: 10.1016/j.aiopen.2021.09.003
Rule-based data augmentation for knowledge graph embedding
Guangyao Li, Zequn Sun, Lei Qian, Qiang Guo, Wei Hu
Volume 2, Pages 186-196
Abstract: Knowledge graph (KG) embedding models suffer from the incompleteness of observed facts. Different from existing solutions that incorporate additional information or employ expressive and complex embedding techniques, we propose to augment KGs by iteratively mining logical rules from the observed facts and then using the rules to generate new relational triples. We incrementally train KG embeddings as new augmented triples arrive, and leverage the embeddings to validate these new triples. To guarantee the quality of the augmented data, we filter out noisy triples based on a propagation mechanism during the validation. The mined rules and rule groundings are human-understandable and make the augmentation procedure reliable. Our KG augmentation framework is applicable to any KG embedding model with no need to modify its embedding technique. Our experiments on two popular embedding-based tasks (i.e., entity alignment and link prediction) show that the proposed framework can bring significant improvement to existing KG embedding models on most benchmark datasets.

AI Open | Pub Date: 2020-01-01 | DOI: 10.1016/j.aiopen.2020.07.001
AI-driven drug discovery: A boon against COVID-19?
Aman Chandra Kaushik, Utkarsh Raj
Volume 1, Pages 1-4
Abstract: COVID-19 is an issue of international concern and a threat to public health, and there is an urgent need for drug and vaccine design. As of July 23, 2020, no vaccine or specific drug has been made for the coronavirus disease (COVID-19); thus, patients can currently only be treated symptomatically. Quickly identifying drugs that have already been used in patients could provide potential therapeutic medication to address the present pandemic before it gets worse. In our view, an artificial intelligence (AI) based tool may predict drugs/peptides directly from the sequences of infected patients; such candidates might thereby have better affinity with the target and contribute towards vaccine design against COVID-19. Researchers across the world have proposed several vaccines/drugs for COVID-19 utilizing AI-based approaches; however, these proposed vaccines/drugs will need testing to verify their safety and feasibility for combating COVID-19.

AI Open | Pub Date: 2020-01-01 | DOI: 10.1016/j.aiopen.2021.02.004
Extracting Events and Their Relations from Texts: A Survey on Recent Research Progress and Challenges
Kang Liu, Yubo Chen, Jian Liu, Xinyu Zuo, Jun Zhao
Volume 1, Pages 22-39
Abstract: Events are a common but non-negligible type of knowledge. Identifying events in texts, extracting their arguments, and even analyzing the relations between different events are important for many applications. This paper summarizes some constructed event-centric knowledge graphs and recent typical approaches for event and event relation extraction, along with task descriptions, widely used evaluation datasets, and challenges. Specifically, in the event extraction task, we mainly focus on three recent important research problems: 1) how to learn textual semantic representations for events in sentence-level event extraction; 2) how to extract relations across sentences or at the document level; 3) how to acquire or augment labeled instances for model training. In event relation extraction, we focus on extraction approaches for three typical event relation types: coreference, causal, and temporal relations. Finally, we give our conclusions and potential future research issues.

AI Open | Pub Date: 2020-01-01 | DOI: 10.1016/j.aiopen.2021.01.001
Graph neural networks: A review of methods and applications
Jie Zhou, Ganqu Cui, Shengding Hu, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, Maosong Sun
Volume 1, Pages 57-81
Abstract: Many learning tasks require dealing with graph data, which contains rich relational information among elements. Modeling physics systems, learning molecular fingerprints, predicting protein interfaces, and classifying diseases all demand a model that learns from graph inputs. In other domains, such as learning from non-structural data like texts and images, reasoning on extracted structures (like the dependency trees of sentences and the scene graphs of images) is an important research topic that also needs graph reasoning models. Graph neural networks (GNNs) are neural models that capture the dependence of graphs via message passing between the nodes of graphs. In recent years, variants of GNNs such as the graph convolutional network (GCN), graph attention network (GAT), and graph recurrent network (GRN) have demonstrated ground-breaking performance on many deep learning tasks. In this survey, we propose a general design pipeline for GNN models, discuss the variants of each component, systematically categorize the applications, and propose four open problems for future research.

AI Open | Pub Date: 2020-01-01 | DOI: 10.1016/j.aiopen.2020.11.001
Neural machine translation: A review of methods, resources, and tools
Zhixing Tan, Shuo Wang, Zonghan Yang, Gang Chen, Xuancheng Huang, Maosong Sun, Yang Liu
Volume 1, Pages 5-21
Abstract: Machine translation (MT) is an important sub-field of natural language processing that aims to translate natural languages using computers. In recent years, end-to-end neural machine translation (NMT) has achieved great success and has become the new mainstream method in practical MT systems. In this article, we first provide a broad review of the methods for NMT and focus on methods relating to architectures, decoding, and data augmentation. Then we summarize the resources and tools that are useful for researchers. Finally, we conclude with a discussion of possible future research directions.

AI Open | Pub Date: 2020-01-01 | DOI: 10.1016/j.aiopen.2021.02.003
User behavior modeling for Web search evaluation
Fan Zhang, Yiqun Liu, Jiaxin Mao, Min Zhang, Shaoping Ma
Volume 1, Pages 40-56
Abstract: Search engines are widely used in our daily life. Batch evaluation of the performance of search systems for their users has always been an essential issue in the field of information retrieval. However, batch evaluation, which usually compares different search systems based on offline collections, cannot directly take users' perception of the systems into consideration. Recently, substantial studies have focused on proposing effective evaluation metrics that model user behavior to bring human factors into the loop of Web search evaluation. In this survey, we comprehensively review the development of user behavior modeling for Web search evaluation and related work on different model-based evaluation metrics. From the overview of these metrics, we can see how the assumptions and modeling methods of user behavior have evolved over time. We also show methods to compare the performance of model-based evaluation metrics in terms of modeling user behavior and measuring user satisfaction. Finally, we briefly discuss some potential future research directions in this field.
