Data Intelligence: Latest Articles (Page 5)

A Survey on Automatic Delineation of Radiotherapy Target Volume based on Machine Learning

IF 3.9, Zone 3 (Computer Science)

Data Intelligence Pub Date : 2023-02-11 DOI: 10.1162/dint_a_00204

Zhenchao Tao, Shengfei Lyu

Citations: 0

Auto Insurance Fraud Detection with Multimodal Learning

IF 3.9, Zone 3 (Computer Science)

Data Intelligence Pub Date : 2023-02-09 DOI: 10.1162/dint_a_00191

Jiaxi Yang, Kui Chen, Kai Ding, Chongning Na, Meng Wang

Citations: 0

Research e-infrastructures for open science: The national example of CSTCloud in China

IF 3.9, Zone 3 (Computer Science)

Data Intelligence Pub Date : 2023-02-09 DOI: 10.1162/dint_a_00196

Lili Zhang, Jianhui Li, P. Uhlir, Liangming Wen, Kaichao Wu, Ze Luo, Yude Liu

Citations: 0

Towards Text-to-SQL over Aggregate Tables

IF 3.9, Zone 3 (Computer Science)

Data Intelligence Pub Date : 2023-02-09 DOI: 10.1162/dint_a_00194

Shuqin Li, Kaibin Zhou, Zeyang Zhuang, Haofen Wang, Jun Ma

Abstract: Text-to-SQL aims to translate textual questions into the corresponding SQL queries. Aggregate tables are widely created for high-frequency queries. Although text-to-SQL has emerged as an important task, recent studies have paid little attention to the task over aggregate tables. The growing number of aggregate tables brings two challenges: (1) the mapping between natural language questions and relational databases suffers from more ambiguity; (2) modern models usually adopt a self-attention mechanism to encode the database schema and the question, and this mechanism has quadratic time complexity, which makes inference more time-consuming as the input sequence length grows. In this paper, we introduce a novel approach named WAGG for text-to-SQL over aggregate tables. To select effectively among ambiguous items, we propose a relation selection mechanism for relation computing. To deal with high computation costs, we introduce a dynamical pruning strategy to discard unrelated items, which are common for aggregate tables. We also construct a new large-scale dataset, SpiderwAGG, extended from the Spider dataset for validation; extensive experiments show the effectiveness and efficiency of the proposed method, with a 4% increase in accuracy and a 15% decrease in inference time relative to a strong baseline, RAT-SQL.

Data Intelligence, vol. 5, pp. 457-474

Citations: 0

Metadata as a Methodological Commons: From Aboutness Description to Cognitive Modeling

IF 3.9, Zone 3 (Computer Science)

Data Intelligence Pub Date : 2023-02-07 DOI: 10.1162/dint_a_00189

Wei Liu, Yaming Fu, Qianqian Liu

Abstract: Metadata is data about data, generated mainly for organizing and describing resources, and it facilitates finding, identifying, selecting and obtaining information. With the advancement of technologies, the acquisition of metadata has gradually become a critical step in data modeling and function operation, which has led to the formation of its methodological commons. A series of general operations has been developed to achieve structured description, semantic encoding and machine-understandable information, including entity definition, relation description, object analysis, attribute extraction, ontology modeling, data cleaning, disambiguation, alignment, mapping, relating, enriching, importing, exporting, service implementation, registry and discovery, and monitoring. These operations are not only necessary elements of semantic technologies (including linked data) and knowledge graph technology, but have also developed into the common operations and primary strategy for building independent, knowledge-based information systems. In this paper, a series of metadata-related methods are collectively referred to as the ‘metadata methodological commons’, whose many best practices are reflected in the various standard specifications of the Semantic Web. In the future construction of a multi-modal metaverse based on Web 3.0, it will play an important role, for example in building digital twins by adopting knowledge models or in supporting the modeling of the entire virtual world. Manual description and coding obviously cannot adapt to content production based on UGC (User Generated Content) and AIGC (AI Generated Content) in the metaverse era. The automatic processing of semantic formalization must be considered a sure way to adapt the metadata methodological commons to the future needs of the AI era.

Data Intelligence, vol. 5, pp. 289-302

Citations: 2

Few-shot Named Entity Recognition with Joint Token and Sentence Awareness

IF 3.9, Zone 3 (Computer Science)

Data Intelligence Pub Date : 2023-01-09 DOI: 10.1162/dint_a_00195

Wen Wen, Yongbin Liu, Qiang Lin, Chunping Ouyang

Citations: 0

MillenniumDB: An Open-Source Graph Database System

Zone 3 (Computer Science)

Data Intelligence Pub Date : 2023-01-01 DOI: 10.1162/dint_a_00229

Domagoj Vrgoč, Carlos Rojas, Renzo Angles, Marcelo Arenas, Diego Arroyuelo, Carlos Buil-Aranda, Aidan Hogan, Gonzalo Navarro, Cristian Riveros, Juan Romero

Abstract: In this systems paper, we present MillenniumDB: a novel graph database engine that is modular, persistent, and open source. MillenniumDB is based on a graph data model, which we call domain graphs, that provides a simple abstraction upon which a variety of popular graph models can be supported, thus providing a flexible data management engine for diverse types of knowledge graphs. The engine itself is founded on a combination of tried and tested techniques from relational data management, state-of-the-art algorithms for worst-case-optimal joins, and graph-specific algorithms for evaluating path queries. In this paper, we present the main design principles underlying MillenniumDB, describing the abstract graph model and query semantics supported, the concrete data model and query syntax implemented, as well as the storage, indexing, query planning and query evaluation techniques used. We evaluate MillenniumDB over real-world data and queries from the Wikidata knowledge graph, where we find that it outperforms other popular persistent graph database engines (including both enterprise and open source alternatives) that support similar query features.

Citations: 1

A Knowledge Graph-Based Deep Learning Framework for Efficient Content Similarity Search of Sustainable Development Goals Data

Zone 3 (Computer Science)

Data Intelligence Pub Date : 2023-01-01 DOI: 10.1162/dint_a_00230

Irene Kilanioti, George A. Papadopoulos

Abstract: Sustainable development denotes the enhancement of living standards in the present without compromising future generations’ resources. Sustainable Development Goals (SDGs) quantify the accomplishment of sustainable development and pave the way for a world worth living in for future generations. Scholars can contribute to the achievement of the SDGs by guiding the actions of practitioners based on the analysis of SDG data, as intended by this work. We propose a framework of algorithms based on dimensionality reduction methods with the use of Hilbert Space Filling Curves (HSFCs) in order to semantically cluster new uncategorised SDG data and novel indicators, and efficiently place them in the environment of a distributed knowledge graph store. First, we describe a framework of algorithms for inserting new indicators and projecting them onto the HSFC based on their transformer-based similarity assessment, and for retrieving indicators and load-balancing, along with an approach for classifying entrant indicators. Then, a thorough case study in a distributed knowledge graph environment experimentally evaluates our framework. The results are presented and discussed in light of theory, along with the actual impact they can have for practitioners analysing SDG data, including intergovernmental organizations, government agencies and social welfare organizations. Our approach empowers SDG knowledge graphs for causal analysis, inference, and manifold interpretations of the societal implications of SDG-related actions, as data are accessed in reduced retrieval times. It facilitates quicker measurement of the influence of users and communities on specific goals and serves for faster distributed knowledge matching, as the semantic cohesion of data is preserved.

Citations: 0

Research on Intelligent Organization and Application of Multi-source Heterogeneous Knowledge Resources for Energy Internet

IF 3.9, Zone 3 (Computer Science)

Data Intelligence Pub Date : 2023-01-01 DOI: 10.1162/dint_a_00158

Yuxuan Wang, Liqun Luo, Guangjian Li

Citations: 1

Knowledge Graph based Mutual Attention for Machine Reading Comprehension over Anti-Terrorism Corpus

Zone 3 (Computer Science)

Data Intelligence Pub Date : 2023-01-01 DOI: 10.1162/dint_a_00210

Feng Gao, Jin Hou, Jinguang Gu, Lihua Zhang

Citations: 0