{"title":"Few-Shot Multi-label Aspect Category Detection Utilizing Prototypical Network with Sentence-Level Weighting and Label Augmentation","authors":"Zeyu Wang, M. Iwaihara","doi":"10.1007/978-3-031-39821-6_30","DOIUrl":"https://doi.org/10.1007/978-3-031-39821-6_30","url":null,"abstract":"","PeriodicalId":334566,"journal":{"name":"International Conference on Database and Expert Systems Applications","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132649812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PrivSketch: A Private Sketch-based Frequency Estimation Protocol for Data Streams","authors":"Ying Li, Xiaodong Lee, Botao Peng, Themis Palpanas, Jing'an Xue","doi":"10.48550/arXiv.2306.12144","DOIUrl":"https://doi.org/10.48550/arXiv.2306.12144","url":null,"abstract":"Local differential privacy (LDP) has recently become a popular privacy-preserving data collection technique protecting users' privacy. The main problem of data stream collection under LDP is the poor utility due to multi-item collection from a very large domain. This paper proposes PrivSketch, a high-utility frequency estimation protocol taking advantage of sketches, suitable for private data stream collection. Combining the proposed background information and a decode-first collection-side workflow, PrivSketch improves the utility by reducing the errors introduced by the sketching algorithm and the privacy budget utilization when collecting multiple items. We analytically prove the superior accuracy and privacy characteristics of PrivSketch, and also evaluate them experimentally. Our evaluation, with several diverse synthetic and real datasets, demonstrates that PrivSketch is 1-3 orders of magnitude better than the competitors in terms of utility in both frequency estimation and frequent item estimation, while being up to ~100x faster.","PeriodicalId":334566,"journal":{"name":"International Conference on Database and Expert Systems Applications","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126296479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Confidential Truth Finding with Multi-Party Computation (Extended Version)","authors":"Angelo Saadeh, P. Senellart, S. Bressan","doi":"10.48550/arXiv.2305.14727","DOIUrl":"https://doi.org/10.48550/arXiv.2305.14727","url":null,"abstract":"Federated knowledge discovery and data mining are challenged to assess the trustworthiness of data originating from autonomous sources while protecting confidentiality and privacy. Truth-finding algorithms help corroborate data from disagreeing sources. For each query it receives, a truth-finding algorithm predicts a truth value of the answer, possibly updating the trustworthiness factor of each source. Few works, however, address the issues of confidentiality and privacy. We devise and present a secure secret-sharing-based multi-party computation protocol for pseudo-equality tests that are used in truth-finding algorithms to compute additions depending on a condition. The protocol guarantees confidentiality of the data and privacy of the sources. We also present variants of truth-finding algorithms that would make the computation faster when executed using secure multi-party computation. We empirically evaluate the performance of the proposed protocol on two state-of-the-art truth-finding algorithms, Cosine, and 3-Estimates, and compare them with that of the baseline plain algorithms. The results confirm that the secret-sharing-based secure multi-party algorithms are as accurate as the corresponding baselines but for proposed numerical approximations that significantly reduce the efficiency loss incurred.","PeriodicalId":334566,"journal":{"name":"International Conference on Database and Expert Systems Applications","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130210603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PBRE: A Rule Extraction Method from Trained Neural Networks Designed for Smart Home Services","authors":"Mingming Qiu, Elie Najm, Rémi Sharrock, Bruno Traverson","doi":"10.48550/arXiv.2207.08814","DOIUrl":"https://doi.org/10.48550/arXiv.2207.08814","url":null,"abstract":"Designing smart home services is a complex task when multiple services with a large number of sensors and actuators are deployed simultaneously. It may rely on knowledge-based or data-driven approaches. The former can use rule-based methods to design services statically, and the latter can use learning methods to discover inhabitants' preferences dynamically. However, neither of these approaches is entirely satisfactory because rules cannot cover all possible situations that may change, and learning methods may make decisions that are sometimes incomprehensible to the inhabitant. In this paper, PBRE (Pedagogic Based Rule Extractor) is proposed to extract rules from learning methods to realize dynamic rule generation for smart home systems. The expected advantage is that both the explainability of rule-based methods and the dynamicity of learning methods are adopted. We compare PBRE with an existing rule extraction method, and the results show better performance of PBRE. We also apply PBRE to extract rules from a smart home service represented by an NRL (Neural Network-based Reinforcement Learning). The results show that PBRE can help the NRL-simulated service to make understandable suggestions to the inhabitant.","PeriodicalId":334566,"journal":{"name":"International Conference on Database and Expert Systems Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129498541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-Supervised Learning for Building Damage Assessment from Large-scale xBD Satellite Imagery Benchmark Datasets","authors":"Zaishuo Xia, Zelin Li, Yanbing Bai, Jinze Yu, B. Adriano","doi":"10.48550/arXiv.2205.15688","DOIUrl":"https://doi.org/10.48550/arXiv.2205.15688","url":null,"abstract":"In the field of post-disaster assessment, for timely and accurate rescue and localization after a disaster, people need to know the location of damaged buildings. In deep learning, some scholars have proposed methods to make automatic and highly accurate building damage assessments by remote sensing images, which are proved to be more efficient than assessment by domain experts. However, due to the lack of a large amount of labeled data, these kinds of tasks can suffer from being able to do an accurate assessment, as the efficiency of deep learning models relies highly on labeled data. Although existing semi-supervised and unsupervised studies have made breakthroughs in this area, none of them has completely solved this problem. Therefore, we propose adopting a self-supervised comparative learning approach to address the task without the requirement of labeled data. We constructed a novel asymmetric twin network architecture and tested its performance on the xBD dataset. Experiment results of our model show the improvement compared to baseline and commonly used methods. We also demonstrated the potential of self-supervised methods for building damage recognition awareness.","PeriodicalId":334566,"journal":{"name":"International Conference on Database and Expert Systems Applications","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124493255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepCore: A Comprehensive Library for Coreset Selection in Deep Learning","authors":"Chengcheng Guo, B. Zhao, Yanbing Bai","doi":"10.48550/arXiv.2204.08499","DOIUrl":"https://doi.org/10.48550/arXiv.2204.08499","url":null,"abstract":"Coreset selection, which aims to select a subset of the most informative training samples, is a long-standing learning problem that can benefit many downstream tasks such as data-efficient learning, continual learning, neural architecture search, active learning, etc. However, many existing coreset selection methods are not designed for deep learning, which may have high complexity and poor generalization performance. In addition, the recently proposed methods are evaluated on models, datasets, and settings of different complexities. To advance the research of coreset selection in deep learning, we contribute a comprehensive code library, namely DeepCore, and provide an empirical study on popular coreset selection methods on CIFAR10 and ImageNet datasets. Extensive experiments on CIFAR10 and ImageNet datasets verify that, although various methods have advantages in certain experiment settings, random selection is still a strong baseline.","PeriodicalId":334566,"journal":{"name":"International Conference on Database and Expert Systems Applications","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126298249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Syntax-informed Question Answering with Heterogeneous Graph Transformer","authors":"Fangyi Zhu, Lok You Tan, See-Kiong Ng, S. Bressan","doi":"10.48550/arXiv.2204.09655","DOIUrl":"https://doi.org/10.48550/arXiv.2204.09655","url":null,"abstract":"Large neural language models are steadily contributing state-of-the-art performance to question answering and other natural language and information processing tasks. These models are expensive to train. We propose to evaluate whether such pre-trained models can benefit from the addition of explicit linguistics information without requiring retraining from scratch. We present a linguistics-informed question answering approach that extends and fine-tunes a pre-trained transformer-based neural language model with symbolic knowledge encoded with a heterogeneous graph transformer. We illustrate the approach by the addition of syntactic information in the form of dependency and constituency graphic structures connecting tokens and virtual vertices. A comparative empirical performance evaluation with BERT as its baseline and with Stanford Question Answering Dataset demonstrates the competitiveness of the proposed approach. We argue, in conclusion and in the light of further results of preliminary experiments, that the approach is extensible to further linguistics information including semantics and pragmatics.","PeriodicalId":334566,"journal":{"name":"International Conference on Database and Expert Systems Applications","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128964564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Digitalization of Bioassays in the Open Research Knowledge Graph","authors":"J. D’Souza, A. Monteverdi, Muhammad Haris, M. Anteghini, K. Farfar, M. Stocker, V. A. M. D. Santos, S. Auer","doi":"10.48550/arXiv.2203.14574","DOIUrl":"https://doi.org/10.48550/arXiv.2203.14574","url":null,"abstract":"Background: Recent years are seeing a growing impetus in the semantification of scholarly knowledge at the fine-grained level of scientific entities in knowledge graphs. The Open Research Knowledge Graph (ORKG) https://www.orkg.org/ represents an important step in this direction, with thousands of scholarly contributions as structured, fine-grained, machine-readable data. There is a need, however, to engender change in traditional community practices of recording contributions as unstructured, non-machine-readable text. For this in turn, there is a strong need for AI tools designed for scientists that permit easy and accurate semantification of their scholarly contributions. We present one such tool, ORKG-assays. Implementation: ORKG-assays is a freely available AI micro-service in ORKG written in Python designed to assist scientists obtain semantified bioassays as a set of triples. It uses an AI-based clustering algorithm which on gold-standard evaluations over 900 bioassays with 5,514 unique property-value pairs for 103 predicates shows competitive performance. Results and Discussion: As a result, semantified assay collections can be surveyed on the ORKG platform via tabulation or chart-based visualizations of key property values of the chemicals and compounds offering smart knowledge access to biochemists and pharmaceutical researchers in the advancement of drug development.","PeriodicalId":334566,"journal":{"name":"International Conference on Database and Expert Systems Applications","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130162313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BERT-Based Sentiment Analysis: A Software Engineering Perspective","authors":"Himanshu Batra, N. Punn, S. K. Sonbhadra, Sonali Agarwal","doi":"10.1007/978-3-030-86472-9_13","DOIUrl":"https://doi.org/10.1007/978-3-030-86472-9_13","url":null,"abstract":"","PeriodicalId":334566,"journal":{"name":"International Conference on Database and Expert Systems Applications","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115461884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Querying collections of tree-structured records in the presence of within-record referential constraints","authors":"F. Afrati, M. Damigos","doi":"10.1007/978-3-030-86472-9_26","DOIUrl":"https://doi.org/10.1007/978-3-030-86472-9_26","url":null,"abstract":"","PeriodicalId":334566,"journal":{"name":"International Conference on Database and Expert Systems Applications","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115069741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}