Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services: Latest Articles

A Comparison of Two Database Partitioning Approaches that Support Taxonomy-Based Query Answering
J. Schäfer, L. Wiese
{"title":"A Comparison of Two Database Partitioning Approaches that Support Taxonomy-Based Query Answering","authors":"J. Schäfer, L. Wiese","doi":"10.1145/3428757.3429108","DOIUrl":"https://doi.org/10.1145/3428757.3429108","url":null,"abstract":"In this paper we address the topic of identification of cohorts of similar patients in a database of electronic health records. We follow the conjecture that retrieval of similar patients can be supported by an underlying distributed database design. Hence we propose a fragmentation based on partitioning the health records and present a benchmark of two implementation variants in comparison to an off-the-shelf data distribution approach provided by Apache Ignite. While our main use case in this paper is cohort identification, our approach has advantages for taxonomy-based query answering in other (non-medical) domains.","PeriodicalId":212557,"journal":{"name":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115027186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services
{"title":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","authors":"","doi":"10.1145/3428757","DOIUrl":"https://doi.org/10.1145/3428757","url":null,"abstract":"","PeriodicalId":212557,"journal":{"name":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128229139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
KNNAC
Yao Zhang, Yifeng Lu, Thomas Seidl
{"title":"KNNAC","authors":"Yao Zhang, Yifeng Lu, Thomas Seidl","doi":"10.1145/3428757.3429135","DOIUrl":"https://doi.org/10.1145/3428757.3429135","url":null,"abstract":"Density-based clustering algorithms are commonly adopted when arbitrarily shaped clusters exist. Usually, they do not need to know the number of clusters in prior, which is a big advantage. Conventional density-based approaches such as DBSCAN, utilize two parameters to define density. Recently, novel density-based clustering algorithms are proposed to reduce the problem complexity to the use of a single parameter k by utilizing the concepts of k Nearest Neighbor (kNN) and Reverse k Nearest Neighbor (RkNN) to define density. However, those kNN-based approaches are either ineffective or inefficient. In this paper, we present a new clustering algorithm KNNAC, which only requires computing the densities for a chosen subset of points due to the use of active core detection. We empirically show that, compared to other nearest neighbor based clustering approaches (e.g., RECORD, IS-DBSCAN, etc.), KNNAC can provide competitive performance while taking a fraction of the runtime.","PeriodicalId":212557,"journal":{"name":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129404744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A new Multi-Agents System based on Blockchain for Prediction Anomaly from System Logs
Arwa Binlashram, Hajer Bouricha, L. Hsairi, Haneen Al Ahmadi
{"title":"A new Multi-Agents System based on Blockchain for Prediction Anomaly from System Logs","authors":"Arwa Binlashram, Hajer Bouricha, L. Hsairi, Haneen Al Ahmadi","doi":"10.1145/3428757.3429149","DOIUrl":"https://doi.org/10.1145/3428757.3429149","url":null,"abstract":"The execution traces generated by an application contain information that the developers believed would be useful in debugging or monitoring the application; they contain application states and significant events at various critical points that help them gain insight into failures and identify and predict potential problems before they occur. Despite the ubiquity of these traces in almost all computer systems, they are rarely exploited because they are not readily machine-parsable. In this paper, we propose a Multi-Agents approach for prediction process using Blockchain technology, which allows automatic analysis of execution traces and detects early warning signals for system failure prediction during execution. The proposed prediction approach is constructed using a four-layer Multi-Agents system architecture. The proposed prediction approach performance is based on data preprocessing and supervised learning algorithms for prediction. Blockchain was used to coordinate collaboration between agents, and to synchronize prediction between agents and the administrators. We validated our approach by applying it to real-world distributed systems, where we predicted problems before they occurred with high accuracy. 
In this paper we will focus on the Architecture of our prediction approach.","PeriodicalId":212557,"journal":{"name":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132863325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Analysis and Comparison of Block-Splitting-Based Load Balancing Strategies for Parallel Entity Resolution
Xiao Chen, Nishanth Entoor Venkatarathnam, Kirity Rapuru, David Broneske, Gabriel Campero Durand, Roman Zoun, G. Saake
{"title":"Analysis and Comparison of Block-Splitting-Based Load Balancing Strategies for Parallel Entity Resolution","authors":"Xiao Chen, Nishanth Entoor Venkatarathnam, Kirity Rapuru, David Broneske, Gabriel Campero Durand, Roman Zoun, G. Saake","doi":"10.1145/3428757.3429140","DOIUrl":"https://doi.org/10.1145/3428757.3429140","url":null,"abstract":"Entity resolution (ER) is a process to identify records that refer to the same real-world entity. In recent years, facing the ever-increasing data volume, both blocking techniques and parallel computation have been proposed for ER to reduce its running time and improve efficiency. It is popular and convenient to apply the MapReduce programming model for parallel computation. With the default load balancing strategy, if the block sizes are skewed, an imbalanced reducer load will occur and significantly increase the runtime. One possible solution is block-splitting: breaking the overpopulated blocks into smaller sub-blocks, to improve efficiency. In this paper we analyze the advantages and disadvantages of state-of-the-art block splitting methods (BlockSplit and BlockSlicer), and we propose two approaches: TLS and BOS to overcome the identified drawbacks. We comprehensively evaluate and compare our proposed solutions, with Spark implementations, using real-world and synthetic datasets with different properties. The results show that all of them can balance the reducer load with the help of the greedy partition assignment strategy. When memory of used cluster is not abundant given a dataset, a high number of reducers is required to reduce the GC time to improve efficiency. 
Particularly, our TLS and BOS have overwhelmingly lower overhead due to the ability of block-wise composite key assignment.","PeriodicalId":212557,"journal":{"name":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126964513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
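The entry above describes block-splitting followed by a greedy partition assignment. The following is a minimal sketch of that general idea only, not the BlockSplit, BlockSlicer, TLS, or BOS algorithms from the paper. It assumes a block of n records costs n(n-1)/2 comparisons, that an oversized block is cut into near-equal sub-blocks, and that match tasks are handed largest-first to the least-loaded reducer. The names `split_block`, `greedy_assign`, and `max_size` are hypothetical.

```python
import heapq


def pair_count(n):
    """Number of pairwise comparisons within a block of n records."""
    return n * (n - 1) // 2


def split_block(block_id, n, max_size):
    """Split one block into match tasks, mapping task key -> comparison count.

    Hypothetical scheme: sub-blocks of near-equal size, one task per
    sub-block plus one task per pair of sub-blocks, so that the total
    number of comparisons is unchanged.
    """
    if n <= max_size:
        return {(block_id,): pair_count(n)}
    k = -(-n // max_size)  # ceiling division: number of sub-blocks
    base, extra = divmod(n, k)
    sizes = [base + (1 if i < extra else 0) for i in range(k)]
    tasks = {}
    for i in range(k):
        tasks[(block_id, i)] = pair_count(sizes[i])  # within sub-block i
        for j in range(i + 1, k):
            tasks[(block_id, i, j)] = sizes[i] * sizes[j]  # across i and j
    return tasks


def greedy_assign(tasks, num_reducers):
    """Give each task, largest first, to the currently least-loaded reducer."""
    heap = [(0, r) for r in range(num_reducers)]  # (load, reducer index)
    heapq.heapify(heap)
    assignment = {r: [] for r in range(num_reducers)}
    for key, cost in sorted(tasks.items(), key=lambda kv: -kv[1]):
        load, r = heapq.heappop(heap)
        assignment[r].append(key)
        heapq.heappush(heap, (load + cost, r))
    loads = [0] * num_reducers
    for load, r in heap:
        loads[r] = load
    return assignment, loads


# Example with made-up block sizes: one skewed block and two small ones.
blocks = {"a": 10, "b": 3, "c": 2}
tasks = {}
for block_id, n in blocks.items():
    tasks.update(split_block(block_id, n, max_size=4))
assignment, loads = greedy_assign(tasks, num_reducers=2)
print(loads)
```

Without splitting, the skewed block would land on a single reducer as one indivisible task; splitting it first gives the greedy step smaller pieces to spread across reducers.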
A Patten Matcher for English Idioms on Web IndeX
Takumi Shinzato, Jun Nemoto, Motomichi Toyama
{"title":"A Patten Matcher for English Idioms on Web IndeX","authors":"Takumi Shinzato, Jun Nemoto, Motomichi Toyama","doi":"10.1145/3428757.3429136","DOIUrl":"https://doi.org/10.1145/3428757.3429136","url":null,"abstract":"Web Index (WIX in short) is a system that achieves joining information resources on the Web. WIX replaces keywords in Web documents with hyperlinks to other web pages based on a WIX file that a user chose. A WIX file is a kind of dictionary that has a set of WIX entries (keyword and target URL). Using WIX, users can join any Web contents and arbitrary dictionaries. In conventional WIX, matching and linking are executed only for fixed character strings between the keyword set and the input text. However, when a user wants to search for phrases like idioms, this matching system is not sufficient because of the declension of words, change of the verb tense, and so on. Therefore, we propose a phrasal pattern matching mechanism on WIX. This helps users easily find idiom expressions in the text on the web and get more information.","PeriodicalId":212557,"journal":{"name":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125840140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Music Discovery as Differentiation Strategy for Streaming Providers
Andreas Raff, Andreas Mladenow, C. Strauss
{"title":"Music Discovery as Differentiation Strategy for Streaming Providers","authors":"Andreas Raff, Andreas Mladenow, C. Strauss","doi":"10.1145/3428757.3429151","DOIUrl":"https://doi.org/10.1145/3428757.3429151","url":null,"abstract":"Music discovery presents itself in an instant and in a multitude of possible ways. When comparing the user personas of streaming services in the dimension of music discovery, two main differentiation criteria become apparent, namely the degree of intention and the control one wants to exert when discovering new music. Against this background, this paper proposes a framework which categorises the possible ways of music discovery in a streaming provider with the help of those two criteria into active, semi-active, semi-passive and passive ways in order to analyse them separately, outline success factors and current research.","PeriodicalId":212557,"journal":{"name":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121190627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Rammed, or What RAM3S Taught Us
Ilaria Bartolini, M. Patella
{"title":"Rammed, or What RAM3S Taught Us","authors":"Ilaria Bartolini, M. Patella","doi":"10.1145/3428757.3429098","DOIUrl":"https://doi.org/10.1145/3428757.3429098","url":null,"abstract":"RAM3S (Real-time Analysis of Massive MultiMedia Streams) is a framework that acts as a middleware software layer between multimedia stream analysis techniques and Big Data streaming platforms, so as to facilitate the implementation of the former on top of the latter. Indeed, the use of Big Data platforms can give way to the efficient management and analysis of large data amounts, but they require the user to concentrate on issues related to distributed computing, since their services are often too raw. The use of RAM3S greatly simplifies deploying non-parallel techniques to platforms like Apache Storm or Apache Flink, a fact that is demonstrated by the four different use cases we describe here. We detail the lessons we learned from exploiting RAM3S to implement the detailed use cases.","PeriodicalId":212557,"journal":{"name":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125358993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Transfer Learning in Classifying Prescriptions and Keyword-based Medical Notes
Mir Moynuddin Ahmed Shibly, Tahmina Akter Tisha, K. Islam, Md. Mohsin Uddin
{"title":"Transfer Learning in Classifying Prescriptions and Keyword-based Medical Notes","authors":"Mir Moynuddin Ahmed Shibly, Tahmina Akter Tisha, K. Islam, Md. Mohsin Uddin","doi":"10.1145/3428757.3429139","DOIUrl":"https://doi.org/10.1145/3428757.3429139","url":null,"abstract":"Medical text classification is one of the primary steps of health care automation. Diagnosing disease at the right time, and going to the right doctor is important for patients. To do that, two types of medical texts were classified into some medical specialties in this study. The first one is the keywords-based medical notes and the second one is the prescriptions. There are many methods and techniques to classify texts from any domain. But, textual resources of a specific domain can be inadequate to build a sustainable and accurate classifier. This problem can be solved by incorporating transfer learning. The objective of this study is to analyze the prospects of transfer learning in medical text classification. To do that, a transfer learning system has been created for classification tasks by fine-tuning Bidirectional Encoder Representations from Transformers aka the BERT language model, and its performance has been compared with three deep learning models - multi-layer perceptron, long short-term memory, and convolutional neural network. The fine-tuned BERT model has shown the best performance among all the other models in both classification tasks. It has 0.84 and 0.96 weighted f1-score in classifying medical notes and prescriptions respectively. 
This study has proved that transfer learning can be used in medical text classification, and significant improvement in performance can be achieved through it.","PeriodicalId":212557,"journal":{"name":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","volume":"4647 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122696031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
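The entry above reports weighted F1-scores of 0.84 and 0.96. For readers unfamiliar with the metric, here is a minimal sketch of the usual definition: per-class F1 averaged with weights proportional to each class's share of the true labels. The abstract does not say how the authors computed it, so this is an assumption about the standard formula, and the function name `weighted_f1` and the toy labels are illustrative only.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        score += (n / total) * f1  # weight by the class's share of samples
    return score


# Toy example with made-up specialty labels, not data from the paper.
y_true = ["cardiology", "cardiology", "neurology", "neurology"]
y_pred = ["cardiology", "neurology", "neurology", "neurology"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Weighting by support keeps frequent classes from being drowned out by rare ones, which matters when the classes (here, medical specialties) are imbalanced.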
Mitigating Effect of Dictionary Matching Errors in Distantly Supervised Named Entity Recognition
Koga Kobayashi, Kei Wakabayashi
{"title":"Mitigating Effect of Dictionary Matching Errors in Distantly Supervised Named Entity Recognition","authors":"Koga Kobayashi, Kei Wakabayashi","doi":"10.1145/3428757.3429142","DOIUrl":"https://doi.org/10.1145/3428757.3429142","url":null,"abstract":"Named entity recognition (NER) is a fundamental technique that brings basic semantic awareness to natural language processing applications and services. Since we need a large amount of training data to train a custom NER model, distant supervision that leverages named entity dictionaries is expected to be a promising approach to train NER models quickly. However, dictionary matching causes a considerable number of errors that deteriorates both precision and recall of the final NER models, and we need to mitigate its effect. In this study, we particularly aim at improving precision of NER models by accounting for dictionary matching errors. Experimental results show that the proposed method can achieve an improvement of precisions especially under poor dictionary performance conditions.","PeriodicalId":212557,"journal":{"name":"Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124776615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0