Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics: Latest Papers

SciPuRe
Martin Lentschat, Patrice Buche, Juliette Dibie-Barthélemy, Mathieu Roche
{"title":"SciPuRe","authors":"Martin Lentschat, Patrice Buche, Juliette Dibie-Barthélemy, Mathieu Roche","doi":"10.1145/3405962.3405978","DOIUrl":"https://doi.org/10.1145/3405962.3405978","url":null,"abstract":"Retrieving entities associated with experimental data in the textual content of scientific documents faces a number of challenges. One of them is the assessment of the extracted entities for further processing, especially the identification of false positives. We present in this paper SciPuRe (Scientific Publication Representation): a new representation of entities. The extraction process presented in this paper is driven by an Ontological and Terminological Resource (OTR). It is applied to the extraction of entities associated with food packaging permeabilities, that can be symbolic (e.g. the Packaging \"low density polyethylene\") or quantitative (e.g. the Temperature \"25\", \"°C\" or the H20_Permeability \"4.34 * 10-3\", \"cm3 μm-2 d-1 kPa\"). A representation of each entity, composed of a set of features, is built during the extraction process. These features can be gathered in three categories: Ontological, Lexical and Structural. The features of SciPuRe are used to compute Relevance scores that consider the different information available for each entity extracted. 
Such Relevance scores inform the usefulness of SciPuRe and can then be used to rank the extraction results and discard false positives.","PeriodicalId":247414,"journal":{"name":"Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics","volume":"203 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114980772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
A Novel Dataset for Fake Android Anti-Malware Detection
Saeed Seraj, Michalis Pavlidis, Nikolaos Polatidis
{"title":"A Novel Dataset for Fake Android Anti-Malware Detection","authors":"Saeed Seraj, Michalis Pavlidis, Nikolaos Polatidis","doi":"10.1145/3405962.3405980","DOIUrl":"https://doi.org/10.1145/3405962.3405980","url":null,"abstract":"Today, people can get all types of Android applications (apps) from the app store or from various sources over the Internet. A large number of apps are produced daily, some of which are infected with malware. Thus, the use of anti-malware identification tools is essential. At the same time, a number of attackers who exploit anti-malware products have been obtaining information from mobile phones in various ways, such as decompiling or infecting anti-malware. Therefore, in this paper, we developed a classification dataset from collected anti-malware data looking for fraudulent anti-malware products. Additionally, we applied various machine learning algorithms and we propose a combination of algorithms which provides high accuracy over various evaluation tests, showing that our approach is both practical and effective.","PeriodicalId":247414,"journal":{"name":"Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122514870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Splitting the Web Analytics Atom: From Page Metrics and KPIs to Sub-Page Metrics and KPIs
Ilan Kirsh, M. Joy
{"title":"Splitting the Web Analytics Atom: From Page Metrics and KPIs to Sub-Page Metrics and KPIs","authors":"Ilan Kirsh, M. Joy","doi":"10.1145/3405962.3405984","DOIUrl":"https://doi.org/10.1145/3405962.3405984","url":null,"abstract":"Web analytics Key Performance Indicators (KPIs) are important metrics used to evaluate websites and web pages against objectives. The power of KPIs is in their simplicity. Every web page can be assessed by numeric KPI values, which can be easily calculated, compared, and tracked over time. KPIs highlight the strengths and weaknesses of individual web pages and significantly help in maintaining, improving, and optimizing websites. Current web analytics metrics and KPIs, in academic studies as well as in commercial tools, relate to entire websites and web pages. This paper advocates extending KPIs use to sub-page elements, such as paragraphs, as an effective way to refine knowledge and leverage web analytics capabilities. We discuss the potential and challenges of sub-page web analytics and define a framework for calculating sub-page metrics from accumulated in-page user activity data, such as mouse and keyboard events. Then we propose potential KPIs that may be effective in highlighting the strengths and weaknesses of individual page parts, such as paragraphs. We use web usage data from a sample website to demonstrate these ideas. This study is the first step towards sub-page web analytics metrics and KPIs. 
Further work is required in order to gain more knowledge about potential KPIs that are introduced in this work, as well as to explore new methods, metrics, and KPIs.","PeriodicalId":247414,"journal":{"name":"Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics","volume":"52 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114127963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Semi-automatic extraction and validation of concepts in ontology learning from texts in Spanish
Manuela Gómez-Suta, J. Echeverry-Correa, José A. Soto Mejía
{"title":"Semi-automatic extraction and validation of concepts in ontology learning from texts in Spanish","authors":"Manuela Gómez-Suta, J. Echeverry-Correa, José A. Soto Mejía","doi":"10.1145/3405962.3405977","DOIUrl":"https://doi.org/10.1145/3405962.3405977","url":null,"abstract":"The construction of ontologies from texts in Spanish is a challenge since this language lacks conceptual databases to validate abstract ontology structures as concepts and relations between them. The preceding generates the necessity of using manual evaluation by human experts; carrying high expenses that limit the calibration of algorithm parameters and large-scale evaluations. This document presents a proposal to evaluate abstract ontology structures through the task of semantic clustering of documents, without the expensive necessity of using manual evaluation or conceptual databases. The proposal is not only affordable but also applicable to model data and domains that lack structured knowledge resources. The experiments lead to the extraction and validation of the ontology structures from texts in Spanish regarding the domain of the Colombian armed conflict.","PeriodicalId":247414,"journal":{"name":"Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130479668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
The ICARUS Ontology: A general aviation ontology developed using a multi-layer approach
Dimosthenis Stefanidis, Chrysovalantis Christodoulou, M. Symeonidis, G. Pallis, M. Dikaiakos, Loukas Pouis, Kalia Orphanou, Fenareti Lampathaki, D. Alexandrou
{"title":"The ICARUS Ontology: A general aviation ontology developed using a multi-layer approach","authors":"Dimosthenis Stefanidis, Chrysovalantis Christodoulou, M. Symeonidis, G. Pallis, M. Dikaiakos, Loukas Pouis, Kalia Orphanou, Fenareti Lampathaki, D. Alexandrou","doi":"10.1145/3405962.3405983","DOIUrl":"https://doi.org/10.1145/3405962.3405983","url":null,"abstract":"The management of aviation data is a great challenge in the aviation industry, as they are complex and can be derived from heterogeneous data sources. To handle this challenge, ontologies can be applied to facilitate the modelling of the data across multiple data sources. This paper presents an aviation domain ontology, the ICARUS ontology, which aims at facilitating the semantic description and integration of information resources that represent the various assets of the ICARUS platform and their use. To present the functionality and usability of the proposed ontology, we present the results of querying the ontology using SPARQL queries through three use case scenarios. 
As shown from the evaluation, the ICARUS ontology enables the integration and reasoning over multiple sources of heterogeneous aviation-related data, the semantic description of metadata produced by ICARUS, and their storage in a knowledge-base which is dynamically updated and provides access to its contents via SPARQL queries.","PeriodicalId":247414,"journal":{"name":"Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132618398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Optimizing Business Process Designs with a Multiple Population Genetic Algorithm
Nadir Mahammed, S. Bennabi, Mahmoud Fahsi
{"title":"Optimizing Business Process Designs with a Multiple Population Genetic Algorithm","authors":"Nadir Mahammed, S. Bennabi, Mahmoud Fahsi","doi":"10.1145/3405962.3405971","DOIUrl":"https://doi.org/10.1145/3405962.3405971","url":null,"abstract":"This article discusses a multi-objective business process optimization. The authors present an approach for an evolutionary combinatorial multi-objective optimization of business process designs with a specified genetic algorithm based on multiple populations. The results show that the optimization approach is capable of producing a satisfactory number of optimized designs alternatives.","PeriodicalId":247414,"journal":{"name":"Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics","volume":"405 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123537019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Topic Modeling of Short Texts Using Anchor Words
Florian Steuber, Mirco Schönfeld, G. Rodosek
{"title":"Topic Modeling of Short Texts Using Anchor Words","authors":"Florian Steuber, Mirco Schönfeld, G. Rodosek","doi":"10.1145/3405962.3405968","DOIUrl":"https://doi.org/10.1145/3405962.3405968","url":null,"abstract":"We present Archetypal LDA or short A-LDA, a topic model tailored to short texts containing \"semantic anchors\" which convey a certain meaning or implicitly build on discussions beyond their mere presence. A-LDA is an extension to Latent Dirichlet Allocation in that we guide the process of topic inference by these semantic anchors as seed words to the LDA. We identify these seed words unsupervised from the documents and evaluate their co-occurrences using archetypal analysis, a geometric approximation problem that aims for finding k points that best approximate the data set's convex hull. These so called archetypes are considered as latent topics and used to guide the LDA. We demonstrate the effectiveness of our approach using Twitter, where semantic anchor words are the hashtags assigned to tweets by users. In direct comparison to LDA, A-LDA achieves 10-13% better results. We find that representing topics in terms of hashtags corresponding to calculated archetypes alone already results in interpretable topics and the model's performance peaks for seed confidence values ranging from 0.7 to 0.9.","PeriodicalId":247414,"journal":{"name":"Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130718802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Adaptive Error Prediction for Production Lines with Unknown Dependencies
S. Soller, M. Kranz, Gerold Hölzl
{"title":"Adaptive Error Prediction for Production Lines with Unknown Dependencies","authors":"S. Soller, M. Kranz, Gerold Hölzl","doi":"10.1145/3405962.3405994","DOIUrl":"https://doi.org/10.1145/3405962.3405994","url":null,"abstract":"Forecasting or predicting errors can dramatically reduce the downtime of machines in industrial settings and even allow to take counteractions long before the error affects the production system. A forecast system to predict upcoming critical values for identical production lines under different environmental circumstances is proposed. We focus on errors that result in multiple erroneous work pieces. These error patterns need manual corrections by a machine controller. An analysis of the system observed gathered the information about the types of errors that are observable. 30% of errors are measurement errors or single faulty work-pieces which are not influenced by previous work-pieces and do not show any indication to preceding work-pieces. These errors do not need any type of action by the machine controller. 70% of the observed errors are continuous system deviations which lead to multiple erroneous work-pieces in order or a high percentage of erroneous work-pieces in an observed time frame. We observe multiple production lines which consist of identical machines and produce the same product type. For the forecast of errors, we use the ARIMA, Holt and Holt-Winter method. Each production line and product type combination showed different results for the different forecast methods. We implemented a dynamic system that automatically detects the seasonality and trend of the specific combination to assign a correct forecast method and model. For 40 combinations of production line and product type the holt-winter algorithm performed best for 14, the holt-winter without seasonal or trend component performed best for 13 combinations and the holt-winter with only a trend component performed best for 10 setups. 
3 combinations did not have a distinct best method for all observed results. By selecting the correct forecast methods, we were able to boost the forecast accuracy for the overall system over each single forecast method.","PeriodicalId":247414,"journal":{"name":"Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116736624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Supporting Music Pattern Retrieval and Analysis: An Ontology-Based Approach
C. Achkar, T. Atéchian
{"title":"Supporting Music Pattern Retrieval and Analysis: An Ontology-Based Approach","authors":"C. Achkar, T. Atéchian","doi":"10.1145/3405962.3405973","DOIUrl":"https://doi.org/10.1145/3405962.3405973","url":null,"abstract":"Analyzing music notations is found useful for musicology purposes. This can be applied by retrieving semantic information from digitally annotated music scores. In this paper, we propose an ontology that structures the knowledge extraction process of a music pattern analysis algorithm. In addition to mandatory elements that describe music scores, the proposed ontology relies on contextual elements and attributes for pattern analysis. The ontology then supports the semantic information retrieval and analysis processes of music score contents. We illustrate the whole mechanism by explaining the workflow of the ontology integrated inside a music encoding platform for eastern music.","PeriodicalId":247414,"journal":{"name":"Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125642396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Using schema.org Annotations for Training and Maintaining Product Matchers
R. Peeters, Anna Primpeli, Benedikt Wichtlhuber, Christian Bizer
{"title":"Using schema.org Annotations for Training and Maintaining Product Matchers","authors":"R. Peeters, Anna Primpeli, Benedikt Wichtlhuber, Christian Bizer","doi":"10.1145/3405962.3405964","DOIUrl":"https://doi.org/10.1145/3405962.3405964","url":null,"abstract":"Product matching is a central task within e-commerce applications such as price comparison portals and online market places. State-of-the-art product matching methods achieve F1 scores above 0.90 using deep learning techniques combined with huge amounts of training data (e.g > 100K pairs of offers). Gathering and maintaining such large training corpora is costly, as it implies labeling pairs of offers as matches or non-matches. Acquiring the ability to be good at product matching thus means a major investment for an e-commerce company. This paper shows that the manual labeling of training data for product matching can be replaced by relying exclusively on schema.org annotations gathered from the public Web. We show that using only schema.org data for training, we are able to achieve F1 scores between 0.92 and 0.95 depending on the product category. As new products appear everyday, it is important that matching models can be maintained with justifiable effort. In order to give practical advice on how to maintain matching models, we compare the performance of deep learning and traditional matching models on unseen products and experiment with different fine-tuning and re-training strategies for model maintenance, again using only schema.org annotations as training data. 
Finally, as using the public Web as distant supervision carries inherent noise, we evaluate deep learning and traditional matching models with regards to their label-noise resistance and show that deep learning is able to deal with the amounts of identifier-noise found in schema.org annotations.","PeriodicalId":247414,"journal":{"name":"Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133759918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
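The product-matching abstract above reports results as F1 scores over pairs of offers labeled as matches or non-matches. As a reading aid, here is a minimal sketch of how such a score is computed from binary labels. The function name `f1_score` and the six example labels are hypothetical illustrations, not data or code from the paper.

```python
def f1_score(predicted, actual):
    """Compute F1 for binary match / non-match labels (True = match)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        # No correct matches: treat F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for six offer pairs (illustrative only).
predicted = [True, True, False, True, False, False]
actual = [True, False, False, True, True, False]
score = f1_score(predicted, actual)
print(f"F1 = {score:.3f}")
```

Because F1 ignores true negatives, it suits matching tasks where non-matching pairs greatly outnumber matching ones.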