Journal of Information and Data Management: Latest Articles

Using Musical and Statistical Analysis of the Predominant Melody of the Voice to Create datasets from a Database of Popular Brazilian Hit Songs
Journal of Information and Data Management Pub Date : 2022-08-15 DOI: 10.5753/jidm.2022.2336
André A. Bertoni, Rodrigo P. Lemos
Abstract: This work deals with the creation and optimization of a large set of features extracted from a database of 882 popular Brazilian hit songs and non-hit songs, from 2014 to May 2019. From this database of songs, we created four datasets of musical features. The first comprises 3215 statistical features, while the second, third and fourth are completely new, as they were formed from the predominant melody of the voice, and no similar databases were previously available for study. The second dataset represents the graph of the time-frequency spectrogram of the singer's voice during the first 90 seconds of each song. The third dataset results from a statistical analysis carried out on the predominant melody of the voice. The fourth is the most peculiar of all, as it results from the musical semantic analysis of the predominant melody of the voice, which allowed the construction of a table with the most frequent melodic sequences of each song. Our datasets use only Brazilian songs and focus on a limited, contemporary period. The idea behind these datasets is to encourage the study of Machine Learning techniques that require musical information. The extracted features can help develop new studies in Music and Computer Science in the future.
Citations: 0
Essay-BR: a Brazilian Corpus to Automatic Essay Scoring Task
Journal of Information and Data Management Pub Date : 2022-08-15 DOI: 10.5753/jidm.2022.2340
Jeziel C. Marinho, Rafael T. Anchiêta, Raimundo S. Moura
Abstract: Automatic Essay Scoring (AES) is the computer technology that evaluates and scores written essays, aiming to provide computational models to grade essays automatically or with minimal human involvement. While there are several AES studies in a variety of languages, few of them focus on the Portuguese language. The main reason is the lack of a corpus with manually graded essays. To bridge this gap, in this paper we extended a corpus of essays written by Brazilian high school students on an online platform. All of the essays are argumentative and were scored across five competences by experts. Moreover, we conducted an experiment with the extended corpus to show some challenges posed by the Portuguese language. The corpus is publicly available at https://github.com/lplnufpi/essay-br.
Citations: 2
The Impact of Privacy Regulations on DB Systems
Journal of Information and Data Management Pub Date : 2021-11-19 DOI: 10.5753/jidm.2021.1958
Javam C. Machado, Paulo R. P. Amora
Abstract: Personal data usage and collection are activities that used to grow unrestricted. However, several laws in the physical world ensure people's rights regarding their privacy and information usage. In recent years, legislators have passed many laws, regulations, and acts to extend these rights to the digital world. As a result, new constraints, rights, and duties apply to every component of the data usage and collection workflow. In this paper, we discuss the implications of this legislation, identify the impacts these regulations have on current DBMSs, and survey recent works that aim to solve the problems raised by these impacts, highlighting research opportunities and identifying how solutions can be achieved.
Citations: 0
Private Reverse Top-k Algorithms Applied on Public Data of COVID-19 in the State of Ceará
Journal of Information and Data Management Pub Date : 2021-11-19 DOI: 10.5753/jidm.2021.1941
Mariana M. Silva, Iago C. Chaves, Javam C. Machado
Abstract: In this article we propose a differentially private reverse top-k query. Our strategy allows obtaining the less frequent data according to a search criterion, with a high guarantee of privacy for the individuals who contributed personal data to the original database. We apply our strategy to public data on COVID-19 in the State of Ceará using two different queries. Our experimental results show that the proposed top-k query returns results highly similar to those of a conventional top-k query when the chosen budget is suitable, providing useful results for researchers while ensuring a low probability of re-identification of individuals, which follows from the properties of differential privacy.
Citations: 1
Privacy-preserving of patients with Differential Privacy: an experimental evaluation in COVID-19 dataset
Journal of Information and Data Management Pub Date : 2021-11-19 DOI: 10.5753/jidm.2021.1947
Manuel E. B. Filho, Eduardo R. Duarte Neto, Javam C. Machado
Abstract: The pandemic of the new coronavirus (COVID-19) has brought new challenges to health systems in almost every corner of the world, many of them overburdened. Data analysis has supported the fight against the coronavirus. Through this analysis, government authorities, together with health care providers, adopted effective strategies. Yet those strategies cannot neglect privacy concerns. Privacy is a right of every citizen. Privacy techniques allow the analysis of health data without exposing individuals' private information. However, a balance between data privacy and utility is essential for a good analysis of the data. This work demonstrates that it is possible to guarantee the privacy of infected patients while maintaining the utility of the data and allowing a sound analysis of it, by visualizing the application of differentially private mechanisms to queries over the data of patients tested in the State of Ceará, Brazil.
Citations: 0
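For readers unfamiliar with differentially private mechanisms applied to queries, as mentioned in the entry above, here is a minimal sketch of one standard mechanism. It assumes the Laplace mechanism on a counting query; the abstract does not say which mechanisms the authors evaluated, and the function names, record fields, and epsilon value are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    A counting query changes by at most 1 when one individual is added
    or removed (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical patient records, for illustration only.
patients = [
    {"city": "Fortaleza", "positive": True},
    {"city": "Fortaleza", "positive": False},
    {"city": "Sobral", "positive": True},
    {"city": "Sobral", "positive": True},
]

noisy = private_count(patients, lambda p: p["positive"], epsilon=1.0)
print(f"noisy positive count: {noisy:.2f}")
```

A smaller epsilon gives stronger privacy but a noisier answer, which is the privacy and utility balance the abstract describes.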
Parallel Processing of Remote Sensing Time Series Applied to Land-Use and Land-Cover Classification
Journal of Information and Data Management Pub Date : 2021-10-28 DOI: 10.5753/jidm.2021.1785
Roberto U. Paiva, Sávio S. T. Oliveira, Luiz M. L. Pascoal, Leandro L. Parente, Wellington S. Martins
Abstract: The increase in satellite launches into Earth's orbit in recent years has generated a huge amount of remote sensing data. These data, in the form of time series, have been used in automated classification approaches, generating land-use and land-cover (LULC) products for different landscapes around the world. Dynamic Time Warping (DTW) is a well-known computational method used to measure the similarity between time series. It has been used in many algorithms for remote sensing time series analysis. These DTW-based algorithms are capable of generating similarity measures between time series and patterns. These measures can be used as meta-features to increase the accuracy of classification models. However, DTW-based algorithms require a lot of computational resources and have a high execution time, which makes them difficult to use on large volumes of data. This article presents a parallel and fully scalable solution to optimize the construction of meta-features from remote sensing time series (RSTS). In addition, results of applying the generated meta-features to the training and evaluation of classification models using Random Forest are presented. The results show that the proposed approaches have led to improvements in execution time and accuracy when compared to traditional strategies.
Citations: 0
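The entry above relies on Dynamic Time Warping (DTW) as a similarity measure between time series. As a rough illustration, the sketch below implements the textbook sequential dynamic-programming DTW with an absolute-difference cost. It is not the parallel solution the article proposes, and the function name and sample values are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also matches the previous b element's partner
                cost[i][j - 1],      # b[j-1] also matches the previous a element's partner
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Hypothetical vegetation-index series for one pixel and one class pattern.
pixel_series = [0.2, 0.4, 0.8, 0.8, 0.5, 0.3]
class_pattern = [0.2, 0.4, 0.8, 0.5, 0.3]

print(dtw_distance(pixel_series, class_pattern))
```

Computing one such distance per reference pattern yields a vector that could serve as meta-features for a classifier. The quadratic cost per pair is what makes the approach expensive on large volumes of data.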
Exploring Geostatistical Modeling and Visualization Techniques of Uncertainties for Categorical Spatial Data
Journal of Information and Data Management Pub Date : 2021-10-28 DOI: 10.5753/jidm.2021.1786
Carlos A. Felgueiras, Jussara O. Ortiz, Eduardo C. G. Camargo, Laércio M. Namikawa, Thales S. Körting
Abstract: This article presents and analyzes indicator geostatistical modeling and some visualization techniques for uncertainty models of categorical spatial attributes. A set of sample points of some categorical attribute is used as input information. The indicator approach requires transforming the sample points into fields of indicator samples according to the classes of interest. Experimental and theoretical semivariograms of the indicator fields are defined, representing the spatial variation of the indicator information. The indicator fields, along with their semivariograms, are used to determine the uncertainty model, that is, the conditioned probability distribution function of the attribute at any location inside the geographic region delimited by the samples. The probability functions are used to produce prediction and probability maps based on the maximum class probability criterion. These maps can be visualized using different techniques. This work considers individual visualization of the predicted and probability maps as well as a combination of them. The predicted maps can also be visualized with or without constraints related to the uncertainty probabilities. The combined visualizations are based on three-dimensional (3D) planar projection and on the Red-Green-Blue to Intensity-Hue-Saturation (RGB-IHS) fusion transformation techniques. The methodology of this article is illustrated by a case study with real data, a sample set of soil textures observed in an experimental farm located in the region of São Carlos city in São Paulo State, Brazil. The resulting maps of this case study are presented, and the advantages and drawbacks of the visualization options are analyzed and discussed.
Citations: 0
J-EDA: A workbench for tuning similarity and diversity search parameters in content-based image retrieval
Journal of Information and Data Management Pub Date : 2021-09-10 DOI: 10.5753/jidm.2021.1990
João V. O. Novaes, Lúcio F. D. Santos, Luiz Olmes Carvalho, Daniel de Oliveira, Marcos V. N. Bedo, Agma J. M. Traina, Caetano Traina Jr.
Abstract: Similarity searches can be modeled by means of distances following the Metric Spaces Theory and constitute a fast and explainable query mechanism behind content-based image retrieval (CBIR) tasks. However, classical distance-based queries, e.g., Range and k-Nearest Neighbors, may be unsuitable for exploring large datasets because the retrieved elements are often similar among themselves. Although similarity searching is enriched with the imposition of rules to foster result diversification, the fine-tuning of the diversity query is still an open issue, which is usually carried out through a non-optimal and computationally expensive inspection. This paper introduces J-EDA, a practical workbench implemented in Java that supports the tuning of similarity and diversity search parameters by enabling the automatic and parallel exploration of multiple search settings regarding a user-posed content-based image retrieval task. J-EDA implements a wide variety of classical and diversity-driven search queries, as well as many CBIR settings such as feature extractors for images, distance functions, and relevance feedback techniques. Accordingly, users can define multiple query settings and inspect their performances to spot the most suitable parameterization for the content-based image retrieval problem at hand. The workbench reports the experimental performances with several internal and external evaluation metrics such as P × R and Mean Average Precision (mAP), which are calculated through either incremental or batch procedures performed with or without human interaction.
Citations: 0
Analysing Spatio-Temporal Voting Patterns in Brazilian Elections Through a Simple Data Science Pipeline
Journal of Information and Data Management Pub Date : 2021-08-05 DOI: 10.5753/jidm.2021.1932
L. H. M. Jacintho, T. P. da Silva, A. R. S. Parmezan, G. E. A. P. A. Batista
Abstract: Since 1989, the first year of the democratic presidential election after a long period of a dictatorship regime, Brazil has conducted eight presidential elections. Short and long-term shifts of power and two impeachment processes marked this period. This instability is a research case in electoral studies, mainly regarding the understanding of citizens' voting behavior. Comprehending patterns in the population's behavior can give us insight into phenomena and processes that affect democratic political decisions. In light of this, our paper analyses Brazilian electoral data at the municipal level from 1998 to 2018 using a simple data science pipeline, which consists of five steps: (i) data selection; (ii) data preprocessing; (iii) identification of spatial patterns, in which we seek to understand the role of space in the election results employing spatial auto-correlation techniques; (iv) identification of temporal patterns, where we investigate similar trends of votes over the years applying a hierarchical clustering method; and (v) evaluation of results. We study the presidential elections focusing on the right and left-wing parties most relevant for the period: the Brazilian Social Democracy Party (PSDB) and the Workers' Party (PT). We also analyse the congressman election data regarding parties ideologically to the right and left in the political spectrum. Through the obtained results, we found spatial dependence in every electoral year investigated. Moreover, despite the changes in the political-economic context over the years, neighboring cities seem to present similar voting behavior trends.
Citations: 0
Weakly Supervised Learning Algorithm to Eliminate Irrelevant Association Rules in Large Knowledge Bases
Journal of Information and Data Management Pub Date : 2021-02-14 DOI: 10.5753/jidm.2020.2025
Bruno B. Cifarelli, Rafael G. L. Miani
Abstract: The construction and population of large knowledge bases have been widely explored in the past few years. Many techniques were developed to accomplish this purpose. Association rule mining algorithms can also be used to help populate these knowledge bases. Nevertheless, analyzing the large number of association rules generated can be a challenging and time-consuming task. The technique described in this article aims to eliminate irrelevant association rules in order to facilitate the rule evaluation process. To achieve that, this article presents a weakly supervised learning technique to prune irrelevant association rules. The proposed method uses irrelevant rules already discovered in past iterations and prunes those with the same pattern. Experiments showed that the new technique can reduce the number of rules by about 60%, decreasing the effort required to evaluate them.
Citations: 1