Giannis Christoforidis, Pavlos Kefalas, A. Papadopoulos, Y. Manolopoulos
{"title":"Recommendation of Points-of-Interest Using Graph Embeddings","authors":"Giannis Christoforidis, Pavlos Kefalas, A. Papadopoulos, Y. Manolopoulos","doi":"10.1109/DSAA.2018.00013","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00013","url":null,"abstract":"The rapid growth of Location-based Social Networks (LBSNs) has led to the generation of massive datasets which are collected at an exponential rate. The collected information may be used to facilitate users' needs with recommendations related to their past preferences. Many recommendation models have been introduced in the literature, which learn from the history of users and provide recommendations for Points-of-Interest. Unfortunately, most of them ignore the relation existing among the temporal properties, the spatial attributes and the periodicity of the check-ins. In this work, we present a novel methodology, named JLGE, that combines all aforementioned factors into one unified approach which facilitates POI recommendations. In particular, the model jointly learns the embeddings of six informational graphs, i.e., two unipartite (user-user and POI-POI) and four bipartite (user-location, user-time, location-user, and location-time), into the same latent space and personalizes the recommendations based on these embeddings. We have experimentally evaluated the accuracy of our model using two real-world datasets in terms of top-n POI recommendations. The performance evaluation results indicate a significant improvement in accuracy, in comparison to another state-of-the-art graph-based approach.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125883116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Olsen, K. Ramamurthy, Javier Ribera, Yuhao Chen, Addie M. Thompson, Ronny Luss, M. Tuinstra, N. Abe
{"title":"Detecting and Counting Panicles in Sorghum Images","authors":"P. Olsen, K. Ramamurthy, Javier Ribera, Yuhao Chen, Addie M. Thompson, Ronny Luss, M. Tuinstra, N. Abe","doi":"10.1109/DSAA.2018.00052","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00052","url":null,"abstract":"Phenotyping, the process of measuring plant traits, plays a central role in plant breeding. However, traditional approaches are labor-intensive, time-consuming, costly, and error-prone. Accurate, automated, high-throughput phenotyping can relieve a huge burden in the breeding pipeline. In this paper, we propose computer vision systems and approaches to annotate, detect, and count panicles (heads), a key phenotype, from aerial images of Sorghum crops. The annotation system allows the users to label panicles in Sorghum aerial images. This annotated data is used for learning by the panicle detection and counting algorithms. The proposed approaches were used with aerial imagery of 18 varieties of Sorghum crop collected on 6 different dates in the Midwestern United States. The detector has an AUC of over 0.98 and the counter has a mean absolute error of 2.66 without adapting to variety and 1.88 when using variety-specific information. Our approaches are being adopted into a high-throughput phenotyping pipeline for accelerating Sorghum breeding.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115521630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"[Copyright notice]","authors":"","doi":"10.1109/dsaa.2018.00003","DOIUrl":"https://doi.org/10.1109/dsaa.2018.00003","url":null,"abstract":"","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115527135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Mehta, Jithin Mathews, K. Suryamukhi, K. S. Kumar, C. Babu
{"title":"Predictive Modeling for Identifying Return Defaulters in Goods and Services Tax","authors":"P. Mehta, Jithin Mathews, K. Suryamukhi, K. S. Kumar, C. Babu","doi":"10.1109/DSAA.2018.00081","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00081","url":null,"abstract":"Tax evasion is an illegal practice where a person or a business entity intentionally avoids paying his/her true tax liability. Any business entity is required by law to file its tax return statements following a periodical schedule. Failing to file the tax return statement is one of the most rudimentary forms of tax evasion. The dealers committing tax evasion in such a way are called return defaulters. In this paper, we construct a logistic regression model that predicts with high accuracy whether a business entity is a potential return defaulter for the upcoming tax-filing period. To this end, we analyzed the effect of the amount of sales/purchases transactions among the business entities (dealers) and the mean absolute deviation (MAD) value of the first-digit Benford's law on sales transactions by a business entity. We developed this model for the commercial taxes department, government of Telangana, India.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114257615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"[Publisher's information]","authors":"","doi":"10.1109/dsaa.2018.00090","DOIUrl":"https://doi.org/10.1109/dsaa.2018.00090","url":null,"abstract":"","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114448719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Science Challenges in Computational Psychiatry and Psychiatric Research","authors":"D. Ståhl, D. Stamate","doi":"10.1109/DSAA.2018.00067","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00067","url":null,"abstract":"The special session \"Data Science is Computational Psychiatry and Psychiatric Research\" at the 5th IEEE International Conference in Data Science and Advanced Analytics in Turin, Italy 2018 presents papers specifically addressing psychiatric research. In this overview, we describe the challenges of psychiatric research and demonstrate how the presented papers approach some of the problems.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123466157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network-Based Models to Improve Credit Scoring Accuracy","authors":"Branka Hadji Misheva, Paolo Giudici, V. Pediroda","doi":"10.1109/DSAA.2018.00080","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00080","url":null,"abstract":"Technological advancements have prompted the emergence of peer-to-peer credit services which improve user experience and offer significant reductions in costs. These advantages may be offset by a higher credit risk, due to disintermediation and information asymmetries. We postulate that network-based information can be employed as a tool for reducing risks through an improved credit scoring model that increases the accuracy of default predictions. Our research assumption is proven by means of empirical analysis that shows how including network parameters in classical scoring algorithms, such as logistic regression and CART, does indeed improve predictive accuracy.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116215036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Portability of Aspect Based Sentiment Analysis: Thirty Minutes for a Proof of Concept","authors":"L. Dini, Paolo Curtoni, E. Melnikova","doi":"10.1109/DSAA.2018.00085","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00085","url":null,"abstract":"This paper describes a system for aspect based sentiment analysis based on the assumption that domain portability should be achieved with minimal manual configuration. The approach exploits the integration of dependency parsing, graph based extraction rules over dependency trees and distributional semantics techniques. Results are considered satisfying for a \"proof of concept\" demonstrator.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"4 Sect Study Dis Child 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124523822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Luciano Gervasoni, S. Fenet, Regis Perrier, P. Sturm
{"title":"Convolutional Neural Networks for Disaggregated Population Mapping Using Open Data","authors":"Luciano Gervasoni, S. Fenet, Regis Perrier, P. Sturm","doi":"10.1109/DSAA.2018.00076","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00076","url":null,"abstract":"High resolution population count data are vital for numerous applications such as urban planning, transportation model calibration, and population growth impact measurements, among others. In this work, we present and evaluate an end-to-end framework for computing disaggregated population mapping employing convolutional neural networks (CNNs). Using urban data extracted from the OpenStreetMap database, a set of urban features are generated which are used to guide population density estimates at a higher resolution. A population density grid at a 200 by 200 meter spatial resolution is estimated, using as input gridded population data of 1 by 1 kilometer. Our approach relies solely on open data with a wide geographical coverage, ensuring replicability and potential applicability to a great number of cities in the world. Fine-grained gridded population data is used for 15 French cities in order to train and validate our model. A stand-alone city is kept out for the validation procedure. The results demonstrate that the neural network approach using massive OpenStreetMap data outperforms other approaches proposed in related works.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124524986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selection of a Nominal Device Using Functional Data Analysis","authors":"Nevin Martin, T. Buchheit, Shahed Reza","doi":"10.1109/DSAA.2018.00049","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00049","url":null,"abstract":"Nominal behavior selection of an electronic device from a measured dataset is often difficult. Device characteristics are rarely monotonic and choosing the single device measurement which best represents the center of a distribution across all regions of operation is neither obvious nor easy to interpret. Often, a device modeler uses a degree of subjectivity when selecting nominal device behavior from a dataset of measurements on a group of devices. This paper proposes applying a functional data approach to estimate the mean and nominal device of an experimental dataset. This approach was applied to a dataset of electrical measurements on a set of commercially available Zener diodes and proved to more accurately represent the average device characteristics than a point-wise calculation of the mean. It also enabled an objective method for selecting a nominal device from a dataset of device measurements taken across the full operating region of the Zener diode.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133556797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}