2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA): Latest Papers

DNA: General Deterministic Network Adaptive Framework for Multi-Round Multi-Party Influence Maximization
Tzu-Hsin Yang, Hao-Shang Ma, Jen-Wei Huang
{"title":"DNA: General Deterministic Network Adaptive Framework for Multi-Round Multi-Party Influence Maximization","authors":"Tzu-Hsin Yang, Hao-Shang Ma, Jen-Wei Huang","doi":"10.1109/DSAA.2018.00038","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00038","url":null,"abstract":"The influence maximization problem has been considered a vital problem when companies provide similar products or services. Since there are limited resources, companies must determine a strategy to occupy as much market share as possible. In this paper, we propose a general Deterministic Network Adaptive (DNA) framework to solve the multi-round multi-party influence maximization problem. To obtain the most market share, using one single strategy to determine seed nodes is not sufficient in the long term. The reason is that the network status changes during the multi-round procedure. The strategies of selecting seed nodes in each round should depend on the current status of influence diffusion in the network. DNA framework leverages the concept of reinforcement learning to maximize the expected cumulative influence. In addition, the learning process is deterministic, so that it does not take time to explore the spaces that are less important. We further design a similarity function to measure the similarity between two networks. DNA framework can avoid redundant computation when the similar networks have been trained before. Moreover, we propose the method to make the policy decision to maximize the influence spread in coopetition scenario based on DNA framework. The proposed framework is evaluated with synthetic data and real-world data. From the experimental results, DNA framework outperforms the existing works in influence maximization problems. The coopetition policy which is generated by DNA has the best performance in most cases.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126293333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Using Data Analytics to Optimize Public Transportation on a College Campus
K. Zimmer, H. Kurban, Mark Jenne, Logan Keating, P. Maull, Mehmet M. Dalkilic
{"title":"Using Data Analytics to Optimize Public Transportation on a College Campus","authors":"K. Zimmer, H. Kurban, Mark Jenne, Logan Keating, P. Maull, Mehmet M. Dalkilic","doi":"10.1109/DSAA.2018.00059","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00059","url":null,"abstract":"Using a large volume of bus data in the form of GPS coordinates (over 100 million data points) and automated passenger count data (over 1 million data points) we have developed (1) a system of analysis and prediction of future public transportation demand (2) a new model that uses concepts specific to college campuses that maximizes passenger satisfaction. Using these concepts we improve service of a model college public transportation service and more specifically the Indiana University Campus Bus Service (IUCBS).","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132032386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Willingness to Share Emotion Information on Social Media: Influence of Personality and Social Context
Damien Dupré, G. McKeown, Nicole Andelic, Gawain Morrison
{"title":"Willingness to Share Emotion Information on Social Media: Influence of Personality and Social Context","authors":"Damien Dupré, G. McKeown, Nicole Andelic, Gawain Morrison","doi":"10.1109/DSAA.2018.00086","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00086","url":null,"abstract":"Sharing personal information is an important way of communicating on social media. Among the information possibly shared, new sensors and tools allow people to share emotion information via facial emotion recognition. This paper questions whether people are prepared to share personal information such as their own emotion on social media. In the current study we examined how factors such as felt emotion, motivation for sharing on social media as well as personality affected participants' willingness to share self-reported emotion or facial expression online. By carrying out a Generalized Linear Mixed Model analysis, this study found that participants' willingness to share self-reported emotion and facial expressions was influenced by their personality traits and the motivation for sharing their emotion information that they were given. From our results we can conclude that the estimated level of privacy for certain emotional information, such as facial expression, is influenced by the motivation for sharing the information online.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125608452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Coolabilities API
D. Nordfors, S. Dasgupta, Ganapathy Subramanian, V. R. Ferose, Chally Grundwag, Behrang Zandi
{"title":"Coolabilities API","authors":"D. Nordfors, S. Dasgupta, Ganapathy Subramanian, V. R. Ferose, Chally Grundwag, Behrang Zandi","doi":"10.1109/DSAA.2018.00063","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00063","url":null,"abstract":"Disabilities may co-occur with characteristic enhanced strengths (hereby Coolabilities). The Coolabilities API project is centered around an Open Platform available for any individual/business/institutions to seamlessly involve three important goals: 1. to share existing data and code of relevance to coolabilities 2. to create/share new data/code 3. to build collective intelligence around coolabilities. The core of Coolabilities API is the Coolability-Disability Correlation Database - \"CODIC\". This paper illustrates the CODIC, algorithms for job matching and developer's platform implemented around it.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"393 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115915769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Big Data-Driven Platform for Cross-Media Monitoring
L. Napalkova, Pablo Aragón, Juan Carlos Castro Robles
{"title":"Big Data-Driven Platform for Cross-Media Monitoring","authors":"L. Napalkova, Pablo Aragón, Juan Carlos Castro Robles","doi":"10.1109/DSAA.2018.00051","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00051","url":null,"abstract":"The abundance of online media content requires highly scalable architectures to allow cross-media monitoring. This paper presents an innovative big data-as-a-service platform for analysing large complex networks in order to enhance cross-media monitoring. In contrast to the existing media monitoring systems, the platform equips marketers with several distinctive features. First, while most of the systems perform quantitative exploratory analysis of social media, our platform applies graph analytics in order to reveal social interaction types, hidden patterns in the cross-media network and the information diffusion over time. Second, our platform integrates and implements distributed versions of graph analytics algorithms (Louvain, HITS and others) that can scale to a large volume of data. Third, the creation of cross-media graphs is triggered by user-defined queries that can be easily specified by marketers. Thus, end-users can build and analyse different graphs according to specific goals of the study. Finally, the platform allows reducing Hadoop cluster usage costs due to executing the graph mining algorithms on demand triggered by user-defined queries. Instead of running costly streaming processes that continuously listen for new queries, we implemented Spark-as-a-service approach via Apache Livy REST interface.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123960811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Scalable and Interpretable Predictive Models for Electronic Health Records
Amela Fejza, P. Genevès, Nabil Layaïda, J. Bosson
{"title":"Scalable and Interpretable Predictive Models for Electronic Health Records","authors":"Amela Fejza, P. Genevès, Nabil Layaïda, J. Bosson","doi":"10.1109/DSAA.2018.00045","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00045","url":null,"abstract":"Early identification of patients at risk of developing complications during their hospital stay is currently one of the most challenging issues in healthcare. Complications include hospital-acquired infections, admissions to intensive care units, and in-hospital mortality. Being able to accurately predict the patients' outcomes is a crucial prerequisite for tailoring the care that certain patients receive, if it is believed that they will do poorly without additional intervention. We consider the problem of complication risk prediction, such as inpatient mortality, from the electronic health records of the patients. We study the question of making predictions on the first day at the hospital, and of making updated mortality predictions day after day during the patient's stay. We develop distributed models that are scalable and interpretable. Key insights include analysing diagnoses known at admission and drugs served, which evolve during the hospital stay. We leverage a distributed architecture to learn interpretable models from training datasets of gigantic size. We test our analyses with more than one million patients from hundreds of hospitals, and report on the lessons learned from these experiments.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125004110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Latent Dirichlet Allocation in Discovering Goals in Patients Undergoing Bladder Cancer Surgery
T. Atkinson
{"title":"Latent Dirichlet Allocation in Discovering Goals in Patients Undergoing Bladder Cancer Surgery","authors":"T. Atkinson","doi":"10.1109/DSAA.2018.00069","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00069","url":null,"abstract":"As we begin to leverage Big Data in health care settings and particularly in assessing patient-reported outcomes, there is a need for novel analytics to address unique challenges. One such challenge is in coding transcribed interview data, typically free-text entries of statements made by interviewees during face-to-face interviews. Conventional coding of such qualitative data into themes is labor-intensive and prone to inconsistencies. Latent Dirichlet Allocation (LDA) may offer statistical rigor in summarizing patients' concerns and coping strategies in a life-threatening illness. We aim to apply LDA to interview data collected as part of a prospective, longitudinal study of QOL in patients undergoing radical cystectomy and urinary diversion for bladder cancer. LDA showed that, prior to surgery, patients' priorities were primarily in cancer surgery and recovery. Six months after the surgery, however, their goals shifted to a desire to spend more time with family, resume work, and enjoy life to its fullest extent. Novel analytics such as LDA offer the possibility of summarizing personal goals in real time without the need for conventional fixed-length measures and qualitative data coding.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124588436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Predicting Worker Disagreement for More Effective Crowd Labeling
Stefan Räbiger, Gizem Gezici, Y. Saygin, M. Spiliopoulou
{"title":"Predicting Worker Disagreement for More Effective Crowd Labeling","authors":"Stefan Räbiger, Gizem Gezici, Y. Saygin, M. Spiliopoulou","doi":"10.1109/DSAA.2018.00028","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00028","url":null,"abstract":"Crowdsourcing is a popular mechanism used for labeling tasks to produce large corpora for training. However, producing a reliable crowd labeled training corpus is challenging and resource consuming. Research on crowdsourcing has shown that label quality is much affected by worker engagement and expertise. In this study, we postulate that label quality can also be affected by inherent ambiguity of the documents to be labeled. Such ambiguities are not known in advance, of course, but, once encountered by the workers, they lead to disagreement in the labeling – a disagreement that cannot be resolved by employing more workers. To deal with this problem, we propose a crowd labeling framework: we train a disagreement predictor on a small seed of documents, and then use this predictor to decide which documents of the complete corpus should be labeled and which should be checked for document-inherent ambiguities before assigning (and potentially wasting) worker effort on them. We report on the findings of the experiments we conducted on crowdsourcing a Twitter corpus for sentiment classification.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121165501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Parallel Continuous Outlier Mining in Streaming Data
Theodoros Toliopoulos, A. Gounaris, K. Tsichlas, A. Papadopoulos, Sandra Sampaio
{"title":"Parallel Continuous Outlier Mining in Streaming Data","authors":"Theodoros Toliopoulos, A. Gounaris, K. Tsichlas, A. Papadopoulos, Sandra Sampaio","doi":"10.1109/DSAA.2018.00033","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00033","url":null,"abstract":"In this work, we focus on distance-based outliers in a metric space, where the status of an entity as to whether it is an outlier is based on the number of other entities in its neighborhood. In recent years, several solutions have tackled the problem of distance-based outliers in data streams, where outliers must be mined continuously as new elements become available. An interesting research problem is to combine the streaming environment with massively parallel systems to provide scalable stream-based algorithms. However, none of the previously proposed techniques refer to a massively parallel setting. Our proposal fills this gap and studies transferring state-of-the-art techniques in Apache Flink, a modern platform for intensive streaming analytics. We thoroughly present the technical challenges encountered and the alternatives that may be applied. We show speed-ups up to 117 (resp. 2076) times over a naive parallel (resp. non-parallel) solution in Flink, by using just an ordinary 4-core machine and a real-world dataset. Our results demonstrate that outlier mining can be achieved in an efficient and scalable manner. The resulting techniques have been made publicly available in open-source","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116406530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
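The distance-based outlier definition in the abstract above (an entity is an outlier when too few other entities fall within its neighborhood) can be illustrated with a minimal sketch. This is a naive, non-parallel, quadratic-time check over a single window of points, not the Flink-based technique of the paper; the function name and the `radius` and `min_neighbors` parameters are hypothetical names chosen for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def distance_based_outliers(points, radius, min_neighbors):
    """Return indices of points that have fewer than `min_neighbors`
    other points within `radius` (naive O(n^2) check, illustration only)."""
    outliers = []
    for i, p in enumerate(points):
        # Count the other points that lie inside the neighborhood of p.
        neighbors = sum(
            1 for j, q in enumerate(points)
            if j != i and dist(p, q) <= radius
        )
        if neighbors < min_neighbors:
            outliers.append(i)
    return outliers


# Hypothetical window contents: three nearby points and one far away.
window = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
print(distance_based_outliers(window, radius=0.5, min_neighbors=2))  # prints [3]
```

In a streaming setting the window contents change as elements arrive and expire, so a check like this would have to be repeated or maintained incrementally, which is the cost that continuous outlier mining algorithms aim to reduce.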
SeCredISData 2018: Special Session on Sentiment, Emotion, and Credibility of Information in Social Data
F. Benamara, C. Bosco, E. Fersini, G. Pasi, V. Patti, Marco Viviani
{"title":"SeCredISData 2018: Special Session on Sentiment, Emotion, and Credibility of Information in Social Data","authors":"F. Benamara, C. Bosco, E. Fersini, G. Pasi, V. Patti, Marco Viviani","doi":"10.1109/DSAA.2018.00082","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00082","url":null,"abstract":"The Social Web represents nowadays the principal means to support and foster social interactions among people through Web 2.0 technologies. Individuals interact in virtual communities to pursue mutual interests or goals, by exchanging multiple kinds of contents (i.e., textual, acoustic, visual), the so-called User-Generated Content (UGC). In this context, the SeCredISData Special Session is especially devoted to discussing the implications that the analysis of big social data has in tackling open issues related to society from different perspectives. On one side, there is the need to push forward the research on emotion and sentiment, and the investigation of affective cognitive models and their possible integration into intelligent systems. On the other side, it is urgent to address the issue of on-line information credibility assessment, in an era where trusted intermediaries have disappeared and people must rely only on their cognitive capacities to judge information. The Special Session is therefore aimed at promoting the development of models and applications able to tackle these issues.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126766370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0