{"title":"Towards a minimum description length based stopping criterion for semi-supervised time series classification","authors":"Nurjahan Begum, Bing Hu, T. Rakthanmanon, Eamonn J. Keogh","doi":"10.1109/IRI.2013.6642490","DOIUrl":"https://doi.org/10.1109/IRI.2013.6642490","url":null,"abstract":"In the last decade the plunging costs of sensors/storage have made it possible to obtain vast amounts of medical telemetry. However for this data to be useful, it must be annotated. This annotation, requiring the attention of medical experts is very expensive and time consuming, and remains the critical bottleneck in medical analysis. Semi-supervised learning is an obvious way to mitigate the need for human labor, however, most such algorithms are designed for intrinsically discrete objects, and do not work well in this domain, which requires the ability to deal with real-valued objects arriving in a streaming fashion. In this work we make two contributions. First, we demonstrate that in many cases just a handful of human annotated examples are sufficient to perform accurate classification. Second, we devise a novel parameter-free stopping criterion for semi-supervised learning. We evaluate our work with a comprehensive set of experiments on diverse medical data sources including electrocardiograms. Our experimental results show that our approach can construct accurate classifiers even if given only a single annotated instance.","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122203554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cloud-based tasking, collection, processing, exploitation, and dissemination","authors":"S. Rubin, Gordon K. Lee","doi":"10.1007/978-3-319-04717-1_1","DOIUrl":"https://doi.org/10.1007/978-3-319-04717-1_1","url":null,"abstract":"","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129675066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feature-to-code traceability in a collection of software variants: Combining formal concept analysis and information retrieval","authors":"H. E. Salman, A. Seriai, C. Dony","doi":"10.1109/IRI.2013.6642474","DOIUrl":"https://doi.org/10.1109/IRI.2013.6642474","url":null,"abstract":"Today, developing new software variants to meet new customer demands by ad-hoc copying of existing variants of a software system is a frequent phenomenon in the software industry. Typically, maintaining such variants becomes difficult and expensive over time. To re-engineer such software variants into a software product line (SPL) for systematic reuse, it is important to identify the source code elements that implement a specific feature in order to understand the code of the product variants. Information Retrieval (IR) methods have been widely used for this purpose in a single software system. This paper proposes a new approach to improve the performance of IR methods in a collection of similar software variants. Our proposal produces the following two improvements. First, it increases the accuracy of IR results by exploiting commonality and variability across software variants. Second, it increases the number of relevant retrieved links by reducing the abstraction gap between the feature and source code levels. We have validated our approach with a set of variants of two different systems. The experimental results showed that the proposed approach outperforms the conventional application of IR as well as the most relevant work on the subject.","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121381948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Creating GUI-based DSL formal tools","authors":"Robson Silva, A. Mota, R. R. Starr","doi":"10.1109/IRI.2013.6642514","DOIUrl":"https://doi.org/10.1109/IRI.2013.6642514","url":null,"abstract":"In this paper we propose a rigorous methodology to create GUI-based DSLs formal tools. From a formal specification of a DSL we extract a metamodel and create a user-friendly (GUI) front-end. Then we use a code synthesizer to create a formally verified back-end. At the end we link both parts using a wrapper solution. We aim at providing a productive and trustworthy development methodology to safety critical industries.","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130000280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Author attribution on streaming data","authors":"Sadi Evren Seker, K. Al-Naami, L. Khan","doi":"10.1109/IRI.2013.6642511","DOIUrl":"https://doi.org/10.1109/IRI.2013.6642511","url":null,"abstract":"The concept of novel authors occurring in a streaming data source, such as evolving social media, has been an unaddressed problem up until now. Existing author attribution techniques deal with datasets where the total number of authors does not change during the training or testing of the classifiers. This study focuses on the question, “what happens if new authors are added into the system over time?”. Moreover, this study also deals with the problem that some authors may disappear over time or reappear after a while. In this study stream mining approaches are proposed to solve the problem. The test scenarios are created over the existing IMDB62 data set, which is already widely used by author attribution algorithms. We used our own shuffling algorithms to create the effect of novel authors. Before stream mining, POS tagging approaches and TF-IDF methods are applied for feature extraction. We have also applied a bi-tag approach in which two consecutive tags are considered as a new feature. With the help of novel techniques, proposed for the first time in this paper, the success rate has been increased from 35% to 61% for authorship attribution on streaming text data.","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117130517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatial-temporal motion information integration for action detection and recognition in non-static background","authors":"Dianting Liu, M. Shyu, Guiru Zhao","doi":"10.1109/IRI.2013.6642527","DOIUrl":"https://doi.org/10.1109/IRI.2013.6642527","url":null,"abstract":"Various motion detection methods have been proposed in the past decade, but there are seldom attempts to investigate the advantages and disadvantages of different detection mechanisms so that they can complement each other to achieve a better performance. Toward such a demand, this paper proposes a human action detection and recognition framework to bridge the semantic gap between low-level pixel intensity change and the high-level understanding of the meaning of an action. To achieve a robust estimation of the region of action with the complexities of an uncontrolled background, we propose the combination of the optical flow field and Harris3D corner detector to obtain a new spatial-temporal estimation in the video sequences. The action detection method, considering the integrated motion information, works well with the dynamic background and camera motion, and demonstrates the advantage of the proposed method of integrating multiple spatial-temporal cues. Then the local features (SIFT and STIP) extracted from the estimated region of action are used to learn the Universal Background Model (UBM) for the action recognition task. The experimental results on KTH and UCF YouTube Action (UCF11) data sets show that the proposed action detection and recognition framework can not only better estimate the region of action but also achieve better recognition accuracy comparing with the peer work.","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116361262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Behavioral sequence prediction for evolving data stream","authors":"Sheikh M. Qumruzzaman, L. Khan, B. Thuraisingham","doi":"10.1109/IRI.2013.6642509","DOIUrl":"https://doi.org/10.1109/IRI.2013.6642509","url":null,"abstract":"Behavioral pattern prediction has many applications, ranging from consumer buying behavior analysis and web surfing prediction to network attack prediction. Traditional behavioral prediction techniques work mainly on a fixed dataset. But recent advances in digital technology generate a huge amount of data, which contributes to data streams. Data evolves over time due to concept drift, so stream-based classification also needs to evolve over time. Our goal is not to predict a single action/behavior, but a sequence of actions that can occur later depending on the previous actions. We call this problem “Behavioral Pattern Extrapolation”. In our research, we exploited a stream mining based technique along with a Markovian model, where we used an incremental and ensemble based technique for predicting a set of future actions. We have experimented using a number of benchmark datasets and shown the effectiveness of our approach.","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120958960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimized retrieval algorithms for personalized content aggregation","authors":"Dan He, D. S. Parker","doi":"10.1109/IRI.2013.6642482","DOIUrl":"https://doi.org/10.1109/IRI.2013.6642482","url":null,"abstract":"Personalized content aggregation methods, such as for news aggregation, are an emerging technology. The growth of mobile devices has only increased demand for timely updates on online information. To reduce traffic or bandwidth, efficient retrieval scheduling strategies have been developed to monitor new postings. Most of these methods, however, do not take user access patterns into consideration. For example, the strategy for a user who checks news once a day should be different from the strategy for a user who checks news ten times a day. In this paper, we propose a personalized content aggregation model in which delay time depends not only on the retrieval time and posting time, but also on user access patterns. With total expected delay as the objective, we derive a resource allocation strategy and retrieval scheduling strategy that is optimal when postings are Poisson. To our knowledge, this is the first personalized aggregation model on multiple data sources.","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122710963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting susceptibility to social bots on Twitter","authors":"Randall Wald, T. Khoshgoftaar, Amri Napolitano, Chris Sumner","doi":"10.1109/IRI.2013.6642447","DOIUrl":"https://doi.org/10.1109/IRI.2013.6642447","url":null,"abstract":"The popularity of the Twitter social networking site has made it a target for social bots, which use increasingly-complex algorithms to engage users and pretend to be humans. While much research has studied how to identify such bots in the process of spam detection, little research has looked at the other side of the question - detecting users likely to be fooled by bots. In this paper, we examine a dataset consisting of 610 users who were messaged by Twitter bots, and determine which features describing these users were most helpful in predicting whether or not they would interact with the bots (through replies or following the bot). We then use six classifiers to build models for predicting whether a given user will interact with the bot, both using the selected features and using all features. We find that a user's Klout score, friends count, and followers count are most predictive of whether a user will interact with a bot, and that the Random Forest algorithm produces the best classifier, when used in conjunction with one of the better feature ranking algorithms (although poor feature ranking can actually make performance worse than no feature ranking). Overall, these results show promise for helping understand which users are most vulnerable to social bots.","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"471 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127813069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A moving target defense approach for protecting resource-constrained distributed devices","authors":"V. Casola, Alessandra De Benedictis, Massimiliano Albanese","doi":"10.1109/IRI.2013.6642449","DOIUrl":"https://doi.org/10.1109/IRI.2013.6642449","url":null,"abstract":"Techniques aimed at continuously changing a system's attack surface, usually referred to as Moving Target Defense (MTD), are emerging as powerful tools for thwarting cyber attacks. Such mechanisms increase the uncertainty, complexity, and cost for attackers, limit the exposure of vulnerabilities, and ultimately increase overall resiliency. In this paper, we propose an MTD approach for protecting resource-constrained distributed devices through fine-grained reconfiguration at different architectural layers. In order to show the feasibility of our approach in real-world scenarios, we study its application to Wireless Sensor Networks (WSNs), introducing two different reconfiguration mechanisms. Finally, we show how the proposed mechanisms are effective in reducing the probability of successful attacks.","PeriodicalId":418492,"journal":{"name":"2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130341381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}