{"title":"Exploiting Twitter for Border Security-Related Intelligence Gathering","authors":"J. Piskorski, Hristo Tanev, A. Balahur","doi":"10.1109/EISIC.2013.63","DOIUrl":"https://doi.org/10.1109/EISIC.2013.63","url":null,"abstract":"Nowadays, an ever-growing amount of information is being transferred through web-based social media. In particular, Twitter has emerged as an important social medium providing the most up-to-date information and comments on current events and topics of any kind. This has led to continuously growing interest among various security-related organizations in tools for real-time monitoring of Twitter streams to collect information from them. In this paper we present some initial explorations on how to exploit Twitter for border security-related intelligence gathering. To be more precise, we present techniques for: (a) retrieving and analyzing tweets posted in third countries, in which opinions and information are provided on migration to Europe or related issues (here we experimented with sentiment analysis for improving the retrieval performance), and (b) enhancing the information extracted from online news on border security-related events in third countries with information extracted from Twitter.","PeriodicalId":229195,"journal":{"name":"2013 European Intelligence and Security Informatics Conference","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134408649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Increasing NER Recall with Minimal Precision Loss","authors":"J. Kuperus, C. Veenman, M. V. Keulen","doi":"10.1109/EISIC.2013.23","DOIUrl":"https://doi.org/10.1109/EISIC.2013.23","url":null,"abstract":"Named Entity Recognition (NER) is broadly used as a first step toward the interpretation of text documents. However, for many applications, such as forensic investigation, recall is currently inadequate, leading to loss of potentially important information. Entity class ambiguity cannot be resolved reliably due to the lack of context information or the exploitation thereof. Consequently, entity classification introduces too many errors, leading to severe omissions in answers to forensic queries. We propose a technique based on multiple candidate labels, effectively postponing decisions for entity classification to query time. Entity resolution exploits user feedback: a user is only asked for feedback on entities relevant to his/her query. Moreover, giving feedback can be stopped anytime when query results are considered good enough. We propose several interaction strategies that obtain increased recall with little loss in precision.","PeriodicalId":229195,"journal":{"name":"2013 European Intelligence and Security Informatics Conference","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132479008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Counter-Terrorism","authors":"Leslie Ball, M. Craven","doi":"10.1109/EISIC.2013.48","DOIUrl":"https://doi.org/10.1109/EISIC.2013.48","url":null,"abstract":"We present a holistic systems view of automated intelligence analysis for counter-terrorism with focus on the behavioural attributes of terrorist groups.","PeriodicalId":229195,"journal":{"name":"2013 European Intelligence and Security Informatics Conference","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132859068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross Domain Assessment of Document to HTML Conversion Tools to Quantify Text and Structural Loss during Document Analysis","authors":"Kyle Goslin, M. Hofmann","doi":"10.1109/EISIC.2013.22","DOIUrl":"https://doi.org/10.1109/EISIC.2013.22","url":null,"abstract":"During forensic text analysis, the automation of the process is key when working with large quantities of documents. As documents often come in a wide variety of different file types, this creates the need for tailored tools to be developed to analyze each document type to correctly identify and extract text elements for analysis without loss. These text extraction tools often omit sections of text that are unreadable from documents, leaving drastic inconsistencies during the forensic text analysis process. As a solution, a single output format, HTML, was chosen as a unified analysis format. Document to HTML/CSS extraction tools, each with varying techniques to convert common document formats to rich HTML/CSS counterparts, were tested. This approach can reduce the number of analysis tools needed during forensic text analysis by utilizing a single document format. Two tests were designed, a 10-point document overview test and a 48-point detailed document analysis test, to assess and quantify the level of loss, rate of error and overall quality of outputted HTML structures. This study concluded that tools that utilize a number of different approaches and have an understanding of the document structure yield the best results with the least amount of loss.","PeriodicalId":229195,"journal":{"name":"2013 European Intelligence and Security Informatics Conference","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115416855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Keystroke Biometric Studies on Password and Numeric Keypad Input","authors":"Ned Bakelman, John V. Monaco, Sung-Hyuk Cha, C. Tappert","doi":"10.1109/EISIC.2013.45","DOIUrl":"https://doi.org/10.1109/EISIC.2013.45","url":null,"abstract":"The keystroke biometric classification system described in this study was evaluated on two types of short input - passwords and numeric keypad input. On the password input, the system outperforms 14 other systems evaluated in a previous study using the same raw input data. The three top performing systems in that study had equal error rates between 9.6% and 10.2%. With the classification system developed in this study, equal error rates of 8.7% were achieved on both the features from the previous study and on a new set of features. On the numeric keypad input, the system achieved an equal error rate of 10.5% on the features from the previous study and 6.1% on a new set of features.","PeriodicalId":229195,"journal":{"name":"2013 European Intelligence and Security Informatics Conference","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125060066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-automatic Ontology Maintenance in the Virtuoso News Monitoring System","authors":"F. Amardeilh, Wessel Kraaij, Martijn Spitters, C. Versloot, Sinan Yurtsever","doi":"10.1109/EISIC.2013.29","DOIUrl":"https://doi.org/10.1109/EISIC.2013.29","url":null,"abstract":"Domain ontologies are a central component in the Virtuoso demonstrator, a system that captures, analyzes and aggregates open news sources in order to achieve an information position that supports complex decision processes in the context of border control. However, maintenance of such an ontology is a challenging task. We demonstrate a text processing pipeline that supports domain experts in maintaining the domain ontology. The system facilitates the maintenance by generating candidate concepts that should be added to the ontology. Some initial tests have been carried out with filtering candidate concepts from a domain specific news feed.","PeriodicalId":229195,"journal":{"name":"2013 European Intelligence and Security Informatics Conference","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128478613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital-Forensics Based Pattern Recognition for Discovering Identities in Electronic Evidence","authors":"Hans Henseler, J. Hofste, M. V. Keulen","doi":"10.1109/EISIC.2013.24","DOIUrl":"https://doi.org/10.1109/EISIC.2013.24","url":null,"abstract":"With the pervasiveness of computers and mobile devices, digital forensics becomes more important in law enforcement. Detectives increasingly depend on the scarce support of digital specialists, which impedes the efficiency of criminal investigations. This paper proposes an algorithm to extract, merge and rank identities that are encountered in the electronic evidence during processing. Two experiments are described demonstrating that our approach can assist with the identification of frequently occurring identities so that investigators can prioritize the investigation of evidence units accordingly.","PeriodicalId":229195,"journal":{"name":"2013 European Intelligence and Security Informatics Conference","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126415785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"National Security and Social Media Monitoring: A Presentation of the EMOTIVE and Related Systems","authors":"M. Sykora, Thomas W. Jackson, A. O'Brien, Suzanne Elayan","doi":"10.1109/EISIC.2013.38","DOIUrl":"https://doi.org/10.1109/EISIC.2013.38","url":null,"abstract":"Today social media streams, such as Twitter, represent vast amounts of 'real-time' daily streaming data. Topics on these streams cover every range of human communication, ranging from banal banter, to serious reactions to events and information sharing regarding any imaginable product, item or entity. It has now become the norm for news of publicly visible events to break over social media streams first, with mainstream media picking up on it only afterwards. It has been suggested in the literature that social media are a valid, valuable and effective real-time tool for gauging public subjective reactions to events and entities. Due to the vast amount of data that is generated on a daily basis on social media streams, monitoring and gauging public reactions has to be automated and most of all scalable - i.e. human, expert monitoring is generally unfeasible. In this paper the EMOTIVE system, a project funded jointly by the DSTL (Defence Science and Technology Laboratory) and EPSRC, which focuses on monitoring fine-grained emotional responses relating to events of national security importance, will be presented. Similar systems for monitoring national security events are also presented and the primary traits of such national security social media monitoring systems are introduced and discussed.","PeriodicalId":229195,"journal":{"name":"2013 European Intelligence and Security Informatics Conference","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121754394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probability Analysis of Cyber Attack Paths against Business and Commercial Enterprise Systems","authors":"Dmitry Dudorov, D. Stupples, M. Newby","doi":"10.1109/EISIC.2013.13","DOIUrl":"https://doi.org/10.1109/EISIC.2013.13","url":null,"abstract":"The level of risk of attack from new cyber-crime related malware is difficult to quantify as standard risk analysis models often take an incomplete view of the overall system. In order to understand the full malware risk faced by organisations any model developed to support the analysis must be able to address a statistical combination of all feasible attack scenarios. Moreover, since all parametric aspects of a sophisticated cyber attack cannot be quantified, a degree of expert judgement needs to be applied. We develop a modeling approach that will facilitate risk assessment of common cyber attack scenarios together with likely probabilities of successful attack for each scenario. The paper demonstrates through use cases how a combined attack can be assessed.","PeriodicalId":229195,"journal":{"name":"2013 European Intelligence and Security Informatics Conference","volume":"330 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130827162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Countering Plagiarism by Exposing Irregularities in Authors' Grammar","authors":"Michael Tschuggnall, Günther Specht","doi":"10.1109/EISIC.2013.10","DOIUrl":"https://doi.org/10.1109/EISIC.2013.10","url":null,"abstract":"Unauthorized copying or stealing of the intellectual property of others is a serious problem in modern society. In the case of textual plagiarism, it is becoming ever easier to find appropriate sources using the huge amount of data available through online databases. To counter this problem, the two main approaches are categorized as external and intrinsic plagiarism detection, respectively. While external algorithms have the possibility to compare a suspicious document with numerous sources, intrinsic algorithms are allowed to solely inspect the suspicious document in order to predict plagiarism, which is important especially if no sources are available. In this paper we present a novel approach in the field of intrinsic plagiarism detection by analyzing syntactic information of authors and finding irregularities in sentence constructions. The main idea follows the assumption that authors have their own, mostly unconsciously used, set of rules for building sentences, which can be utilized to distinguish authors. Therefore the algorithm splits a suspicious document into single sentences, tags each word with part-of-speech (POS) classifiers and creates POS-sequences representing each sentence. Subsequently, the distance between every distinct pair of sentences is calculated by applying modified sequence alignment algorithms and stored in a distance matrix. After utilizing a Gaussian normal distribution function over the mean distances for each sentence, suspicious sentences are selected, grouped and predicted to be plagiarized. Finally, thresholds and parameters the algorithm uses are optimized by applying genetic algorithms. The approach has been evaluated against a large test corpus of English documents, showing promising results.","PeriodicalId":229195,"journal":{"name":"2013 European Intelligence and Security Informatics Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129042137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}