{"title":"GO Visual Browser","authors":"Sara Tily, M. C. Murphy, Omkar Dangat, Christopher D. Smith","doi":"10.1109/ICSC.2011.55","DOIUrl":"https://doi.org/10.1109/ICSC.2011.55","url":null,"abstract":"Biologists commonly annotate genome and protein properties using ontology terms. While biologists have a deep understanding of the underlying biology, they rarely have a good understanding of the semantic networks used to represent domain ontologies. This paper presents the design and implementation of the GO Visual Browser: a new tool to provide visualization of Gene Ontology terms and relationships, as well as retrieval of associated genomic data. Browsing involves selecting a term and type of ontology relationship, and then displaying an interactive map with all of the related terms. This map can then be used to search a genome database for all features, related sequences, and metadata annotated with the selected ontology terms. Our code is fully functional and is being distributed as open source software.","PeriodicalId":408382,"journal":{"name":"2011 IEEE Fifth International Conference on Semantic Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130829399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OMEX: Software for Mining Mathematical Expression Semantics from Scientific Documents","authors":"Y. Stathopoulos, Brian Harrington","doi":"10.1109/ICSC.2011.65","DOIUrl":"https://doi.org/10.1109/ICSC.2011.65","url":null,"abstract":"Semantic analysis of scientific documents can benefit from the information carried by mathematical expressions. However, making established data-mining techniques formula-aware is pre-conditioned on the ability to process expressions in documents. In this work, we present OMEX, a software framework capable of extracting mathematical expressions from scientific documents produced using the LaTeX typesetting environment.","PeriodicalId":408382,"journal":{"name":"2011 IEEE Fifth International Conference on Semantic Computing","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129277375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predictive Semantic Social Media Analysis","authors":"D. Ostrowski","doi":"10.1109/ICSC.2011.16","DOIUrl":"https://doi.org/10.1109/ICSC.2011.16","url":null,"abstract":"Social networks today represent a substantial amount of shared knowledge and information. To leverage the interdependence of this data, we consider two forms of relational learning to facilitate semantic understanding. First, relational modeling is applied to local networks to reinforce knowledge in each entity. Then, a social dimension approach is applied to generate new (high level) features. These feature sets are then trained towards the identification of learned purchase behaviors (belief system / values) thus supporting a means of prediction. We consider this generation of higher level classifications (termed as social dimensions) to enable increased accuracy in behavior prediction in order to support more focused customer relationships.","PeriodicalId":408382,"journal":{"name":"2011 IEEE Fifth International Conference on Semantic Computing","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114111884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Who's Playing Well with Others: Determining Collegiality in Text","authors":"D. Bracewell, Marc T. Tomlinson, Ying Shi, Jeremy Bensley, Mary Draper","doi":"10.1109/ICSC.2011.48","DOIUrl":"https://doi.org/10.1109/ICSC.2011.48","url":null,"abstract":"In this paper, we present a framework for determining the interpersonal relations exhibited between two individuals. Specifically, we focus on recognizing the presence or absence of collegiality in discussion threads and dialogues. Collegiality results from the existence of harmonious relationships irrespective of the group's power structure. We have identified four psychologically motivated language uses that indicate collegiality. These language uses are identified in text with the use of a set of attributes that are assigned to each language use and can be extracted using grammars and lexicons. Through the attributes, language uses, and dialogue features, a model can be learned that can determine whether two people are collegial, uncollegial, or whether there is not enough information. Using multi-class logistic regression, we obtain an overall micro-averaged F-measure of 83.3%.","PeriodicalId":408382,"journal":{"name":"2011 IEEE Fifth International Conference on Semantic Computing","volume":"108 8‐10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120839028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"English Access to Structured Data","authors":"Kyle Richardson, D. Bobrow, C. Condoravdi, R. Waldinger, Amar K. Das","doi":"10.1109/ICSC.2011.67","DOIUrl":"https://doi.org/10.1109/ICSC.2011.67","url":null,"abstract":"We present work on using a domain model to guide text interpretation, in the context of a project that aims to interpret English questions as a sequence of queries to be answered from structured databases. We adapt a broad-coverage and ambiguity-enabled natural language processing (NLP) system to produce domain-specific logical forms, using knowledge of the domain to zero in on the appropriate interpretation. The vocabulary of the logical forms is drawn from a domain theory that constitutes a higher-level abstraction of the contents of a set of related databases. The meanings of the terms are encoded in an axiomatic domain theory. To retrieve information from the databases, the logical forms must be instantiated by values constructed from fields in the database. The axiomatic domain theory is interpreted by the first-order theorem prover SNARK to identify the groundings, and then retrieve the values through procedural attachments semantically linked to the database. SNARK attempts to prove the logical form as a theorem by reasoning over the theory that is linked to the database and returns the exemplars of the proof(s) back to the user as answers to the query. The focus of this paper is more on the language task; however, we discuss the interaction that must occur between linguistic analysis and reasoning for an end-to-end natural language interface to databases. We illustrate the process using examples drawn from an HIV treatment domain, where the underlying databases are records of temporally bound treatments of individual patients.","PeriodicalId":408382,"journal":{"name":"2011 IEEE Fifth International Conference on Semantic Computing","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125613939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributing Computationally Expensive Matching of Requirements to Capability Models","authors":"Reymonrod G. Vasquez, Kunal Verma, A. Kass","doi":"10.1109/ICSC.2011.54","DOIUrl":"https://doi.org/10.1109/ICSC.2011.54","url":null,"abstract":"In this paper, we present a distributed way to automatically map users' requirements to reference process models. In a prior paper [9], we presented a tool called Process Model Requirements Gap Analyzer (ProcGap), which combines natural language processing, information retrieval, and semantic reasoning to automatically match and map textual requirements to domain-specific process models. Although the tool proved beneficial to users in reusing prior knowledge, by making it easy to use process models, the tool has one main drawback. It takes a long time to compare a very large requirements document, one that has a few thousand requirements, to a process model hierarchy with a few thousand capabilities. In this paper, we present how we solved this problem using Apache Hadoop. Apache Hadoop allows ProcGap to distribute the matching task across several machines, increasing the tool's performance and usability. We present a performance comparison of running ProcGap on a single machine and on our distributed version.","PeriodicalId":408382,"journal":{"name":"2011 IEEE Fifth International Conference on Semantic Computing","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116233205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modular Semantic Tagging of Medline Abstracts and Its Use in Inferring Regulatory Networks","authors":"M. Verhagen, J. Pustejovsky, Ronald C. Taylor, A. Sanfilippo","doi":"10.1109/ICSC.2011.78","DOIUrl":"https://doi.org/10.1109/ICSC.2011.78","url":null,"abstract":"We describe Medstract Plus, a resource for mining relations from the Medline bibliographic database that is currently under construction. It was built on the remains of Medstract, a previously created resource that included a bio-relation server and an acronym database. Medstract Plus uses simple and scalable natural language processing modules to structure text, is designed with reusability and extendibility in mind, and adheres to the philosophy of the Linguistic Annotation Framework. We show how Medstract Plus has been used to provide seeds for a novel approach to inferring transcriptional regulatory networks from gene expression data.","PeriodicalId":408382,"journal":{"name":"2011 IEEE Fifth International Conference on Semantic Computing","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127342132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-Driven vs. Semantic-Technology-Driven Tag-Based Video Location Estimation","authors":"Jaeyoung Choi, G. Friedland","doi":"10.1109/ICSC.2011.37","DOIUrl":"https://doi.org/10.1109/ICSC.2011.37","url":null,"abstract":"The following article describes two approaches to determining the geo-coordinates of the recording place of Flickr videos based on textual metadata. The systems are tested on the MediaEval 2010 Placing Task evaluation data, which consists of 5091 unfiltered test videos and metadata records. The first system is a data-driven approach that uses a heuristic based on the spatial variance of tags. The second one extends this heuristic by using semantic technologies, such as extended WordNet and a geographical gazetteer. The performance peaks at being able to classify 14% of the videos to within an accuracy of 10m. The article presents the two algorithms, evaluates their accuracy and discusses the advantages and disadvantages of using Semantic technologies for this task.","PeriodicalId":408382,"journal":{"name":"2011 IEEE Fifth International Conference on Semantic Computing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125132946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Utility of WordNet for Ontology Alignment: Is it Really Worth it?","authors":"Uthayasanker Thayasivam, Prashant Doshi","doi":"10.1109/ICSC.2011.28","DOIUrl":"https://doi.org/10.1109/ICSC.2011.28","url":null,"abstract":"Many ontology alignment algorithms augment syntactic matching with the use of WordNet (WN) in order to improve their performance. The advantage of using WN in alignment seems apparent. However, we strike a more cautionary note. We analyze the utility of WN in the context of the reduction in precision and increase in execution time that its use entails. For this analysis, we particularly focus on real-world ontologies. We report distinct trends in the performance of WN-based alignment in comparison with alignment that uses syntactic matching only. We analyze the trends and their implications, and provide useful insights on the types of ontology pair for which WN-based alignment may potentially be worthwhile and those types where it may not be.","PeriodicalId":408382,"journal":{"name":"2011 IEEE Fifth International Conference on Semantic Computing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132349449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relating the Semantics of Dialogue Acts to Linguistic Properties: A Machine Learning Perspective through Lexical Cues","authors":"A. Fang, H. Bunt, Jing Cao, Xiaoyue Liu","doi":"10.1109/ICSC.2011.32","DOIUrl":"https://doi.org/10.1109/ICSC.2011.32","url":null,"abstract":"This paper describes a corpus-based investigation of dialogue acts. In particular, it attempts to answer questions about the empirical distribution of dialogue acts and to what extent dialogue acts can be automatically predicted from their lexical features. The Switchboard Dialogue Act Corpus is adopted and the SWBD-DAMSL tags used for automatic prediction. We show that 60-70% of the dialogue acts can be predicted from lexical features alone depending on different levels of granularity. We also present a mapping from SWBD-DAMSL tags to the tags of the new ISO standard for dialogue act annotation, as part of an ongoing investigation into the relationship between the structure and granularity of the tag set and classification accuracy. The paper concludes with discussions and suggestions for future work.","PeriodicalId":408382,"journal":{"name":"2011 IEEE Fifth International Conference on Semantic Computing","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133254958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}