Jonathan Woodbridge, B. Mortazavi, M. Sarrafzadeh, A. Bui
{"title":"Aggregated Indexing of Biomedical Time Series Data","authors":"Jonathan Woodbridge, B. Mortazavi, M. Sarrafzadeh, A. Bui","doi":"10.1109/HISB.2012.13","DOIUrl":"https://doi.org/10.1109/HISB.2012.13","url":null,"abstract":"Remote and wearable medical sensing has the potential to create very large and high dimensional datasets. Medical time series databases must be able to efficiently store, index, and mine these datasets to enable medical professionals to effectively analyze data collected from their patients. Conventional high dimensional indexing methods are a two-stage process. First, a superset of the true matches is efficiently extracted from the database. Second, supersets are pruned by comparing each of their objects to the query object and rejecting any objects falling outside a predetermined radius. This pruning stage heavily dominates the computational complexity of most conventional search algorithms. Therefore, indexing algorithms can be significantly improved by reducing the amount of pruning. This paper presents an online algorithm to aggregate biomedical time series data to significantly reduce the search space (index size) without compromising the quality of search results. This algorithm is built on the observation that biomedical time series signals are composed of cyclical and often similar patterns. This algorithm takes in a stream of segments and groups them into highly concentrated collections. Locality Sensitive Hashing (LSH) is used to reduce the overall complexity of the algorithm, allowing it to run online. The output of this aggregation is used to populate an index. The proposed algorithm yields logarithmic growth of the index (with respect to the total number of objects) while keeping sensitivity and specificity simultaneously above 98%. Both memory and runtime complexities of time series search are improved when using aggregated indexes. In addition, data mining tasks, such as clustering, exhibit runtimes that are orders of magnitude faster when run on aggregated indexes.","PeriodicalId":375089,"journal":{"name":"2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115138041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ontological Approach for the Management of Informed Consent Permissions","authors":"M. Grando, A. Boxwala, R. Schwab, N. Alipanah","doi":"10.1109/HISB.2012.19","DOIUrl":"https://doi.org/10.1109/HISB.2012.19","url":null,"abstract":"We have developed an ontology-based model of subject's permissions and organization's obligations resulting from the informed consent process. For the initial evaluation of the ontology we modeled the research plan of an informed consent document currently used by the UCSD Moores Cancer Center (MCC) for collecting and banking biospecimens for use in cancer research. We have also populated the ontology with de-identified clinical data and sample data from patients who consented to participate in the study. Furthermore, we provided reasoning mechanisms to support requests from real use cases involving researchers approaching MCC requesting access to use collected clinical data and biospecimens. We supported those requests by identifying resources available for reuse, while checking conformance with preexisting subject's permissions. Based on the lessons learned from this study we propose a scalable framework for specifying subject's permission and checking researcher's resource requests in compliance with given permissions. The proposed framework is an extension of an existing general-purpose policy engine based on XACML (eXtensible Access Control Markup Language), incorporating ontology-based reasoning. Given the lack of standards for sharing, integrating and checking compliance with subject's consents our research could have an important future practical impact.","PeriodicalId":375089,"journal":{"name":"2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133075773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aaron Mandel, Michael Kamerick, Douglas Berman, Lisa Dahm
{"title":"University of California Research eXchange (UCReX): A Federated Cohort Discovery System","authors":"Aaron Mandel, Michael Kamerick, Douglas Berman, Lisa Dahm","doi":"10.1109/HISB.2012.71","DOIUrl":"https://doi.org/10.1109/HISB.2012.71","url":null,"abstract":"The University of California has committed to the development of a system to encourage collaboration among its 5 Medical Center campuses. The name of this system is UCReX, for the UC Research eXchange. The goals of UCReX are to: (1) Enhance access to clinical data for research, (2) Build a technical infrastructure to allow cross-institutional sharing of harmonized clinical data, (3) Inform data collection processes.","PeriodicalId":375089,"journal":{"name":"2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130368477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Sohn, Sean P. Murphy, Siddhartha R. Jonnalagadda, K. Wagholikar, Stephen T Wu, C. Chute, Hongfang Liu, Scott R. Halgrim
{"title":"Systematic Analysis of Cross-Institutional Medication Description Patterns in Clinical Notes","authors":"S. Sohn, Sean P. Murphy, Siddhartha R. Jonnalagadda, K. Wagholikar, Stephen T Wu, C. Chute, Hongfang Liu, Scott R. Halgrim","doi":"10.1109/HISB.2012.43","DOIUrl":"https://doi.org/10.1109/HISB.2012.43","url":null,"abstract":"In clinical notes, medication information follows certain semantic patterns and some medication descriptions contain additional word(s) between medication attributes. Therefore, it is essential to understand the semantic patterns as well as the patterns of the context interspersed among them for natural language processing tools to effectively extract comprehensive medication information. We examined both semantic and context patterns and compared those found in Mayo Clinic and i2b2 challenge data. We found that some variations exist between the institutions but the dominant patterns are common.","PeriodicalId":375089,"journal":{"name":"2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123849991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Frequency of ConText Lexical Items in Diverse Medical Texts","authors":"B. Chapman, Wei Wei, W. Chapman","doi":"10.1109/HISB.2012.60","DOIUrl":"https://doi.org/10.1109/HISB.2012.60","url":null,"abstract":"We assess the relative frequency with which lexical items defined in the pyConTextNLP package occur within radiology, history and physical, and emergency department texts. While we found significant disparity in term frequency between the text types, nearly half of the lexical items were not found in any of the texts, indicating that significant pruning of the lexical knowledge base could be attempted. However, the study is limited by the small number of texts studied.","PeriodicalId":375089,"journal":{"name":"2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126464500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A General Purpose Phenotype Algorithm for Venous Thromboembolism Using Billing Codes and Natural Language Processing","authors":"E. M. Hinz, L. Bastarache, J. Denny","doi":"10.1109/HISB.2012.74","DOIUrl":"https://doi.org/10.1109/HISB.2012.74","url":null,"abstract":"Deep venous thrombosis and pulmonary embolism are diseases associated with significant morbidity and mortality. Well-described risk factors for venous thromboembolic disease (VTE) include immobility, trauma and genetic hypercoagulability states; still, many cases have no known associated antecedent risks. Studies to potentially define the missing risk factors preferably identify all cases of VTE. Defining VTE in the electronic health record is more challenging due to the variable duration of VTE treatment, crossover of therapeutic modalities to other chronic diseases and prevention treatment related to hospitalizations. We designed a general purpose Natural Language Processing (NLP) algorithm to capture acute and historical cases of thromboembolic disease retrospectively in a de-identified electronic health record. Applying the NLP algorithm to a separate evaluation set found a positive predictive value of 84.7% and sensitivity of 95.3% for an F-measure of 0.897, which was similar to the training set of 0.925. Use of the same algorithm on problem lists in patients without VTE ICD-9s resulted in a PPV of 83%. NLP of VTE ICD-9 positive cases and non-ICD-9 positive problem lists provides an effective means for capture of both acute and historical cases of venous thromboembolic disease.","PeriodicalId":375089,"journal":{"name":"2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127790908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cloud Computing Considerations for Biomedical Applications","authors":"R. Rauscher","doi":"10.1109/HISB.2012.67","DOIUrl":"https://doi.org/10.1109/HISB.2012.67","url":null,"abstract":"This poster considers the practical barriers to public cloud use for biomedical applications and the advantages of private cloud use for such applications. In addition, it discusses operating environment statistics that are relevant to correctly allocating resources in a private cloud.","PeriodicalId":375089,"journal":{"name":"2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128452290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ontology-Guided Approach to Retrieving Disease Manifestation Images for Health Image Base Construction","authors":"Yang Chen, Xiaofeng Ren, Guo-Qiang Zhang, Rong Xu","doi":"10.1109/HISB.2012.32","DOIUrl":"https://doi.org/10.1109/HISB.2012.32","url":null,"abstract":"Building a comprehensive medical image database, in the spirit of the UMLS, can be beneficial for assisting diagnosis, patient education and self-care. However, a highly curated, comprehensive image database is difficult to collect as well as to annotate. We present an approach to combine visual object detection technologies with medical ontology to automatically mine web photos and retrieve a large number of disease manifestation images with minimal manual labeling. Compared to a supervised approach, our ontology-guided approach reduces manual labeling effort to 1/10 on a variety of eye/ear/mouth diseases and improves the precision of retrieval by over 10% in many cases.","PeriodicalId":375089,"journal":{"name":"2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134174500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cohort Selection through Content-Based Image Retrieval: vfM A Case Study","authors":"Mayank Agarwal, Javed Mostafa","doi":"10.1109/HISB.2012.42","DOIUrl":"https://doi.org/10.1109/HISB.2012.42","url":null,"abstract":"In this paper, we propose ViewFinder Medicine (vfM) for automatically identifying cohort classes for MRI scans. It involves predicting a cohort class for the heretofore unseen patient (and related images) and offering linkages to historical diagnosis data associated with the members of the predicted cohort class. The basic idea is to offer a relatively accurate cohort class for a new patient so that the cohort can be used as a baseline to understand the current patient's status and develop a treatment plan.","PeriodicalId":375089,"journal":{"name":"2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122410657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Salvatore Loguercio, Erik L. Clarke, Benjamin M. Good, A. Su
{"title":"A Task-Based Approach for Large-Scale Evaluation of the Gene Ontology","authors":"Salvatore Loguercio, Erik L. Clarke, Benjamin M. Good, A. Su","doi":"10.1109/HISB.2012.69","DOIUrl":"https://doi.org/10.1109/HISB.2012.69","url":null,"abstract":"The Gene Ontology (GO) provides a framework to systematically classify and annotate gene function. The annotations associated with GO play a critical role in modern biology and cover many organisms. For the human genome, over 10,000 GO terms are used to annotate gene function in an expansive database of over 200,000 annotations. Due to the importance of the GO annotations in modern biology, significant effort has been put into assessing the quality of the annotations. Providing measures of annotation completeness, accuracy, and precision is critical if researchers are to use the annotations in real-world applications with confidence. Here, we describe a task-based approach that examines the completeness and utility of GO annotations through the lens of gene enrichment analysis. Our approach can be used to model the progression of the GO annotations over time, either for a particular area of interest or for the body of annotations as a whole. Using this framework, we conducted a large-scale analysis of gene expression datasets from the NCBI Gene Expression Omnibus (GEO). In particular, we identified terms of interest for each dataset through semantic annotation of biomedical data, then tracked the behavior of these terms as a function of time. The preliminary results provide significant information about the progress and character of GO annotations over time. This framework is flexible enough to examine all or part of the GO annotations, across multiple species, and with various enrichment methods. We also discuss how this framework can be used to evaluate different annotation methods. For example, by comparing the performance of annotations generated with a particular method to the performance of canonical annotations, it is possible to determine their relative quality.","PeriodicalId":375089,"journal":{"name":"2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127808756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}