{"title":"Decomposing phenotype descriptions for the human skeletal phenome.","authors":"Tudor Groza, Jane Hunter, Andreas Zankl","doi":"10.4137/BII.S10729","DOIUrl":"https://doi.org/10.4137/BII.S10729","url":null,"abstract":"<p><p>Over the course of the last few years there has been a significant amount of research performed on ontology-based formalization of phenotype descriptions. The intrinsic value and knowledge captured within such descriptions can only be expressed by taking advantage of their inner structure, which implicitly combines qualities and anatomical entities. We present a meta-model (the Phenotype Fragment Ontology) and a processing pipeline that together enable the automatic decomposition and conceptualization of phenotype descriptions for the human skeletal phenome. We use this approach to showcase the usefulness of the generic concept of phenotype decomposition by performing an experimental study on all skeletal phenotype concepts defined in the Human Phenotype Ontology.</p>","PeriodicalId":88397,"journal":{"name":"Biomedical informatics insights","volume":"6 ","pages":"1-14"},"PeriodicalIF":0.0,"publicationDate":"2013-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4137/BII.S10729","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"31264693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sentiment Analysis of Suicide Notes: A Shared Task.","authors":"John P Pestian, Pawel Matykiewicz, Michelle Linn-Gust, Brett South, Ozlem Uzuner, Jan Wiebe, K Bretonnel Cohen, John Hurdle, Christopher Brew","doi":"10.4137/bii.s9042","DOIUrl":"https://doi.org/10.4137/bii.s9042","url":null,"abstract":"<p><p>This paper reports on a shared task involving the assignment of emotions to suicide notes. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in a corpus of fully anonymized clinical text and annotated suicide notes. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.</p>","PeriodicalId":88397,"journal":{"name":"Biomedical informatics insights","volume":"5 Suppl 1","pages":"3-16"},"PeriodicalIF":0.0,"publicationDate":"2012-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4137/bii.s9042","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40167935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emotion Detection in Suicide Notes using Maximum Entropy Classification.","authors":"Richard Wicentowski, Matthew R Sydes","doi":"10.4137/BII.S8972","DOIUrl":"https://doi.org/10.4137/BII.S8972","url":null,"abstract":"<p><p>An ensemble of supervised maximum entropy classifiers can accurately detect and identify sentiments expressed in suicide notes. Using lexical and syntactic features extracted from a training set of externally annotated suicide notes, we trained separate classifiers for each of fifteen pre-specified emotions. This formed part of the 2011 i2b2 NLP Shared Task, Track 2. The precision and recall of these classifiers were strongly related to the number of occurrences of each emotion in the training data. Evaluating on previously unseen test data, our best system achieved an F1 score of 0.534.</p>","PeriodicalId":88397,"journal":{"name":"Biomedical informatics insights","volume":"5 Suppl. 1","pages":"51-60"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4137/BII.S8972","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30824234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Early fusion of low level features for emotion mining.","authors":"Fabon Dzogang, Marie-Jeanne Lesot, Maria Rifqi, Bernadette Bouchon-Meunier","doi":"10.4137/BII.S8973","DOIUrl":"https://doi.org/10.4137/BII.S8973","url":null,"abstract":"<p><p>We study the discrimination of emotions annotated in free texts at the sentence level: a sentence can either be associated with no emotion (neutral) or with multiple emotion labels. The proposed system relies on three characteristics. We implement an early fusion of n-grams of increasing order, transposing an approach successfully employed in the related task of opinion mining. We apply a filtering process that consists of extracting frequent n-grams and using Shannon's entropy measure to, respectively, keep dictionaries at balanced sizes and retain emotion-specific features. Finally, the overall system is implemented as a 2-step decision process: a first classifier discriminates between neutral and emotion-bearing sentences, then one classifier per emotion is applied to emotion-bearing sentences. The final decision is given by the classifier holding the maximum confidence. Results obtained on the testing set are promising.</p>","PeriodicalId":88397,"journal":{"name":"Biomedical informatics insights","volume":"5 Suppl. 1","pages":"129-36"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4137/BII.S8973","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30824821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging psycholinguistic resources and emotional sequence models for suicide note emotion annotation.","authors":"Eric Yeh, William Jarrold, Joshua Jordan","doi":"10.4137/BII.S8979","DOIUrl":"https://doi.org/10.4137/BII.S8979","url":null,"abstract":"<p><p>We describe the submission entered by SRI International and UC Davis for the I2B2 NLP Challenge Track 2. Our system is based on a machine learning approach and employs a combination of lexical, syntactic, and psycholinguistic features. In addition, we model the sequence and locations of occurrence of emotions found in the notes. We discuss the effect of these features on the emotion annotation task, as well as the nature of the notes themselves. We also explore the use of bootstrapping to help account for what appeared to be annotator fatigue in the data. We conclude with a discussion of future avenues for improving the approach for this task, and also discuss how annotations at the word span level may be more appropriate for this task than annotations at the sentence level.</p>","PeriodicalId":88397,"journal":{"name":"Biomedical informatics insights","volume":"5 Suppl. 1","pages":"155-63"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4137/BII.S8979","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30824824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hybrid system for emotion extraction from suicide notes.","authors":"Azadeh Nikfarjam, Ehsan Emadzadeh, Graciela Gonzalez","doi":"10.4137/BII.S8981","DOIUrl":"https://doi.org/10.4137/BII.S8981","url":null,"abstract":"<p><p>The reasons that drive someone to commit suicide are complex and their study has attracted the attention of scientists in different domains. Analyzing this phenomenon could significantly improve preventive efforts. In this paper we present a method for sentiment analysis of suicide notes submitted to the i2b2/VA/Cincinnati Shared Task 2011. In this task the sentences of 900 suicide notes were labeled with the possible emotions that they reflect. To label each sentence with emotions, we propose a hybrid approach which utilizes both rule-based and machine learning techniques. To solve the multi-class problem, a rule-based engine and an SVM model are used for each category. A set of syntactic and semantic features is selected for each sentence to build the rules and train the classifier. The rules are generated manually based on a set of lexical and emotional clues. We propose a new approach to extract the sentence's clauses and constitutive grammatical elements and to use them in syntactic and semantic feature generation. The system uses a novel method to measure the polarity of the sentence based on the extracted grammatical elements, reaching a precision of 41.79% with a recall of 55.03%, for an f-measure of 47.50%. The overall mean f-measure of all submissions was 48.75% with a standard deviation of 7%.</p>","PeriodicalId":88397,"journal":{"name":"Biomedical informatics insights","volume":"5 Suppl. 1","pages":"165-74"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4137/BII.S8981","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30824825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What's In a Note: Construction of a Suicide Note Corpus.","authors":"John P Pestian, Pawel Matykiewicz, Michelle Linn-Gust","doi":"10.4137/BII.S10213","DOIUrl":"https://doi.org/10.4137/BII.S10213","url":null,"abstract":"<p><p>This paper reports on the results of an initiative to create and annotate a corpus of suicide notes that can be used for machine learning. Ultimately, the corpus included 1,278 notes that were written by someone who died by suicide. Each note was reviewed by at least three annotators who mapped words or sentences to a schema of emotions. This corpus has already been used for extensive scientific research.</p>","PeriodicalId":88397,"journal":{"name":"Biomedical informatics insights","volume":"5 ","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4137/BII.S10213","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"31065737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introductory editorial.","authors":"John P Pestian","doi":"10.4137/BII.S9297","DOIUrl":"https://doi.org/10.4137/BII.S9297","url":null,"abstract":"In this special issue of Biomedical Informatics Insights we present the results of a shared task dedicated to finding emotions in suicide notes with machine learning tools. Shared tasks are not new, but conducting this type of sentiment analysis with this amount of data is. A total of 1278 notes that were written by people just prior to dying by suicide were annotated by 160 vested volunteers. Each note was read by three different volunteers and then annotated based on an emotional schema that included: abuse, anger, blame, fear, guilt, hopelessness, sorrow, forgiveness, happiness, peacefulness, hopefulness, love, pride, thankfulness, instructions, and information. These annotated notes formed the corpus required by the machine learning methods. Twenty-four teams agreed to analyze these data and then submit a manuscript for review. The systems with the highest precision and recall were submitted by: Open University in Milton Keynes, UK; Microsoft Research Asia in Beijing, P.R. China; and Mayo Clinic in Rochester, MN, USA. Each of these groups received a travel stipend provided by Diamond Healthcare, Richmond, VA, USA. Each manuscript was blindly reviewed by three reviewers whose results formed the decision to publish. This is a somewhat different review process for Biomedical Informatics Insights because we used the participants to blindly review each other's manuscripts rather than calling upon the pool of reviewers. The best paper was A Hybrid Model for Automatic Emotion Recognition in Suicide Notes by Hui Yang, Alistair Willis, Anne de Roeck and Bashar Nuseibeh of Open University. The full article, along with all the other articles, can be found in Biomedical Informatics Insights. A shared task of this magnitude does not happen by chance. Rather, it is the tenacity of the steering committee, the vested volunteers, and the staff that made this important activity occur. From it we have learned a great deal about sentiment analysis and the limitations of the data. I invite you to read the articles about this shared task and I encourage you to learn as much as we have.","PeriodicalId":88397,"journal":{"name":"Biomedical informatics insights","volume":"5 Suppl. 1","pages":"1 - 1"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4137/BII.S9297","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30823723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Suicide note sentiment classification: a supervised approach augmented by web data.","authors":"Yan Xu, Yue Wang, Jiahua Liu, Zhuowen Tu, Jian-Tao Sun, Junichi Tsujii, Eric Chang","doi":"10.4137/BII.S8956","DOIUrl":"https://doi.org/10.4137/BII.S8956","url":null,"abstract":"<p><strong>Objective: </strong>To create a sentiment classification system for the Fifth i2b2/VA Challenge Track 2, which can identify thirteen subjective categories and two objective categories.</p><p><strong>Design: </strong>We developed a hybrid system using Support Vector Machine (SVM) classifiers with augmented training data from the Internet. Our system consists of three types of classification-based systems: the first system uses spanning n-gram features for subjective categories, the second one uses bag-of-n-gram features for objective categories, and the third one uses pattern matching for infrequent or subtle emotion categories. The spanning n-gram features are selected by a feature selection algorithm that leverages an emotional corpus from weblogs. Special normalization of objective sentences is generalized with shallow parsing and external web knowledge. We utilize three sources of web data: the weblog of LiveJournal, which helps to improve the feature selection; the eBay List, which assists in special normalization of the information and instructions categories; and the suicide project web, which provides unlabeled data with properties similar to those of suicide notes.</p><p><strong>Measurements: </strong>The performance is evaluated by the overall micro-averaged precision, recall and F-measure.</p><p><strong>Result: </strong>Our system achieved an overall micro-averaged F-measure of 0.59. Happiness_peacefulness had the highest F-measure of 0.81. We were ranked as the second best out of 26 competing teams.</p><p><strong>Conclusion: </strong>Our results indicated that classifying fine-grained sentiments at sentence level is a non-trivial task. It is effective to divide categories into different groups according to their semantic properties. In addition, our system performance benefits from external knowledge extracted from publicly available web data collected for other purposes; performance can be further enhanced when more training data is available.</p>","PeriodicalId":88397,"journal":{"name":"Biomedical informatics insights","volume":"5 Suppl. 1","pages":"31-41"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4137/BII.S8956","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30823725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hybrid approach to sentiment sentence classification in suicide notes.","authors":"Sunghwan Sohn, Manabu Torii, Dingcheng Li, Kavishwar Wagholikar, Stephen Wu, Hongfang Liu","doi":"10.4137/BII.S8961","DOIUrl":"https://doi.org/10.4137/BII.S8961","url":null,"abstract":"<p><p>This paper describes the sentiment classification system developed by the Mayo Clinic team for the 2011 I2B2/VA/Cincinnati Natural Language Processing (NLP) Challenge. The sentiment classification task is to assign any pertinent emotion to each sentence in suicide notes. We have implemented three systems that have been trained on suicide notes provided by the I2B2 challenge organizer: a machine learning system, a rule-based system, and a system consisting of a combination of both. Our machine learning system was trained on re-annotated data in which apparently inconsistent emotion assignment was adjusted. Then, the machine learning methods using RIPPER and multinomial Naïve Bayes classifiers, manual pattern matching rules, and the combination of the two systems were tested to determine the emotions within sentences. The combination of the machine learning and rule-based system performed best and produced a micro-average F-score of 0.5640.</p>","PeriodicalId":88397,"journal":{"name":"Biomedical informatics insights","volume":"5 Suppl. 1","pages":"43-50"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.4137/BII.S8961","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30824233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}