Shun Koyabu, Riku Kyogoku, T. Ohkawa. 2012 IEEE 12th International Conference on Bioinformatics & Bioengineering (BIBE), 2012-11-11. DOI: 10.1109/BIBE.2012.6399705 (https://doi.org/10.1109/BIBE.2012.6399705)
Method of extracting sentences about protein interaction from the literature on protein structure analysis using selective transfer learning
With progress in the structural analysis of proteins, many studies have addressed extracting protein interaction information from the literature. Machine learning is useful for automating this extraction. Learning generally relies on linguistic features obtained directly from the text, but a non-linguistic feature, such as the atomic distance calculated from protein structure data, is often very effective for learning and classification. We call this type of feature a “key feature” in this study. Machine learning requires enough training instances to train the classifier, and preparing them is often costly. Transfer learning is one of the better approaches in such a situation. However, it is difficult to apply a simple transfer learning algorithm to a task in which the key feature cannot be prepared in the source domain. In this study, we propose a new transfer learning method called STEK (Selective Transfer learning based on Effectiveness of a Key feature). The method focuses on the effectiveness of the key feature and divides the set of instances into two categories: instances to which transfer learning is applied and instances for which transfer learning is avoided. Combined with the InstPrune algorithm, the proposed method showed consistently high precision, recall, and F-measure on average.
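The division of instances into two categories, and the metrics used to report results, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the effectiveness of the key feature is measured or which group receives transfer learning. The `Instance` class, the `is_effective` predicate, the 4.0 threshold, and the routing direction (instances whose key feature looks effective skip transfer learning) are all assumptions made for illustration, and the sentences are toy data.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Instance:
    text: str                     # candidate sentence from the literature
    key_feature: Optional[float]  # e.g. an atomic distance; None if unavailable


def split_by_key_feature(
    instances: Sequence[Instance],
    is_effective: Callable[[float], bool],
) -> Tuple[List[Instance], List[Instance]]:
    """Split instances into (use_transfer, avoid_transfer).

    Assumed routing: an instance whose key feature is present and judged
    effective avoids transfer learning; every other instance uses it.
    """
    use_transfer: List[Instance] = []
    avoid_transfer: List[Instance] = []
    for inst in instances:
        if inst.key_feature is not None and is_effective(inst.key_feature):
            avoid_transfer.append(inst)
        else:
            use_transfer.append(inst)
    return use_transfer, avoid_transfer


def precision_recall_f(
    gold: Sequence[bool], pred: Sequence[bool]
) -> Tuple[float, float, float]:
    """Standard precision, recall, and F-measure (F1) for binary labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy data; the threshold below is a hypothetical effectiveness rule.
toy = [
    Instance("Residue A contacts residue B.", 3.2),
    Instance("The complex was crystallized at pH 7.", 12.5),
    Instance("Chain C binds the ligand.", None),
]
use_transfer, avoid_transfer = split_by_key_feature(toy, lambda d: d < 4.0)
```

In this sketch only the first toy sentence is routed away from transfer learning; the other two, whose key feature is missing or fails the hypothetical rule, fall into the transfer learning group. Each group would then be classified separately and the predictions pooled before computing precision, recall, and F-measure.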