Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info... Latest publications
{"title":"How important is domain-specific language model pretraining and instruction finetuning for biomedical relation extraction?","authors":"Aviv Brokman, Ramakanth Kavuluru","doi":"10.1007/978-3-031-97141-9_6","DOIUrl":"https://doi.org/10.1007/978-3-031-97141-9_6","url":null,"abstract":"Major technical advances in the general NLP domain are often subsequently applied to the high-value, data-rich biomedical domain. The past few years have seen generative language models (LMs), instruction finetuning, and few-shot learning become foci of NLP research. As such, generative LMs pretrained on biomedical corpora have proliferated and biomedical instruction finetuning has been attempted as well, all with the hope that domain specificity improves performance on downstream tasks. Given the nontrivial effort in training such models, we investigate what, if any, benefits they have in the key biomedical NLP task of relation extraction. Specifically, we address two questions: (1) Do LMs trained on biomedical corpora outperform those trained on general domain corpora? (2) Do models instruction finetuned on biomedical datasets outperform those finetuned on assorted datasets or those simply pretrained? We tackle these questions using existing LMs, testing across four datasets. In a surprising result, general-domain models typically outperformed biomedical-domain models. However, biomedical instruction finetuning improved performance to a similar degree as general instruction finetuning, despite having orders of magnitude fewer instructions. Our findings suggest it may be more fruitful to focus research effort on larger-scale biomedical instruction finetuning of general LMs over building domain-specific biomedical LMs.","PeriodicalId":92107,"journal":{"name":"Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info...","volume":"15836 ","pages":"80-94"},"PeriodicalIF":0.0,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12367199/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144981967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of pipelines, seq2seq models, and LLMs for rare disease information extraction.","authors":"Shashank Gupta, Xuguang Ai, Yuhang Jiang, Ramakanth Kavuluru","doi":"10.1007/978-3-031-97141-9_4","DOIUrl":"https://doi.org/10.1007/978-3-031-97141-9_4","url":null,"abstract":"End-to-end relation extraction (E2ERE) is an important application of natural language processing (NLP) in biomedicine. The extracted relations populate knowledge graphs and drive more high level applications in knowledge discovery and information retrieval. E2ERE is frequently handled at the sentence level involving continuous entities. A more complex setting is document level E2ERE with discontinuous and overlapping/nested entities. We identified a recently introduced RE dataset for rare diseases (RareDis) that has these complex traits. Among current E2ERE methods, we see three well-known paradigms: (1) pipeline based approaches where a named entity recognition (NER) model's output is input to a relation classification (RC) model; (2) joint sequence-to-sequence style models where the raw input text is directly transformed into relations through linearization schemas; and (3) generative large language models (LLMs), where prompts, fine-tuning, and in-context learning are being leveraged for RE. While LLMs are becoming popular because of tools such as ChatGPT, the biomedical NLP community needs to carefully evaluate which paradigm is more suitable for E2ERE. In this effort, using the RareDis dataset as a complex use-case, we evaluate the best representative models from each of the three paradigms for E2ERE. Our findings reveal that pipeline models are still the best, while sequence-to-sequence models are not far behind. We verify these findings on a second E2ERE dataset for chemical-protein interactions. Although LLMs are more suitable for zero-shot settings, our results show that it is better to work with more conventional models trained and tailored for E2ERE when training data is available. Our contribution is also the first to conduct E2ERE for the RareDis dataset.","PeriodicalId":92107,"journal":{"name":"Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info...","volume":"15836 ","pages":"49-63"},"PeriodicalIF":0.0,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12367198/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144982006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Natural Language Processing and Information Systems: 28th International Conference on Applications of Natural Language to Information Systems, NLDB 2023, Derby, UK, June 21–23, 2023, Proceedings","authors":"","doi":"10.1007/978-3-031-35320-8","DOIUrl":"https://doi.org/10.1007/978-3-031-35320-8","url":null,"abstract":"","PeriodicalId":92107,"journal":{"name":"Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info...","volume":"64 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90355689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Natural Language Processing and Information Systems: 27th International Conference on Applications of Natural Language to Information Systems, NLDB 2022, Valencia, Spain, June 15–17, 2022, Proceedings","authors":"","doi":"10.1007/978-3-031-08473-7","DOIUrl":"https://doi.org/10.1007/978-3-031-08473-7","url":null,"abstract":"","PeriodicalId":92107,"journal":{"name":"Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info...","volume":"51 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85066102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Natural Language Processing and Information Systems: 26th International Conference on Applications of Natural Language to Information Systems, NLDB 2021, Saarbrücken, Germany, June 23–25, 2021, Proceedings","authors":"","doi":"10.1007/978-3-030-80599-9","DOIUrl":"https://doi.org/10.1007/978-3-030-80599-9","url":null,"abstract":"","PeriodicalId":92107,"journal":{"name":"Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info...","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85277237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Natural Language Processing and Information Systems: 25th International Conference on Applications of Natural Language to Information Systems, NLDB 2020, Saarbrücken, Germany, June 24–26, 2020, Proceedings","authors":"Elisabeth Métais, F. Meziane, H. Horacek, P. Cimiano","doi":"10.1007/978-3-030-51310-8","DOIUrl":"https://doi.org/10.1007/978-3-030-51310-8","url":null,"abstract":"","PeriodicalId":92107,"journal":{"name":"Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info...","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88936232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human-in-the-Loop Conversation Agent for Customer Service","authors":"Peteris Paikens, Arturs Znotins, Guntis Barzdins","doi":"10.1007/978-3-030-51310-8_25","DOIUrl":"https://doi.org/10.1007/978-3-030-51310-8_25","url":null,"abstract":"","PeriodicalId":92107,"journal":{"name":"Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info...","volume":"8 1","pages":"277 - 284"},"PeriodicalIF":0.0,"publicationDate":"2020-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82744777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Natural Language Generation Using Transformer Network in an Open-Domain Setting","authors":"Deeksha Varshney, Asif Ekbal, Ganesh Nagaraja, Mrigank Tiwari, A. Gopinath, P. Bhattacharyya","doi":"10.1007/978-3-030-51310-8_8","DOIUrl":"https://doi.org/10.1007/978-3-030-51310-8_8","url":null,"abstract":"","PeriodicalId":92107,"journal":{"name":"Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info...","volume":"51 1","pages":"82 - 93"},"PeriodicalIF":0.0,"publicationDate":"2020-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73003522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving the Community Question Retrieval Performance Using Attention-Based Siamese LSTM","authors":"Nouha Othman, R. Faiz, K. Smaïli","doi":"10.1007/978-3-030-51310-8_23","DOIUrl":"https://doi.org/10.1007/978-3-030-51310-8_23","url":null,"abstract":"","PeriodicalId":92107,"journal":{"name":"Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info...","volume":"91 1","pages":"252 - 263"},"PeriodicalIF":0.0,"publicationDate":"2020-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90764043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Studying Attention Models in Sentiment Attitude Extraction Task","authors":"Nicolay Rusnachenko, Natalia V. Loukachevitch","doi":"10.1007/978-3-030-51310-8_15","DOIUrl":"https://doi.org/10.1007/978-3-030-51310-8_15","url":null,"abstract":"","PeriodicalId":92107,"journal":{"name":"Natural language processing and information systems : ... International Conference on Applications of Natural Language to Information Systems, NLDB ... revised papers. International Conference on Applications of Natural Language to Info...","volume":"128 1","pages":"157 - 169"},"PeriodicalIF":0.0,"publicationDate":"2020-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77537958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}