BioFinBERT: Finetuning Large Language Models (LLMs) to Analyze Sentiment of Press Releases and Financial Text Around Inflection Points of Biotech Stocks
Valentina Aparicio, Daniel Gordon, Sebastian G. Huayamares, Yuhuai Luo
arXiv - QuantFin - Trading and Market Microstructure, published 2024-01-19. DOI: arxiv-2401.11011 (https://doi.org/arxiv-2401.11011)
Abstract
Large language models (LLMs) are deep learning algorithms being used to
perform natural language processing tasks in various fields, from social
sciences to finance and biomedical sciences. Developing and training a new LLM
can be very computationally expensive, so it is becoming a common practice to
take existing LLMs and finetune them with carefully curated datasets for
desired applications in different fields. Here, we present BioFinBERT, a
finetuned LLM to perform financial sentiment analysis of public text associated
with stocks of companies in the biotechnology sector. The stocks of biotech
companies developing highly innovative and risky therapeutic drugs tend to
rise or fall sharply after a successful or failed clinical
readout or regulatory approval of their drug, respectively. These clinical or
regulatory results are disclosed by the biotech companies via press releases,
which are followed by a significant stock response in many cases. In our
attempt to design an LLM capable of analyzing the sentiment of these press
releases, we first finetuned BioBERT, a biomedical language representation model
designed for biomedical text mining, using financial textual databases. Our
finetuned model, termed BioFinBERT, was then used to perform financial
sentiment analysis of various biotech-related press releases and financial text
around inflection points that significantly affected the price of biotech
stocks.
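
As a rough illustration of the final scoring step only, the sketch below shows one way the raw outputs (logits) of a finetuned sentiment classifier might be turned into a label and a signed polarity score. It is not the authors' implementation. The three-class label set, the `score_text` helper, the polarity definition, and the example logits are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math

# Hypothetical label order for a three-class sentiment head. The abstract
# does not state the label set, so this is an assumption.
LABELS = ("negative", "neutral", "positive")


def softmax(logits):
    """Convert raw classifier logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_text(logits):
    """Return (label, polarity) for one text's logits.

    polarity = P(positive) - P(negative), a value in [-1, 1].
    This definition is an illustrative choice, not taken from the paper.
    """
    probs = softmax(logits)
    label = LABELS[probs.index(max(probs))]
    polarity = probs[2] - probs[0]
    return label, polarity


# Made-up logits standing in for model output on two press-release snippets.
examples = {
    "trial met primary endpoint": [-1.2, 0.3, 2.5],
    "trial failed to meet primary endpoint": [2.1, 0.4, -1.0],
}
results = {text: score_text(logits) for text, logits in examples.items()}
```

In practice the logits would come from the finetuned model rather than being hard-coded, and a signed polarity of this kind could then be compared against the stock's move around the corresponding press release.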