Idea Generation using Transformer Decoder Models
Musammet Rafia Karim, Siam Shibly Antar, Mohammad Ashrafuzzaman Khan
Proceedings of the 2022 5th International Conference on Algorithms, Computing and Artificial Intelligence, 23 December 2022. DOI: 10.1145/3579654.3579706
Citations: 1
Abstract
Our work aims to generate new ideas to explore in a specific domain using generative language models. For example, doctors can enter known symptoms as cues to the system, and the system will generate ideas based on those cues. Similar scenarios can be imagined for other scientific domains. We used transformer-based decoders, specifically GPT-3-style transformer decoders, as the language models and generators. As the data, we used the COVID-19 Open Research Dataset (CORD-19) [18]. We fine-tuned the GPT-NEO-125M and GPT-NEO-1.3B models, with 125 million and 1.3 billion parameters, respectively. The latter model generated more coherent text and was better at linking ideas relevant to the same problem. We report our findings here, with examples generated from our fine-tuned models.
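The abstract names the models (GPT-NEO-125M, GPT-NEO-1.3B) and the dataset (CORD-19) but does not include training details. The sketch below shows one plausible way to set up such a pipeline with the Hugging Face transformers and datasets libraries: causal-language-model fine-tuning on CORD-19 text, followed by sampling a continuation from a symptom cue. All hyperparameters, the cord19_abstracts.txt data file, and the example cue are illustrative assumptions, not details from the paper.

```python
# Hedged sketch: fine-tune GPT-Neo on CORD-19 text, then generate "ideas"
# from a symptom cue. Model names and dataset come from the paper; all
# hyperparameters, file paths, and the prompt are assumptions.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

MODEL_NAME = "EleutherAI/gpt-neo-125M"  # or "EleutherAI/gpt-neo-1.3B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # GPT-Neo defines no pad token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Assumed: CORD-19 abstracts exported to a plain-text file, one per line.
dataset = load_dataset("text", data_files={"train": "cord19_abstracts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gpt-neo-cord19",
        per_device_train_batch_size=2,
        num_train_epochs=1,        # illustrative values, not the paper's
        learning_rate=5e-5,
    ),
    train_dataset=tokenized["train"],
    # mlm=False gives standard causal-LM labels (inputs shifted by one)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Generation: feed known symptoms as a cue and sample a continuation.
cue = "Patients present with fever, dry cough, and loss of smell."
inputs = tokenizer(cue, return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Sampling (do_sample with top-p and temperature) rather than greedy decoding is one natural choice for idea generation, since the goal is diverse plausible continuations rather than a single most-likely one; the specific decoding settings used in the paper are not stated.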