Bill Gates Happi Happi, Geraud Fokou Pelap, Danai Symeonidou, Pierre Larmande
{"title":"GRU-SCANET:释放基于gru的正弦捕获网络的力量,用于精确驱动的命名实体识别。","authors":"Bill Gates Happi Happi, Geraud Fokou Pelap, Danai Symeonidou, Pierre Larmande","doi":"10.1093/bioadv/vbaf096","DOIUrl":null,"url":null,"abstract":"<p><strong>Motivation: </strong>Pre-trained Language Models (PLMs) have achieved remarkable performance across various natural language processing tasks. However, they encounter challenges in biomedical named entity recognition (NER), such as high computational costs and the need for complex fine-tuning. These limitations hinder the efficient recognition of biological entities, especially within specialized corpora. To address these issues, we introduce GRU-SCANET (Gated Recurrent Unit-based Sinusoidal Capture Network), a novel architecture that directly models the relationship between input tokens and entity classes. Our approach offers a computationally efficient alternative for extracting biological entities by capturing contextual dependencies within biomedical texts.</p><p><strong>Results: </strong>GRU-SCANET combines positional encoding, bidirectional GRUs (BiGRUs), an attention-based encoder, and a conditional random field (CRF) decoder to achieve high precision in entity labeling. This design effectively mitigates the challenges posed by unbalanced data across multiple corpora. Our model consistently outperforms leading benchmarks, achieving better performance than BioBERT (8/8 evaluations), PubMedBERT (5/5 evaluations), and the previous state-of-the-art (SOTA) models (8/8 evaluations), including Bern2 (5/5 evaluations). These results highlight the strength of our approach in capturing token-entity relationships more effectively than existing methods, advancing the state of biomedicalNER.</p><p><strong>Availability and implementation: </strong>https://github.com/ANR-DIG-AI/GRU-SCANET.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"5 1","pages":"vbaf096"},"PeriodicalIF":2.8000,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12198495/pdf/","citationCount":"0","resultStr":"{\"title\":\"GRU-SCANET: unleashing the power of GRU-based sinusoidal capture network for precision-driven named entity recognition.\",\"authors\":\"Bill Gates Happi Happi, Geraud Fokou Pelap, Danai Symeonidou, Pierre Larmande\",\"doi\":\"10.1093/bioadv/vbaf096\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Motivation: </strong>Pre-trained Language Models (PLMs) have achieved remarkable performance across various natural language processing tasks. However, they encounter challenges in biomedical named entity recognition (NER), such as high computational costs and the need for complex fine-tuning. These limitations hinder the efficient recognition of biological entities, especially within specialized corpora. To address these issues, we introduce GRU-SCANET (Gated Recurrent Unit-based Sinusoidal Capture Network), a novel architecture that directly models the relationship between input tokens and entity classes. Our approach offers a computationally efficient alternative for extracting biological entities by capturing contextual dependencies within biomedical texts.</p><p><strong>Results: </strong>GRU-SCANET combines positional encoding, bidirectional GRUs (BiGRUs), an attention-based encoder, and a conditional random field (CRF) decoder to achieve high precision in entity labeling. This design effectively mitigates the challenges posed by unbalanced data across multiple corpora. Our model consistently outperforms leading benchmarks, achieving better performance than BioBERT (8/8 evaluations), PubMedBERT (5/5 evaluations), and the previous state-of-the-art (SOTA) models (8/8 evaluations), including Bern2 (5/5 evaluations). These results highlight the strength of our approach in capturing token-entity relationships more effectively than existing methods, advancing the state of biomedicalNER.</p><p><strong>Availability and implementation: </strong>https://github.com/ANR-DIG-AI/GRU-SCANET.</p>\",\"PeriodicalId\":72368,\"journal\":{\"name\":\"Bioinformatics advances\",\"volume\":\"5 1\",\"pages\":\"vbaf096\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2025-06-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12198495/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Bioinformatics advances\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/bioadv/vbaf096\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q2\",\"JCRName\":\"MATHEMATICAL & COMPUTATIONAL BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioinformatics advances","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/bioadv/vbaf096","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
GRU-SCANET: unleashing the power of GRU-based sinusoidal capture network for precision-driven named entity recognition.
Motivation: Pre-trained Language Models (PLMs) have achieved remarkable performance across various natural language processing tasks. However, they encounter challenges in biomedical named entity recognition (NER), such as high computational costs and the need for complex fine-tuning. These limitations hinder the efficient recognition of biological entities, especially within specialized corpora. To address these issues, we introduce GRU-SCANET (Gated Recurrent Unit-based Sinusoidal Capture Network), a novel architecture that directly models the relationship between input tokens and entity classes. Our approach offers a computationally efficient alternative for extracting biological entities by capturing contextual dependencies within biomedical texts.
Results: GRU-SCANET combines positional encoding, bidirectional GRUs (BiGRUs), an attention-based encoder, and a conditional random field (CRF) decoder to achieve high precision in entity labeling. This design effectively mitigates the challenges posed by unbalanced data across multiple corpora. Our model consistently outperforms leading benchmarks, achieving better performance than BioBERT (8/8 evaluations), PubMedBERT (5/5 evaluations), and the previous state-of-the-art (SOTA) models (8/8 evaluations), including Bern2 (5/5 evaluations). These results highlight the strength of our approach in capturing token-entity relationships more effectively than existing methods, advancing the state of biomedicalNER.
Availability and implementation: https://github.com/ANR-DIG-AI/GRU-SCANET.