TForMIX: A Method That Combines LLM and Multidimensional Modeling for Technological Foresight

Impact factor: 3.6 · Zone 3 (Computer Science) · JCR Q2 (Computer Science, Information Systems)
Giselle F. Rosa;Jones O. Avelino;Maria Claudia Cavalcanti;Julio Cesar Duarte
DOI: 10.1109/ACCESS.2025.3605116
Journal: IEEE Access, vol. 13, pp. 153320-153339
Publication date: 2025-09-02
Publication type: Journal Article
Full text: https://ieeexplore.ieee.org/document/11146655/
Citations: 0

Abstract

Technical documents, such as scientific papers and patents, are widely used as a basis for Technological Foresight (TF) processes. Typically, these analyses require identifying elements (e.g., terms) in the text of these documents that are relevant to the scientific-technological domain under investigation. Information Extraction (IE) and Natural Language Processing (NLP) techniques help automate the identification of these elements, which is essential in TF processes that usually involve analyzing a corpus of hundreds (and sometimes thousands) of documents. An analytical view over this corpus, based on the occurrence of those relevant elements, helps prioritize document analysis and, consequently, accelerates the whole TF process. However, building a system that provides such analytical insight is expensive, and a new system would have to be built for each domain-specific TF process. Thus, there is a need for viable solutions for analytically exploring a corpus according to the specific requirements of each domain. This work presents Technological Foresight with Multidimensional Information eXtraction (TForMIX), a novel method for building Decision Support Systems (DSSs) that applies Named Entity Recognition (NER) and Relation Extraction (RE) while allowing multidimensional analytical exploration of entities and relations together with bibliometric data from documents. TForMIX is a flexible method that can be applied to different domains and speeds up building a DSS for each domain. Additionally, we evaluate the applicability of the produced DSSs in TF processes through a practical experiment, which demonstrates that DSSs generated with the method and supported by IE techniques can significantly contribute to the conduct of TF analyses. The combination of the theories used, the innovative methods, and the proposed practical validation highlights the quality of the analysis in this study and offers the potential for valuable insights and contributions to the TF process.
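
To make "multidimensional analytical exploration of entities together with bibliometric data" more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that NER output has already been joined with bibliometric fields into flat records; the `Mention` record, the `roll_up` helper, and the sample entities are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One extracted entity mention plus bibliometric fields (hypothetical schema)."""
    doc_id: str
    year: int
    doc_type: str
    entity: str
    entity_type: str


# Invented sample records standing in for NER output; not data from the paper.
MENTIONS = [
    Mention("d1", 2021, "paper", "graphene", "MATERIAL"),
    Mention("d1", 2021, "paper", "supercapacitor", "DEVICE"),
    Mention("d2", 2022, "patent", "graphene", "MATERIAL"),
    Mention("d3", 2022, "paper", "graphene", "MATERIAL"),
    Mention("d3", 2022, "paper", "graphene", "MATERIAL"),  # repeated mention in d3
]


def roll_up(mentions, *dims):
    """Count distinct documents for each combination of the chosen dimensions."""
    # Deduplicate (dimension values, document) pairs so repeated mentions
    # inside one document are counted once.
    seen = {(tuple(getattr(m, d) for d in dims), m.doc_id) for m in mentions}
    return Counter(key for key, _ in seen)


by_entity_year = roll_up(MENTIONS, "entity", "year")
by_type = roll_up(MENTIONS, "doc_type")
```

Here `by_entity_year` plays the role of a slice over the entity and time dimensions, while `by_type` rolls everything up to the document-type dimension. Counting documents rather than raw mentions is a design choice of this sketch; a real DSS might expose both measures.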
Source journal
IEEE Access
Categories: Computer Science, Information Systems; Engineering, Electrical & Electronic
CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 6673
Review time: 6 weeks
Journal description: IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest. IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on: Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals. Practical articles discussing new experiments or measurement techniques, interesting solutions to engineering. Development of new or improved fabrication or manufacturing techniques. Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.