Giselle F. Rosa;Jones O. Avelino;Maria Claudia Cavalcanti;Julio Cesar Duarte
Title: TForMIX: A Method That Combines LLM and Multidimensional Modeling for Technological Foresight
DOI: 10.1109/ACCESS.2025.3605116
Journal: IEEE Access, vol. 13, pp. 153320-153339
Published: 2025-09-02
Publication type: Journal Article
Impact factor: 3.6 (JCR Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS)
URL: https://ieeexplore.ieee.org/document/11146655/
Citations: 0
Abstract
Technical documents, such as scientific papers and patents, are widely used as a basis for Technological Foresight (TF) processes. Typically, these analyses require identifying elements (e.g., terms) in the textual content of these documents that are relevant to the scientific-technological domain under investigation. Information Extraction (IE) and Natural Language Processing (NLP) techniques are useful tools for automating the identification of these elements, which is essential in TF processes that usually involve analyzing a corpus of hundreds (and sometimes thousands) of documents. An analytical view of this corpus, based on the occurrence of those relevant elements, helps prioritize document analysis and, consequently, accelerates the whole TF process. However, building a system that provides such analytical insight is expensive. Moreover, a new system would have to be built for each domain-specific TF process. Thus, there is a need for viable solutions for analytically exploring a corpus according to the specific requirements of each domain. This work presents Technological Foresight with Multidimensional Information eXtraction (TForMIX), a novel method for building Decision Support Systems (DSSs) that applies Named Entity Recognition (NER) and Relation Extraction (RE) while allowing multidimensional analytical exploration of entities and relations together with bibliometric data from documents. TForMIX is a flexible method that can be applied to different domains and speeds up the building of DSSs for each domain. Additionally, we evaluate the applicability of the produced DSSs in TF processes through a practical experiment, which demonstrates that applying the method to generate DSSs, supported by IE techniques, can significantly contribute to the conduct of TF analyses. The combination of the theories used, innovative methods, and the proposed practical validation underscores the quality of the analysis in this study and offers potential for valuable insights and contributions to the TF process.
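To make "multidimensional analytical exploration" concrete, the following is a minimal sketch and not the authors' implementation. It assumes entity mentions have already been extracted by some NER step and stored as rows of a small fact table. The `Mention` fields, the entity-type labels, the helper names `roll_up` and `rank_documents`, and the toy data are all hypothetical. The sketch rolls mention counts up along chosen dimensions and ranks documents by how often a given entity type occurs, which is the kind of view the abstract says helps prioritize document analysis.

```python
from collections import Counter
from typing import NamedTuple


class Mention(NamedTuple):
    """One row of a hypothetical fact table: an entity found in a document."""
    doc_id: str       # bibliometric dimension: which document
    year: int         # bibliometric dimension: publication year
    entity_type: str  # entity dimension (labels here are made up)
    entity: str       # the extracted term itself


def roll_up(mentions, *dims):
    """Count mentions grouped by the chosen dimensions (an OLAP-style roll-up)."""
    return Counter(tuple(getattr(m, d) for d in dims) for m in mentions)


def rank_documents(mentions, entity_type):
    """Rank documents by how many mentions of one entity type they contain."""
    counts = Counter(m.doc_id for m in mentions if m.entity_type == entity_type)
    return counts.most_common()


# Toy data standing in for NER output; not taken from the paper.
MENTIONS = [
    Mention("D1", 2021, "MATERIAL", "graphene"),
    Mention("D1", 2021, "DEVICE", "supercapacitor"),
    Mention("D2", 2022, "MATERIAL", "graphene"),
    Mention("D2", 2022, "MATERIAL", "perovskite"),
    Mention("D3", 2022, "DEVICE", "solar cell"),
]

by_year_and_type = roll_up(MENTIONS, "year", "entity_type")
by_entity = roll_up(MENTIONS, "entity")
ranking = rank_documents(MENTIONS, "MATERIAL")
```

Here `by_year_and_type` answers questions such as "how many material mentions appeared in 2022", while `ranking` orders documents for review. A real DSS would presumably keep such data in a database or data warehouse rather than in memory; that is likewise an assumption of this sketch.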
IEEE Access
Categories: COMPUTER SCIENCE, INFORMATION SYSTEMS; ENGINEERING, ELECTRICAL & ELECTRONIC
CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 6673
Review time: 6 weeks
About the journal:
IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest.
IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on:
Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals.
Practical articles discussing new experiments or measurement techniques, or interesting solutions to engineering problems.
Development of new or improved fabrication or manufacturing techniques.
Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.