Auto-summarization of the texts of construction dispute precedents

IF 8.0 | Zone 1 (Engineering and Technology) | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Wonkyoung Seo, Youngcheol Kang
Advanced Engineering Informatics, Volume 65, Article 103381. Published 2025-04-22. Journal article.
DOI: 10.1016/j.aei.2025.103381
URL: https://www.sciencedirect.com/science/article/pii/S1474034625002745
Citations: 0

Abstract

Advancements in text analysis are driving the adoption of document automation in the construction industry. Despite significant financial losses from construction disputes, efforts to automate document processes in this domain remain limited. Effective dispute management requires the rapid identification of relevant precedent cases to help practitioners respond appropriately. However, the complexity and length of such texts pose challenges to quick comprehension. This study presents a natural language processing (NLP) model for automatically summarizing construction dispute case texts. The model was tested on 300 U.S. construction dispute cases sourced from the Westlaw database. Various NLP models, including large language models (LLMs) such as OpenAI’s models and BERT, were evaluated, achieving an F-score of approximately 0.39 based on the ROUGE-L metric. To accomplish the domain-specific objective of summarizing construction precedent cases, this study explored multiple approaches, including data preprocessing, fine-tuning, and model engineering using LangChain. Furthermore, this study aims to develop models for summarizing legal precedent texts and investigates methods to capture the distinctive characteristics of construction dispute data compared to general legal texts. The models were validated through domain experts who recognize the unique nature of construction disputes, enhancing the reliability of the evaluation process. The findings contribute significantly to the automation of construction dispute document summarization, enabling practitioners to manage such cases more efficiently.
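The abstract reports its result as an F-score under the ROUGE-L metric. As a rough illustration of what that number measures, the sketch below computes a sentence-level ROUGE-L F1 from the longest common subsequence (LCS) of two token lists. It is not the authors' evaluation code: the function names `lcs_length` and `rouge_l_f1`, the lowercase whitespace tokenization, and the balanced F1 weighting are all assumptions made for this example, and the sample sentences are invented.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best subsequence that ended before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbours.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Sentence-level ROUGE-L F1 (assumed variant: lowercase, whitespace tokens)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of the candidate that is in the LCS
    recall = lcs / len(ref)      # share of the reference that is in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical model summary and reference summary, for illustration only.
    summary = "the court found delay"
    reference = "the court found that the delay was excusable"
    print(round(rouge_l_f1(summary, reference), 3))
```

Here every word of the short summary appears in order in the reference, so precision is 1.0 while recall is 0.5. That shows how a concise summary can be fully faithful to the reference and still receive a moderate F-score.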
Source journal
Advanced Engineering Informatics (Engineering and Technology: Engineering, Multidisciplinary)
CiteScore: 12.40
Self-citation rate: 18.20%
Articles published: 292
Review time: 45 days
About the journal: Advanced Engineering Informatics is an international journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus and INSPEC.