UCLP: Unsupervised Classification of Key Aspects in Vulnerability Descriptions Through Label Profile

IF 1.8 · Zone 4, Computer Science · JCR Q3 · COMPUTER SCIENCE, SOFTWARE ENGINEERING
Linyi Han, Hang Li, Xiaowang Zhang, Youmeng Li, Zhiyong Feng
Journal: Journal of Software-Evolution and Process, vol. 37, no. 9
Published: 2025-09-13
DOI: 10.1002/smr.70052
URL: https://onlinelibrary.wiley.com/doi/10.1002/smr.70052
Citations: 0

Abstract

Textual vulnerability descriptions (TVDs) in repositories like NVD and IBM X-Force Exchange are essential for security engineers managing vulnerabilities. Engineers typically search for key aspects in TVDs using specific phrases, but with multiple expressions for each aspect, retrieving all relevant records is challenging. We propose a label-based retrieval framework that classifies key aspects and retrieves TVDs by their broader categories. Given the large data volume, manual labeling is infeasible, making unsupervised classification critical. However, short labels and repeated words diminish semantic clarity, affecting classification accuracy. We introduce Unsupervised Classification through Label Profile (UCLP), which expands label semantics through label profiles inspired by recommendation systems. We construct profiles using neural network weights and apply TF-IDF to calculate similarities, smoothing distributions with an arctangent function. Results show that UCLP significantly outperforms four benchmarks, raising accuracy from 68.3% to 78.9% and improving three real-world applications.
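The pipeline described above (label profiles, TF-IDF similarity, arctangent smoothing, label-based retrieval) can be illustrated with a minimal sketch. It is not the authors' implementation. The profiles here are hand-written word lists rather than profiles derived from neural network weights, and the way the arctangent is applied (squashing each cosine similarity before normalizing) is an assumption, since the abstract does not give the formula. The names `LABEL_PROFILES`, `classify`, `retrieve_by_label`, and the `scale` parameter are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical label profiles: each short label is expanded into a bag of
# related words. In the paper the profiles are built from neural network
# weights; these hand-written lists are placeholders for illustration.
LABEL_PROFILES = {
    "remote attacker": "remote attacker network unauthenticated external user",
    "denial of service": "denial service crash hang memory exhaustion consumption",
    "code execution": "execute arbitrary code command injection shell",
}


def tokenize(text):
    """Lowercase whitespace tokenizer (deliberately simplistic)."""
    return text.lower().split()


def build_idf(documents):
    """Smoothed inverse document frequency over the profile texts."""
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector; terms unseen in the profiles are dropped."""
    counts = Counter(tokenize(text))
    return {term: tf * idf[term] for term, tf in counts.items() if term in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(phrase, profiles, scale=5.0):
    """Return (best_label, distribution) for a key-aspect phrase.

    Assumption: arctan(scale * similarity) is used as the smoothing step.
    It bounds every score below pi/2 and compresses large similarities.
    """
    idf = build_idf(list(profiles.values()))
    query = tfidf_vector(phrase, idf)
    scores = {
        label: math.atan(scale * cosine(query, tfidf_vector(profile, idf)))
        for label, profile in profiles.items()
    }
    total = sum(scores.values())
    if total == 0.0:
        return None, scores  # no overlap with any profile
    dist = {label: score / total for label, score in scores.items()}
    return max(dist, key=dist.get), dist


def retrieve_by_label(phrases, label, profiles):
    """Label-based retrieval: keep the phrases classified under `label`."""
    return [p for p in phrases if classify(p, profiles)[0] == label]


if __name__ == "__main__":
    phrases = [
        "crash via memory consumption",
        "execute arbitrary code",
        "remote attacker",
    ]
    print(classify(phrases[0], LABEL_PROFILES)[0])
    print(retrieve_by_label(phrases, "denial of service", LABEL_PROFILES))
```

In this toy setup a query such as "crash via memory consumption" matches the "denial of service" label even though it shares no word with the label itself. That is the motivation the abstract gives for expanding short labels into profiles.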

Source journal: Journal of Software-Evolution and Process (COMPUTER SCIENCE, SOFTWARE ENGINEERING)
Self-citation rate: 10.00%
Publication volume: 109