Performance Evaluation of Text Document Using Machine Learning Models for Information Retrieval

2023 International Conference on Disruptive Technologies (ICDT), Pub Date: 2023-05-11, DOI: 10.1109/ICDT57929.2023.10150858

Subhasish Chowdhury, Suresh Kumar


Citations: 0

Abstract

Text mining is thought to have high commercial potential because of the large volume of unstructured text data produced on the Internet. Text mining is the practice of extracting previously undiscovered, comprehensible, and potentially useful patterns or knowledge from a corpus of text data. In this study, we attempt to extract structured information from text and then use various machine-learning models to categorize the data. We then identify the model that provides the highest classification accuracy.
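The workflow described in the abstract (derive structured features from raw text, train several classifiers, keep the most accurate one) can be sketched as follows. This is only an illustrative sketch: the abstract does not name the features, models, or dataset used, so the bag-of-words representation, the two classifiers (multinomial Naive Bayes and a cosine nearest-centroid rule), the toy spam/ham data, and every identifier below are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Turn raw text into a structured bag-of-words (lowercased token counts)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(n * b[w] for w, n in a.items())
    norm = math.sqrt(sum(n * n for n in a.values())) * math.sqrt(
        sum(n * n for n in b.values())
    )
    return dot / norm if norm else 0.0


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed model)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.class_counts[label] / total_docs)
            for word, n in doc.items():
                if word in self.vocab:  # ignore words never seen in training
                    score += n * math.log((counts[word] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


class NearestCentroid:
    """Assigns the class whose summed word counts are most cosine-similar."""

    def fit(self, docs, labels):
        self.centroids = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.centroids[label].update(doc)
        return self

    def predict(self, doc):
        return max(sorted(self.centroids), key=lambda c: cosine(doc, self.centroids[c]))


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    hits = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return hits / len(labels)


# Toy data, invented for illustration only.
train = [
    ("cheap loans win money now", "spam"),
    ("win a free prize today", "spam"),
    ("meeting agenda for project review", "ham"),
    ("please review the project report", "ham"),
]
test = [("free money prize", "spam"), ("project meeting today", "ham")]

train_docs = [tokenize(t) for t, _ in train]
train_labels = [y for _, y in train]
test_docs = [tokenize(t) for t, _ in test]
test_labels = [y for _, y in test]

models = {"naive_bayes": NaiveBayes(), "nearest_centroid": NearestCentroid()}
results = {
    name: accuracy(m.fit(train_docs, train_labels), test_docs, test_labels)
    for name, m in models.items()
}
best_model = max(results, key=results.get)  # ties go to the first model listed
print(results, "best:", best_model)
```

On a real corpus, accuracy should be measured on a held-out split that is much larger than this toy set, and the choice of features will likely matter as much as the choice of model.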

