Analyzing unstructured data: text analytics in JMP
Volker Kraft
Teaching Statistics in a Data Rich World IASE Satellite Conference, published 2017-12-30
DOI: https://doi.org/10.52041/srap.17204
Abstract
As much as 80% of all data is unstructured, yet it still contains exploitable information. Unstructured text data can come, for example, from comment fields in surveys or from incident reports, and exploring this text helps reveal the information it contains. Text mining, which transforms free text into numerical summaries, can pave the way for new findings. This example of the new text mining feature in JMP starts with a multi-step text preparation using techniques such as tokenizing and stemming. This data curation is pivotal for the subsequent analysis phase, which explores data clusters and semantics. Finally, combining text mining results with other structured data takes familiar multivariate analysis and predictive modeling to the next level.
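
The preparation steps named in the abstract (tokenizing, stemming, and turning free text into numerical summaries) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not JMP's implementation: the sample comments, the stop-word list, and the crude suffix-stripping `stem` function are all hypothetical and stand in for whatever a real text mining tool would use.

```python
import re
from collections import Counter

# Hypothetical survey/incident comments; not taken from the paper.
COMMENTS = [
    "The pump was leaking and the valve failed.",
    "Valves leak when pumps fail repeatedly.",
    "Customer reported a billing error.",
]

# Hypothetical stop-word list, kept tiny for the example.
STOP_WORDS = {"the", "and", "a", "was", "when"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(token):
    """Crude suffix stripping; an assumed stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def prepare(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def document_term_matrix(docs):
    """Return (vocabulary, matrix) with one row of term counts per document."""
    counts = [Counter(prepare(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


if __name__ == "__main__":
    vocab, matrix = document_term_matrix(COMMENTS)
    print(vocab)
    for row in matrix:
        print(row)
```

Each row of the resulting matrix is a numerical summary of one comment, so it can sit alongside structured columns from the same survey or report and feed clustering or predictive models. In this toy data the first two comments share the stemmed terms for pump, leak, valve, and fail, while the third shares none with them.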