A Conceptual Framework for the Mining and Analysis of the Social Media Data
S. R. Joseph, Keletso J. Letsholo, H. Hlomani
International Journal of Database Theory and Application, pp. 11-34, published 2017-10-31. DOI: 10.14257/ijdta.2017.10.10.02
Social media data possess the characteristics of Big Data, such as volume, veracity, velocity, variability and value. These characteristics make their analysis more challenging than that of conventional data. Manual analysis approaches cannot keep pace with the rate at which data is generated; processing data manually is also time-consuming and requires considerable effort compared with computational methods. However, computational analysis methods usually cannot capture in-depth meanings (semantics) within data. Taken individually, each approach is insufficient. As a solution, we propose a Conceptual Framework that integrates traditional and computational approaches to the mining and analysis of social media data. This allows us to leverage the strengths of traditional content analysis, with its meticulousness and relative understanding, whilst exploiting the extensive capacity of Big Data analytics and the accuracy of computational methods. The proposed Conceptual Framework was evaluated in two stages, using as an example case data on the political landscape of Botswana collected from Facebook and Twitter. Firstly, a user study was carried out through the Inductive Content Analysis (ICA) process using the collected data. Additionally, a questionnaire was administered to evaluate the usability of ICA as perceived by the participants. Secondly, an experimental study was conducted to evaluate the performance of data mining algorithms on the data from the ICA process. The results of the user study showed that the ICA process is flexible and systematic in allowing users to analyse social media data, hence reducing the time and effort required to analyse data manually. The users' perception of the ease of use and usefulness of ICA for analysing social media data is positive.
The results of the experimental study show that data mining algorithms produced more accurate results in classifying data when supplied with data from the ICA process. That is, when data mining algorithms are integrated with the ICA process, they are able to overcome the difficulty they face in capturing semantics within data. Overall, the results of this study, including the proposed Conceptual Framework, are useful to scholars and practitioners who wish to conduct research on social media data mining and analysis. The Framework serves as a guide to mining and analysing social media data in a systematic manner.