A Quantitative Social Network Analysis of the Character Relationships in the Mahabharata
Authors: Eren Gultepe, Vivek Mathangi
Journal: Heritage (JCR category: Humanities, Multidisciplinary; impact factor 2.0)
Published: 2023-10-29
DOI: https://doi.org/10.3390/heritage6110366
Abstract
Despite advances in the computational literary analysis of Western literature, in-depth analysis of South Asian literature has been lacking. To address this gap, a social network analysis of the main characters in the Indian epic Mahabharata was performed. The text was preprocessed into verses and then transformed with term frequency–inverse document frequency (TF-IDF) weighting. Latent Semantic Analysis (LSA) word vectors were then obtained by applying compact Singular Value Decomposition (SVD) to the term–document matrix. As the novel contribution of this study, these word vectors were adaptively converted into a fully connected similarity matrix and transformed into a social network using a novel locally weighted K-Nearest Neighbors (KNN) algorithm. The viability of the social networks was assessed by their ability to (i) recover individual character-to-character relationships; (ii) embed the overall network structure (verified with centrality measures and correlations); and (iii) detect the communities of the Pandavas (protagonists) and Kauravas (antagonists) using spectral clustering. The proposed scheme successfully (i) predicted the character-to-character connections of the most important and second most important characters with F-scores of 0.812 and 0.785, respectively; (ii) recovered the overall structure of the ground-truth networks by matching the original centralities (corr. > 0.5, p < 0.05); and (iii) differentiated the Pandavas from the Kauravas with an F-score of 0.749.
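The pipeline described in the abstract (verses, TF-IDF, SVD-based LSA word vectors, similarity matrix, KNN graph, F-score evaluation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy verses, the rank of 2, and k = 2 are hypothetical, and a plain top-k nearest-neighbor rule stands in for the paper's locally weighted KNN algorithm, whose details the abstract does not give.

```python
import numpy as np

# Hypothetical toy verses; the paper's corpus and preprocessing are not reproduced here.
VERSES = [
    "arjuna and bhima fought beside yudhishthira",
    "yudhishthira praised arjuna and bhima",
    "bhima guarded arjuna in the forest",
    "duryodhana and karna plotted with shakuni",
    "karna swore loyalty to duryodhana",
    "shakuni advised duryodhana and karna",
]
CHARACTERS = ["arjuna", "bhima", "yudhishthira", "duryodhana", "karna", "shakuni"]


def tfidf_term_document(docs):
    """Return (vocab, TF-IDF matrix) with one row per term and one column per verse."""
    vocab = sorted({w for d in docs for w in d.split()})
    row = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for w in doc.split():
            counts[row[w], j] += 1.0
    doc_freq = (counts > 0).sum(axis=1)
    # The +1 keeps terms that occur in every verse from being zeroed out.
    idf = np.log(len(docs) / doc_freq) + 1.0
    return vocab, counts * idf[:, None]


def lsa_word_vectors(term_doc, rank):
    """Compact SVD of the term-document matrix; rows are rank-dimensional word vectors."""
    u, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    return u[:, :rank] * s[:rank]


def cosine_similarity(vectors):
    """Fully connected cosine-similarity matrix between row vectors."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.maximum(norms, 1e-12)
    return unit @ unit.T


def knn_graph(similarity, k):
    """Plain top-k KNN graph (an assumed stand-in, not the paper's weighted variant)."""
    n = similarity.shape[0]
    sim = similarity.copy()
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches
    adjacency = np.zeros((n, n), dtype=int)
    for i in range(n):
        neighbours = np.argsort(sim[i])[::-1][:k]
        adjacency[i, neighbours] = 1
    # Undirected graph: keep an edge if either endpoint selected the other.
    return np.maximum(adjacency, adjacency.T)


def f_score(predicted, truth):
    """F-score between a predicted and a ground-truth set of edges."""
    tp = len(predicted & truth)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)


vocab, X = tfidf_term_document(VERSES)
word_vecs = lsa_word_vectors(X, rank=2)
char_rows = [vocab.index(c) for c in CHARACTERS]
S = cosine_similarity(word_vecs[char_rows])
A = knn_graph(S, k=2)
degree = dict(zip(CHARACTERS, A.sum(axis=1)))  # simple degree centrality
```

Here `A` is a symmetric 0/1 adjacency matrix over the six characters, and `degree` is one simple centrality measure that could be correlated against a ground-truth network. To score recovered relationships, the edges of `A` would be collected into a set of character pairs and passed to `f_score` alongside a manually annotated edge set.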