Mental Health Stigma and Natural Language Processing: Two Enigmas Through the Lens of a Limited Corpus
Authors: Min Hyung Lee, Richard Kyung
Published in: 2022 IEEE World AI IoT Congress (AIIoT), 2022-06-06
DOI: https://doi.org/10.1109/aiiot54504.2022.9817362
Mental health stigma is an elephant in the room. It exacerbates one's illness, impedes approaches to treatment, and ultimately contributes to the persistence of a “mental health epidemic.” A definitive solution for managing stigmatizing language has yet to be found, especially on the internet, where stigma is virtually ubiquitous in user posts, text messages, and biased articles. This study proposes text classification, a natural language processing (NLP) task, as a way to identify stigma in context. NLP is frequently used to detect human sentiment and emotion in order to combat hate speech, racism, and personal attacks; however, it has not been thoroughly explored for mental health stigma, and the lack of preexisting data presents a challenge. Facing limited resources, the study hypothesized that fine-tuning a BERT model would allow a small corpus to yield satisfactory results. The model performed well, reaching an accuracy of 0.94 and an F1 score of 0.91. The study not only confirms that NLP can be an effective way to detect, and later reduce, stigma, but also shows that the BERT model remains proficient with a limited corpus. NLP techniques that have historically been applied to thoroughly researched fields with abundant data can therefore also be used effectively in underdeveloped, unexplored fields of research that currently lack the datasets needed for training.
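The abstract reports both accuracy and F1, and the two can diverge when stigmatizing texts are the minority class. The following is a minimal sketch of how the two metrics are computed for a binary classifier. The function name `accuracy_and_f1`, the label convention (1 = stigmatizing, 0 = not), and the toy labels are assumptions made for illustration; they are not taken from the study, and the numbers below are not the study's results.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, treating 1 as the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    # Accuracy counts every correct prediction, including true negatives.
    accuracy = (tp + tn) / len(y_true)

    # F1 is the harmonic mean of precision and recall; it ignores true negatives.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy labels (illustration only, not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}  f1={f1:.2f}")  # accuracy=0.80  f1=0.67
```

Because F1 leaves out true negatives, it drops faster than accuracy when the positive class is rare and some positives are missed, which is one reason to report both figures for a small, possibly imbalanced corpus.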