Natural Language Processing Journal最新文献_第6页

Providing Citations to Support Fact-Checking: Contextualizing Detection of Sentences Needing Citation on Small Wikipedias 提供引文以支持事实核查：小型维基百科上需要引用的句子的上下文检测

Natural Language Processing Journal Pub Date : 2024-08-02 DOI: 10.1016/j.nlp.2024.100093

Aida Halitaj, Arkaitz Zubiaga

{"title":"Providing Citations to Support Fact-Checking: Contextualizing Detection of Sentences Needing Citation on Small Wikipedias","authors":"Aida Halitaj, Arkaitz Zubiaga","doi":"10.1016/j.nlp.2024.100093","DOIUrl":"10.1016/j.nlp.2024.100093","url":null,"abstract":"<div><p>Authoritative citations are critical to ensure information integrity, especially in encyclopedias like Wikipedia. To date, research on automating citation worthiness detection has largely focused on the most resourceful language, English Wikipedia, neglecting the applicability to smaller Wikipedias. In addition, previous research proposed models that analyze the content inherent to a sentence to determine its citation worthiness, overlooking the potential of additional context to improve the prediction. Addressing these gaps, our study proposes a transformer-based contextualized approach for smaller Wikipedias, presenting a novel method to compile high-quality datasets for the Albanian, Basque, and Catalan editions. We develop the <strong>C</strong>ontextualized <strong>C</strong>itation <strong>W</strong>orthiness (CCW) model, employing sentence representations enriched with adjacent sentences and topic categories for enhanced contextual insight. Empirical experiments on three newly created datasets demonstrate significant performance improvements of our contextualized CCW model, with 6%, 3% and 6% absolute improvements over the baseline for Albanian, Basque and Catalan datasets, respectively. We conduct an in-depth analysis to understand the influence and extent to which preceding and succeeding context as well as topic categories contribute to the accuracy of citation-worthiness predictions. Our findings suggest that incorporating such contextual information aids in the automatic identification of sentences in need of citations, not least when both the preceding and succeeding context are incorporated. This has implications for supporting Wikipedia projects across low-resource languages, promoting better article validation and fact-checking.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"8 ","pages":"Article 100093"},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000414/pdfft?md5=5d5c2344f9651734d9e20fc37a799aae&pid=1-s2.0-S2949719124000414-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141992927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Contrastive adversarial gender debiasing 对比对抗性的性别去伪存真

Natural Language Processing Journal Pub Date : 2024-07-23 DOI: 10.1016/j.nlp.2024.100092

Nicolás Torres

{"title":"Contrastive adversarial gender debiasing","authors":"Nicolás Torres","doi":"10.1016/j.nlp.2024.100092","DOIUrl":"10.1016/j.nlp.2024.100092","url":null,"abstract":"<div><p>This research contributes a comprehensive analysis of gender bias within contemporary AI language models, specifically examining iterations of the GPT series, alongside Gemini and Llama. The study offers a systematic investigation, encompassing multiple experiments spanning sentence completions, generative narratives, bilingual analysis, and visual perception assessments. The primary objective is to scrutinize the evolution of gender bias in these models across iterations, explore biases in professions and contexts, and evaluate multilingual disparities. Notably, the analyses reveal a marked evolution in GPT iterations, with GPT4 showcasing significantly reduced or negligible biases, signifying substantial advancements in bias mitigation. Professions and contexts exhibit model biases, indicating associations with specific genders. Multilingual evaluations demonstrate subtle disparities in gender bias tendencies between English and Spanish narratives. To effectively mitigate these biases, we propose a novel Contrastive Adversarial Gender Debiasing (CAGD) method that synergistically combines contrastive learning and adversarial training techniques. The CAGD method enables language models to learn gender-neutral representations while promoting robustness against gender biases, consistently outperforming original and adversarially debiased models across various tasks and metrics. These findings underscore the complexity of gender bias in AI language models, emphasizing the need for continual bias mitigation strategies, such as the proposed CAGD approach, and ethical considerations in AI development and deployment.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"8 ","pages":"Article 100092"},"PeriodicalIF":0.0,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000402/pdfft?md5=8cd94a1ff5d32d6fad021f90862cb81a&pid=1-s2.0-S2949719124000402-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141848697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

NLP-based smart decision making for business and academics 基于 NLP 的商业和学术智能决策

Natural Language Processing Journal Pub Date : 2024-07-20 DOI: 10.1016/j.nlp.2024.100090

Pradnya Sawant, Kavita Sonawane

{"title":"NLP-based smart decision making for business and academics","authors":"Pradnya Sawant, Kavita Sonawane","doi":"10.1016/j.nlp.2024.100090","DOIUrl":"10.1016/j.nlp.2024.100090","url":null,"abstract":"<div><p>Natural Language Processing (NLP) systems enable machines to understand, interpret, and generate human-like language, bridging the gap between human communication and computer understanding. Natural Language Interface to Databases (NLIDB) and Natural Language Interface to Visualization (NLIV) systems are designed to enable non-technical users to retrieve and visualize data through natural language queries. However, these systems often face challenges in handling complex correlation and analytical questions, limiting their effectiveness for comprehensive data analysis. Additionally, current Business Intelligence (BI) tools also struggle with understanding the context and semantics of complex questions, further hindering their usability for strategic decision-making. Also, when building these models for generating the queries from natural language, the system handles only the semantic parsing issues as each column header is being changed manually to their normal names by all existing models which is time-consuming, tedious, and subjective.</p><p>Recent studies reflect the need for attention to context, semantics, and especially ambiguities in dealing with natural language questions. To address this problem, the proposed architecture focuses on understanding the context, correlation-based semantic analysis, and removal of ambiguities using a novel approach. An Enhanced Longest Common Subsequence (ELCS) is suggested where existing LCS is modified with a memorization component for mapping the natural language question tokens with ambiguous table column headers. This can speed up the overall process as human intervention is not required to manually change the column headers. The same is evidenced by carrying out thorough experimentation and comparative study in terms of precision, recall, and F1 score. By synthesizing the latest advancements and addressing challenges, this paper has proved how NLP can significantly enhance the accuracy and efficiency of information retrieval and visualization, broadening the inclusivity and usability of NLIDB, NLIV, and BI systems.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"8 ","pages":"Article 100090"},"PeriodicalIF":0.0,"publicationDate":"2024-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000384/pdfft?md5=fd2e14b2d3243c083595a7e1f7015f23&pid=1-s2.0-S2949719124000384-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141842666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Exploring transformer models in the sentiment analysis task for the under-resource Bengali language 探索资源不足的孟加拉语情感分析任务中的转换器模型

Natural Language Processing Journal Pub Date : 2024-07-17 DOI: 10.1016/j.nlp.2024.100091

Md. Nesarul Hoque , Umme Salma , Md. Jamal Uddin , Md. Martuza Ahamad , Sakifa Aktar

{"title":"Exploring transformer models in the sentiment analysis task for the under-resource Bengali language","authors":"Md. Nesarul Hoque , Umme Salma , Md. Jamal Uddin , Md. Martuza Ahamad , Sakifa Aktar","doi":"10.1016/j.nlp.2024.100091","DOIUrl":"10.1016/j.nlp.2024.100091","url":null,"abstract":"<div><p>In the sentiment analysis (SA) task, we can obtain a positive or negative-typed comment or feedback from an online user or a customer about any object, such as a movie, drama, food, and others. This user’s sentiment may positively impact various decision-making processes. In this regard, a lot of studies have been done on identifying sentiments from a text in high-resource languages like English. However, a small number of studies are detected in the under-resource Bengali language because of the unavailability of the benchmark corpus, limitations of text processing application software, and so on. Furthermore, there is still enough space to enhance the classification performance of the SA task. In this research, we experiment on a recognized Bengali dataset of 11,807 comments to find positive or negative sentiments. We employ five state-of-the-art transformer-based pretrained models, such as multilingual Bidirectional Encoder Representations from Transformers (mBERT), BanglaBERT, Bangla-Bert-Base, DistilmBERT, and XLM-RoBERTa-base (XLM-R-base), with tuning of the hyperparameters. After that, we propose a combined model named Transformer-ensemble that presents outstanding detection performance with an accuracy of 95.97% and an F1-score of 95.96% compared to the existing recent methods in the Bengali SA task.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"8 ","pages":"Article 100091"},"PeriodicalIF":0.0,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000396/pdfft?md5=224e6ebbfc8811318218e54f481e4c76&pid=1-s2.0-S2949719124000396-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141847924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

TransLSTM: A hybrid LSTM-Transformer model for fine-grained suggestion mining TransLSTM：用于细粒度建议挖掘的 LSTM 和 Transformer 混合模型

Natural Language Processing Journal Pub Date : 2024-07-14 DOI: 10.1016/j.nlp.2024.100089

Samad Riaz , Amna Saghir , Muhammad Junaid Khan , Hassan Khan , Hamid Saeed Khan , M. Jaleed Khan

{"title":"TransLSTM: A hybrid LSTM-Transformer model for fine-grained suggestion mining","authors":"Samad Riaz , Amna Saghir , Muhammad Junaid Khan , Hassan Khan , Hamid Saeed Khan , M. Jaleed Khan","doi":"10.1016/j.nlp.2024.100089","DOIUrl":"10.1016/j.nlp.2024.100089","url":null,"abstract":"<div><p>Digital platforms on the internet are invaluable for collecting user feedback, suggestions, and opinions about various topics, such as company products and services. This data is instrumental in shaping business strategies, enhancing product development, and refining service delivery. Suggestion mining is a key task in natural language processing, which focuses on extracting and analysing suggestions from these digital sources. Initially, suggestion mining utilized manually crafted features, but recent advancements have highlighted the efficacy of deep learning models, which automatically learn features. Models like Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Bidirectional Encoder Representations from Transformers (BERT) have been employed in this field. However, considering the relatively small datasets and the faster training time of LSTM compared to BERT, we introduce TransLSTM, a novel LSTM-Transformer hybrid model for suggestion mining. This model aims to automatically pinpoint and extract suggestions by harnessing both local and global text dependencies. It combines the sequential dependency handling of LSTM with the contextual interaction capabilities of the Transformer, thus effectively identifying and extracting suggestions. We evaluated our method against state-of-the-art approaches using the SemEval Task-9 dataset, a benchmark for suggestion mining. Our model shows promising performance, surpassing existing deep learning methods by 6.76% with an F1 score of 0.834 for SubTask A and 0.881 for SubTask B. Additionally, our paper presents an exhaustive literature review on suggestion mining from digital platforms, covering both traditional and state-of-the-art text classification techniques.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"8 ","pages":"Article 100089"},"PeriodicalIF":0.0,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000372/pdfft?md5=01d5468c4cb646548ed9ac72a0da2eb9&pid=1-s2.0-S2949719124000372-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141706671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A comprehensive survey on answer generation methods using NLP 使用 NLP 生成答案方法的综合调查

Natural Language Processing Journal Pub Date : 2024-07-02 DOI: 10.1016/j.nlp.2024.100088

Prashant Upadhyay, Rishabh Agarwal, Sumeet Dhiman, Abhinav Sarkar, Saumya Chaturvedi

{"title":"A comprehensive survey on answer generation methods using NLP","authors":"Prashant Upadhyay, Rishabh Agarwal, Sumeet Dhiman, Abhinav Sarkar, Saumya Chaturvedi","doi":"10.1016/j.nlp.2024.100088","DOIUrl":"10.1016/j.nlp.2024.100088","url":null,"abstract":"<div><p>Recent advancements in question-answering systems have significantly enhanced the capability of computers to understand and respond to queries in natural language. This paper presents a comprehensive review of the evolution of question answering systems, with a focus on the developments over the last few years. We examine the foundational aspects of a question answering framework, including question analysis, answer extraction, and passage retrieval. Additionally, we delve into the challenges that question answering systems encounter, such as the intricacies of question processing, the necessity of contextual data sources, and the complexities involved in real-time question answering. Our study categorizes existing question answering systems based on the types of questions they address, the nature of the answers they produce, and the various approaches employed to generate these answers. We also explore the distinctions between opinion-based, extraction-based, retrieval-based, and generative answer generation. The classification provides insight into the strengths and limitations of each method, paving the way for future innovations in the field. This review aims to offer a clear understanding of the current state of question answering systems and to identify the scaling needed to meet the rising expectations and demands of users for coherent and accurate automated responses in natural language.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"8 ","pages":"Article 100088"},"PeriodicalIF":0.0,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000360/pdfft?md5=57245c441a09df1168241bb40a6f9e06&pid=1-s2.0-S2949719124000360-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141623096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Topic-aware response selection for dialog systems 对话系统的主题感知响应选择

Natural Language Processing Journal Pub Date : 2024-06-24 DOI: 10.1016/j.nlp.2024.100087

Wei Yuan , Zongyang Ma , Aijun An , Jimmy Xiangji Huang

{"title":"Topic-aware response selection for dialog systems","authors":"Wei Yuan , Zongyang Ma , Aijun An , Jimmy Xiangji Huang","doi":"10.1016/j.nlp.2024.100087","DOIUrl":"https://doi.org/10.1016/j.nlp.2024.100087","url":null,"abstract":"<div><p>It is challenging for a persona-based chitchat system to return responses consistent with the dialog context and the persona of the agent. This particularly holds for a retrieval-based chitchat system that selects the most appropriate response from a set of candidates according to the dialog context and the persona of the agent. A persona usually has some dominant topics (e.g., <em>sports</em>, <em>music</em>). Adhering to these topics can enhance the consistency of responses. However, previous studies rarely explore the topical semantics of the agent’s persona in the chitchat system, which often fails to return responses coherent with the persona. In this paper, we propose a Topic-Aware Response Selection (TARS) model, capturing multi-grained matching between the dialog context and a response and also between the persona and a response at both the word and the topic levels, to select the appropriate topic-aware response from the pool of response candidates. Empirical results on the public persona-based empathetic conversation (PEC) data demonstrate the promising performance of the TARS model for response selection.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"8 ","pages":"Article 100087"},"PeriodicalIF":0.0,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000359/pdfft?md5=460e17e8ab71eeba6fb71be3795c94c0&pid=1-s2.0-S2949719124000359-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141541701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A modified Vector Space Model for semantic information retrieval 用于语义信息检索的改进向量空间模型

Natural Language Processing Journal Pub Date : 2024-06-13 DOI: 10.1016/j.nlp.2024.100081

Callistus Ireneous Nakpih

{"title":"A modified Vector Space Model for semantic information retrieval","authors":"Callistus Ireneous Nakpih","doi":"10.1016/j.nlp.2024.100081","DOIUrl":"10.1016/j.nlp.2024.100081","url":null,"abstract":"<div><p>In this research, we present a modified Vector Space Model which focuses on the semantic relevance of words for retrieving documents. The modified VSM resolves the problem of the classical model performing only lexical matching of query terms to document terms for retrievals. This problem also restricts the classical model from retrieving documents that do not have exact match of query terms even if they are semantically relevant to the query. In the modified model, we introduced a Query Relevance Update technique, which pads the original query set with semantically relevant document terms for optimised semantic retrieval results. The modified model also includes a novel <span><math><mrow><mi>t</mi><mi>f</mi><mo>−</mo><mi>p</mi></mrow></math></span> which replaces the <span><math><mrow><mi>t</mi><mi>f</mi><mo>−</mo><mi>i</mi><mi>d</mi><mi>f</mi></mrow></math></span> technique of the classical VSM, which is used to compute the Term Frequency weights. The replacement of the <span><math><mrow><mi>t</mi><mi>f</mi><mo>−</mo><mi>i</mi><mi>d</mi><mi>f</mi></mrow></math></span> resolves the problem of the classical model penalising terms that occur across documents with the assumption that they are stop words, which in practice, there are usually such words which carry relevant semantic information for documents’ retrieval. We also extended the cosine similarity function with a proportionality weight <span><math><msub><mrow><mi>p</mi></mrow><mrow><mi>q</mi><mi>d</mi></mrow></msub></math></span>, which moderates biases for high frequency of terms in longer documents. The <span><math><msub><mrow><mi>p</mi></mrow><mrow><mi>q</mi><mi>d</mi></mrow></msub></math></span> ensures that the frequency of query terms including the updated ones are accounted for in proportionality with documents size for the overall ranking of documents. The simulated results reveal that, the modified VSM does achieve semantic retrieval of documents beyond lexical matching of query and document terms.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"8 ","pages":"Article 100081"},"PeriodicalIF":0.0,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000293/pdfft?md5=3a5de846966e83dc34ea6a2e3b7d202f&pid=1-s2.0-S2949719124000293-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141405572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Cutting through the noise to motivate people: A comprehensive analysis of COVID-19 social media posts de/motivating vaccination 穿透噪音，激发人们的积极性：全面分析 COVID-19 社交媒体上关于疫苗接种的帖子

Natural Language Processing Journal Pub Date : 2024-06-13 DOI: 10.1016/j.nlp.2024.100085

Ashiqur Rahman , Ehsan Mohammadi , Hamed Alhoori

{"title":"Cutting through the noise to motivate people: A comprehensive analysis of COVID-19 social media posts de/motivating vaccination","authors":"Ashiqur Rahman , Ehsan Mohammadi , Hamed Alhoori","doi":"10.1016/j.nlp.2024.100085","DOIUrl":"10.1016/j.nlp.2024.100085","url":null,"abstract":"<div><p>The COVID-19 pandemic exposed significant weaknesses in the healthcare information system. The overwhelming volume of misinformation on social media and other socioeconomic factors created extraordinary challenges to motivate people to take proper precautions and get vaccinated. In this context, our work explored a novel direction by analyzing an extensive dataset collected over two years, identifying the topics de/motivating the public about COVID-19 vaccination. We analyzed these topics based on time, geographic location, and political orientation. We noticed that while the motivating topics remain the same over time and geographic location, the demotivating topics change rapidly. We also identified that intrinsic motivation, rather than external mandate, is more advantageous to inspire the public. This study addresses scientific communication and public motivation in social media. It can help public health officials, policymakers, and social media platforms develop more effective messaging strategies to cut through the noise of misinformation and educate the public about scientific findings.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"8 ","pages":"Article 100085"},"PeriodicalIF":0.0,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000335/pdfft?md5=bda60786d7ac110df5894be0ee669f0e&pid=1-s2.0-S2949719124000335-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141392804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Generating dynamic lip-syncing using target audio in a multimedia environment 在多媒体环境中使用目标音频生成动态唇语同步

Natural Language Processing Journal Pub Date : 2024-06-10 DOI: 10.1016/j.nlp.2024.100084

Diksha Pawar, Prashant Borde, Pravin Yannawar

{"title":"Generating dynamic lip-syncing using target audio in a multimedia environment","authors":"Diksha Pawar, Prashant Borde, Pravin Yannawar","doi":"10.1016/j.nlp.2024.100084","DOIUrl":"https://doi.org/10.1016/j.nlp.2024.100084","url":null,"abstract":"<div><p>The presented research focuses on the challenging task of creating lip-sync facial videos that align with a specified target speech segment. A novel deep-learning model has been developed to produce precise synthetic lip movements corresponding to the speech extracted from an audio source. Consequently, there are instances where portions of the visual data may fall out of sync with the updated audio and this challenge is handled through, a novel strategy, leveraging insights from a robust lip-sync discriminator. Additionally, this study introduces fresh criteria and evaluation benchmarks for assessing lip synchronization in unconstrained videos. LipChanger demonstrates improved PSNR values, indicative of enhanced image quality. Furthermore, it exhibits highly accurate lip synthesis, as evidenced by lower LMD values and higher SSIM values. These outcomes suggest that the LipChanger approach holds significant potential for enhancing lip synchronization in talking face videos, resulting in more realistic lip movements. The proposed LipChanger model and its associated evaluation benchmarks show promise and could potentially contribute to advancements in lip-sync technology for unconstrained talking face videos.</p></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"8 ","pages":"Article 100084"},"PeriodicalIF":0.0,"publicationDate":"2024-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949719124000323/pdfft?md5=84516d2e22e4420f113635a3914da66f&pid=1-s2.0-S2949719124000323-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141328741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0