{"title":"Overview of the First Shared Task on Automatic Minuting (AutoMin) at Interspeech 2021","authors":"Tirthankar Ghosal, Ondrej Bojar, Muskaan Singh, A. Nedoluzhko","doi":"10.21437/automin.2021-1","DOIUrl":"https://doi.org/10.21437/automin.2021-1","url":null,"abstract":"In this article, we report the findings of the First Shared Task on Automatic Minuting (AutoMin). The primary objective of the AutoMin shared task was to garner community participation to create minutes from multi-party meetings automatically. The shared task was endorsed by the International Speech Communication Association (ISCA) and was also an Interspeech 2021 satellite event. AutoMin was held virtually on September 4, 2021. The motivation for AutoMin was to bring together the Speech and Natural Language Processing (NLP) community to jointly investigate the challenges and propose innovative solutions for this timely and important use case. Ten different teams from diverse backgrounds participated in the shared task and presented their systems. More details on the shared task can be found at https://elitr.github.io/automatic-minuting .","PeriodicalId":186820,"journal":{"name":"First Shared Task on Automatic Minuting at Interspeech 2021","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116474539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Team JU_PAD @ AutoMin 2021: MoM Generation from Multiparty Meeting Transcript","authors":"Sarthak Pan, Palash Nandi, Dipankar Das","doi":"10.21437/automin.2021-5","DOIUrl":"https://doi.org/10.21437/automin.2021-5","url":null,"abstract":"Use of online meeting platforms for long multi-party discussion is gradually increasing, and generation of Minutes of Meeting (MoM) is crucial for subsequent events. MoM records all key issues, possible solutions, decisions and actions taken during the meeting. Hence the importance of minuting cannot be overemphasized at a time when a significant number of meetings take place in the virtual space. Automatic generation of MoM can potentially save up to 80% of the time spent revisiting a meeting. In this paper, we present an abstractive approach for automatic generation of meeting minutes. It aims to deal with problems such as the nature of spoken text, the length of transcripts, the lack of document structure, and conversation fillers. The system is evaluated on a test dataset, with scores computed by both manual and automatic evaluation. The text summarization metrics ROUGE-1, ROUGE-2 and ROUGE-L [1] are used for automated scoring, and the metrics Adequacy, Grammatical Correctness and Fluency are used for manual scoring. The proposed model achieved 0.221, 0.046 and 0.125 for ROUGE-1, ROUGE-2 and ROUGE-L respectively in automated evaluation, and 3.5/5, 3/5 and 3/5 for Adequacy, Grammatical Correctness and Fluency respectively in manual evaluation.","PeriodicalId":186820,"journal":{"name":"First Shared Task on Automatic Minuting at Interspeech 2021","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124292559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
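The ROUGE-1 and ROUGE-2 scores reported by the teams above measure n-gram overlap between a system summary and a reference. As a minimal sketch (an F1 variant over n-grams, not the official ROUGE package the organizers used):

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=1):
    """F1 over n-gram overlap between a candidate and a reference summary."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())  # clipped n-gram match count
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
```

ROUGE-L, also reported above, instead scores the longest common subsequence and is omitted here for brevity.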
{"title":"Team Symantlytical @ AutoMin 2021: Generating Readable Minutes with GPT-2 and BERT-based Automatic Minuting Approach","authors":"Amitesh Garg, Muskaan Singh","doi":"10.21437/automin.2021-8","DOIUrl":"https://doi.org/10.21437/automin.2021-8","url":null,"abstract":"This paper describes our system run for the Automatic Minuting shared task @ Interspeech 2021. The task was motivated towards generating meeting minutes automatically. We make an initial step towards all three tasks, namely the main Task A, Task B and Task C. The main Task A was to automatically create minutes from multiparty meeting transcripts, Task B was to identify whether a given minute belongs to a given transcript, and Task C was to identify whether a given pair of minutes belongs to the same meeting. Our system builds on GPT-2 [1]. The shared task, consisting of three subtasks, required participants to produce, contrast and scrutinize meeting minutes. The process of automating minuting is considered one of the most challenging tasks in natural language processing and sequence-to-sequence transformation. It involves testing the semantic meaningfulness, readability and reasonable adequacy of the minutes produced by the system. In the proposed work, we have developed a system using pre-trained language models to generate dialogue summaries or minutes. The designed methodology considers coverage, adequacy and readability to produce the best utilizable summary of a meeting transcript of any length. Our evaluation in subtask A achieves a ROUGE-L score of 11%; subtask A was by far the most challenging subtask, as it required systems to generate rational minutes of the given meeting transcripts.","PeriodicalId":186820,"journal":{"name":"First Shared Task on Automatic Minuting at Interspeech 2021","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126682838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Team The Turing TESTament @ AutoMin 2021: A Pipeline based Approach to Generate Meeting Minutes Using TOPSIS","authors":"Umang Sharma, Muskaan Singh, Harpreet Singh","doi":"10.21437/automin.2021-9","DOIUrl":"https://doi.org/10.21437/automin.2021-9","url":null,"abstract":"In this paper, we present our submission for AutoMin Shared Task@INTERSPEECH 2021. The objectives in this task were divided into three tasks, with the main task being to create a summary from a meeting transcript. The other two tasks were to compare minutes and transcripts to find out whether they were from the same meeting or not. We propose a pipeline-based system that extracts the important sentences from the transcript using sentence-level features and then applies the TOPSIS algorithm to summarize. This creates a flexible system that can provide, from any given transcript, the set of sentences that best describes it based on selected features and heuristic evaluation metrics. The proposed system produces readable, grammatically correct, and fluent minutes for given meeting transcripts. We make our codebase accessible here https://github.com/umangSharmacs/theTuringTestament .","PeriodicalId":186820,"journal":{"name":"First Shared Task on Automatic Minuting at Interspeech 2021","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133246852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
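TOPSIS, the ranking method named in the Turing TESTament abstract, scores each alternative (here, each sentence) by its distance to an ideal and an anti-ideal point in feature space. A minimal sketch of the standard algorithm; the team's actual features and weights are not given in the abstract, so the inputs below are generic:

```python
import math

def topsis(matrix, weights, benefit):
    """Rank alternatives (rows) by closeness to the ideal solution.
    matrix: rows = alternatives (e.g. sentences), cols = feature scores.
    benefit[j] is True when higher is better for criterion j."""
    ncols = len(weights)
    # vector-normalize each column, then apply the criterion weights
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(ncols)]
    v = [[w * row[j] / norms[j] for j, w in enumerate(weights)] for row in matrix]
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(zip(*v))]
    worst = [min(col) if benefit[j] else max(col) for j, col in enumerate(zip(*v))]
    scores = []
    for row in v:
        d_best = math.sqrt(sum((x - i) ** 2 for x, i in zip(row, ideal)))
        d_worst = math.sqrt(sum((x - w) ** 2 for x, w in zip(row, worst)))
        scores.append(d_worst / (d_best + d_worst) if d_best + d_worst else 0.0)
    return scores
```

Sentences with the highest closeness scores would then be kept, in transcript order, as the extracted summary.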
{"title":"Team ABC @ AutoMin 2021: Generating Readable Minutes with a BART-based Automatic Minuting Approach","authors":"Kartik Shinde, Nidhir Bhavsar, Aakash Bhatnagar, Tirthankar Ghosal","doi":"10.21437/automin.2021-2","DOIUrl":"https://doi.org/10.21437/automin.2021-2","url":null,"abstract":"This paper documents the approach of our Team ABC for the First Shared Task on Automatic Minuting (AutoMin) at Interspeech 2021. This challenge’s primary task (Task A) was to generate meeting minutes from multi-party meeting proceedings automatically. For this purpose, we develop an automatic minuting pipeline where we leverage a denoising autoencoder for pretraining sequence-to-sequence models and fine-tune it on a large-scale abstractive dialogue summarization dataset to summarize meeting transcripts. Specifically, we use a BART model and train it on the SAMSum dialogue summarization dataset. Our pipeline first splits the given transcript into blocks of smaller conversations, eliminates redundancies with a specially-crafted rule-based algorithm, summarizes the conversation blocks, retrieves the block-wise summaries, cleans, structures, and finally integrates the summaries to produce the meeting minutes. Our proposed system performs best on several evaluation metrics (automatic and human) in the AutoMin shared task. For the subsidiary tasks, we use certain text similarity metrics to determine whether a given transcript-minute pair corresponds to the same meeting (Task B) and whether a given pair of meeting minutes belongs to the same meeting (Task C). However, our simple machine-learning-based approach did not perform well on the objectives of the subsidiary tasks in the challenge. We publicly release our system code.","PeriodicalId":186820,"journal":{"name":"First Shared Task on Automatic Minuting at Interspeech 2021","volume":"266 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123106490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
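The split-summarize-integrate pipeline Team ABC describes can be sketched in outline. This is an assumed, simplified reconstruction: `summarize_fn` stands in for the fine-tuned BART model, and the exact-duplicate filter is a crude stand-in for the team's rule-based redundancy elimination:

```python
def chunk_transcript(turns, max_tokens=512):
    """Greedily pack dialogue turns into blocks that fit the model's input limit."""
    blocks, current, size = [], [], 0
    for turn in turns:
        n = len(turn.split())  # rough token count by whitespace
        if current and size + n > max_tokens:
            blocks.append(current)
            current, size = [], 0
        current.append(turn)
        size += n
    if current:
        blocks.append(current)
    return blocks

def minute(turns, summarize_fn, max_tokens=512):
    """Deduplicate turns, summarize each block, and join the block summaries
    into bullet-point minutes."""
    seen, deduped = set(), []
    for turn in turns:          # drop exact-duplicate turns (a crude stand-in
        if turn not in seen:    # for the paper's rule-based algorithm)
            seen.add(turn)
            deduped.append(turn)
    summaries = [summarize_fn(" ".join(b)) for b in chunk_transcript(deduped, max_tokens)]
    return "\n".join(f"- {s}" for s in summaries if s.strip())
```

In the real system, `summarize_fn` would call a BART checkpoint fine-tuned on SAMSum, and further cleaning and structuring steps would follow.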
{"title":"Team UEDIN @ AutoMin 2021: Creating Minutes by Learning to Filter an Extracted Summary","authors":"Philip Williams, B. Haddow","doi":"10.21437/automin.2021-10","DOIUrl":"https://doi.org/10.21437/automin.2021-10","url":null,"abstract":"We describe the University of Edinburgh’s submission to the First Shared Task on Automatic Minuting. We developed an English-language minuting system for Task A that combines BERT-based extractive summarization with logistic regression-based filtering and rule-based pre- and post-processing steps. In the human evaluation, our system averaged scores of 2.1 on adequacy, 3.9 on grammatical correctness, and 3.3 on fluency.","PeriodicalId":186820,"journal":{"name":"First Shared Task on Automatic Minuting at Interspeech 2021","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121670167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Team Zoom @ AutoMin 2021: Cross-domain Pretraining for Automatic Minuting","authors":"Felix Schneider, Sebastian Stüker, V. Parthasarathy","doi":"10.21437/automin.2021-11","DOIUrl":"https://doi.org/10.21437/automin.2021-11","url":null,"abstract":"This paper describes Zoom’s submission to the First Shared Task on Automatic Minuting at Interspeech 2021. We participated in Task A: generating abstractive summaries of meetings. For this task, we use a transformer-based summarization model which is first trained on data from a similar domain and then finetuned for domain transfer. In this configuration, our model does not yet produce usable summaries. We theorize that in the choice of pretraining corpus, the target side is more important than the source.","PeriodicalId":186820,"journal":{"name":"First Shared Task on Automatic Minuting at Interspeech 2021","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133908061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Team MTS @ AutoMin 2021: An Overview of Existing Summarization Approaches and Comparison to Unsupervised Summarization Techniques","authors":"O. Iakovenko, A. Andreeva, Anna Lapidus, Liana Mikaelyan","doi":"10.21437/automin.2021-7","DOIUrl":"https://doi.org/10.21437/automin.2021-7","url":null,"abstract":"Remote communication through video or audio conferences has become more popular than ever because of the worldwide pandemic. These events have therefore provoked the development of systems for automatic minuting of spoken language, leading to the AutoMin 2021 challenge. The following paper illustrates the results of the research that team MTS carried out while participating in the Automatic Minuting challenge. In particular, in this paper we analyze existing approaches to text and speech summarization, propose an unsupervised summarization technique based on clustering, and provide a pipeline that includes an adapted automatic speech recognition block able to run on real-life recordings. The proposed unsupervised technique outperforms pre-trained summarization models on the automatic minuting task, with ROUGE-1, ROUGE-2 and ROUGE-L values of 0.21, 0.02 and 0.2 on the dev set, and ROUGE-1, ROUGE-2, ROUGE-L, Adequacy, Grammatical Correctness and Fluency values of 0.180, 0.035, 0.098, 1.857, 2.304 and 1.911 on the test set, respectively.","PeriodicalId":186820,"journal":{"name":"First Shared Task on Automatic Minuting at Interspeech 2021","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129746568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
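A clustering-based unsupervised summarizer in the spirit Team MTS describes (though not their actual code) groups similar sentences and keeps one representative per group. A minimal sketch over bag-of-words vectors, with greedy single-pass clustering and an illustrative similarity threshold:

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def cluster_summarize(sentences, threshold=0.3):
    """Greedily cluster sentences by similarity to running centroids, then keep
    the sentence closest to each centroid as that cluster's representative."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    clusters = []  # each: {"centroid": Counter, "members": [index, ...]}
    for i, v in enumerate(vecs):
        best = max(clusters, key=lambda c: cosine(v, c["centroid"]), default=None)
        if best and cosine(v, best["centroid"]) >= threshold:
            best["centroid"] += v
            best["members"].append(i)
        else:
            clusters.append({"centroid": Counter(v), "members": [i]})
    return [sentences[max(c["members"], key=lambda i: cosine(vecs[i], c["centroid"]))]
            for c in clusters]
```

A production variant would use better sentence representations (e.g. TF-IDF or embeddings) and a proper clustering algorithm; the structure of the approach is the same.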
{"title":"Team AutoMinuters @ AutoMin 2021: Leveraging state-of-the-art Text Summarization model to Generate Minutes using Transfer Learning","authors":"Parth Mahajan, Muskaan Singh, Harpreet Singh","doi":"10.21437/automin.2021-3","DOIUrl":"https://doi.org/10.21437/automin.2021-3","url":null,"abstract":"This paper presents our submission for the first shared task on automatic minuting (AutoMin@Interspeech 2021). The shared task consists of one main task: generating minutes from a given meeting transcript. For this challenge, we leveraged state-of-the-art text summarization models to generate minutes using a transfer learning approach. We also provide an empirical analysis comparing our proposed method with other text summarization approaches. We evaluate our system submission quantitatively with 33% BERTScore and 11.6% ROUGE-L, which is relatively higher than the average submission in the shared task. Along with the automatic evaluation, we also report human assessment, where we achieve scores of 2.32, 2.64 and 2.52 out of five for adequacy, grammatical correctness, and fluency. For the other two tasks, we use Jaccard and cosine text similarity metrics to determine whether a given transcript-minute pair corresponds to the same meeting (Task B) and whether a given pair of meeting minutes belongs to the same meeting (Task C). This simple approach yielded 94.8% (Task B) and 92.3% (Task C), clearly outperforming most submissions in the challenge. We release our codebase here https://github.com/mahajanparth19/Automin_Submission .","PeriodicalId":186820,"journal":{"name":"First Shared Task on Automatic Minuting at Interspeech 2021","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127112779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
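The Jaccard similarity that AutoMinuters (and, per their abstract, Team ABC) used for the transcript-minute matching in Tasks B and C is simple to sketch; the matching threshold below is an illustrative guess, not the teams' actual value:

```python
def jaccard(a, b):
    """Jaccard similarity over the token sets of two documents."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def same_meeting(transcript, minutes, threshold=0.2):
    """Label a transcript-minute pair as matching when token overlap is high
    enough. The 0.2 threshold is hypothetical, chosen for illustration only."""
    return jaccard(transcript, minutes) >= threshold
```

Cosine similarity over term-frequency vectors, the other metric the abstract names, would differ only in weighting repeated tokens instead of collapsing them into a set.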
{"title":"Team Matus and Francesco @ AutoMin 2021: Towards Neural Summarization of Meetings","authors":"Matús Zilinec, F. Re","doi":"10.21437/automin.2021-6","DOIUrl":"https://doi.org/10.21437/automin.2021-6","url":null,"abstract":"As online meetings are becoming increasingly ubiquitous, there is an increasing demand to record the main outcomes of these meetings for future reference. Automatic summarization of meetings is a challenging, yet relatively unexplored natural language processing task with a wide range of potential applications. This paper describes our submission to the First Shared Task on Automatic Minuting at Interspeech 2021. In contrast to previous research focused on the summarization of narrated documents, we examine the specifics of bullet-point spoken language summarization on the AutoMin dataset of online meetings in English. Furthermore, we investigate whether existing abstractive summarization systems can be transferred to this new domain. In this regard, we develop a minuting pipeline based on the state-of-the-art PEGASUS summarization model. This includes pre-processing of conversational data, few-shot transfer learning using reference minutes generated by human annotators, and filtering and post-processing of the resulting candidate summaries into a suitable bullet-point minutes format. We conclude by evaluating the completeness and shortening aspects of our system, and discuss its limitations and potential future research directions.","PeriodicalId":186820,"journal":{"name":"First Shared Task on Automatic Minuting at Interspeech 2021","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122643558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}