Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation: Latest Articles

EmoThreat@FIRE2022: Shared Track on Emotions and Threat Detection in Urdu
S. Butt, Maaz Amjad, Fazlourrahman Balouchzah, Noman Ashraf, Rajesh Sharma, G. Sidorov, A. Gelbukh
{"title":"EmoThreat@FIRE2022: Shared Track on Emotions and Threat Detection in Urdu","authors":"S. Butt, Maaz Amjad, Fazlourrahman Balouchzah, Noman Ashraf, Rajesh Sharma, G. Sidorov, A. Gelbukh","doi":"10.1145/3574318.3574327","DOIUrl":"https://doi.org/10.1145/3574318.3574327","url":null,"abstract":"Many languages with a wealth of resources have been researched to solve the challenges of emotion and targeted abuse detection, i.e. threat. But when it comes to languages, such as Urdu, it is noted that there is a severe lack of both resources and approaches in terms of Urdu language processing. Therefore, this study concentrated on offering resources for Urdu by organizing a shared task called “EmoThreat: Emotions and Threat detection in Urdu”. The task offered two tasks: (i) multi-label emotion classification (Task A), and (ii) binary threat detection (Task B). Task B was a multi-class problem since it was further subdivided into the identification of threats posed by groups and individuals. This paper provides an overview of the methodology and results obtained by each of the 10 distinct teams who participated in the shared task. In addition, each group presented a detailed error analysis as part of their submission for the best model. The top-performing system in Task A received a macro-F1 score of 0.687. In contrast, subtask 1 of Task B received a score of 0.716 macro-F1 while subtask 2 of Task B obtained a 0.539 macro-F1 score.","PeriodicalId":270700,"journal":{"name":"Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124259216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Overview of the FIRE 2022 track: Information Retrieval from Microblogs during Disasters (IRMiDis)
Soham Poddar, Moumita Basu, Kripabandhu Ghosh, Saptarshi Ghosh
{"title":"Overview of the FIRE 2022 track: Information Retrieval from Microblogs during Disasters (IRMiDis)","authors":"Soham Poddar, Moumita Basu, Kripabandhu Ghosh, Saptarshi Ghosh","doi":"10.1145/3574318.3574319","DOIUrl":"https://doi.org/10.1145/3574318.3574319","url":null,"abstract":"Microblogging sites such as Twitter play an important role in dealing with various mass emergencies including natural disasters and pandemics. Over the last several years, the track on Information Retrieval from Microblogs during Disasters (IRMiDis), organized as part of the FIRE conference series, has provided annotated datasets for developing ML/NLP techniques for utilizing microblogs for various practical tasks that would help authorities better deal with disaster situations. In particular, the FIRE 2022 IRMiDis track focused on two important tasks – (i) to detect the vaccine-related stance of tweets related to COVID-19 vaccines, and (ii) to detect reporting of COVID-19 symptom in tweets.","PeriodicalId":270700,"journal":{"name":"Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116927560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Design Considerations for a Sustainable Scholarly Big Data Service
Jian Wu, Shaurya Rohatgi, Manoj K. Angadi, Kavya S. Puranik, C. Lee Giles
{"title":"Design Considerations for a Sustainable Scholarly Big Data Service","authors":"Jian Wu, Shaurya Rohatgi, Manoj K. Angadi, Kavya S. Puranik, C. Lee Giles","doi":"10.1145/3574318.3574340","DOIUrl":"https://doi.org/10.1145/3574318.3574340","url":null,"abstract":"The advancement of web programming techniques, such as Ajax and jQuery, and datastores, such as Apache Solr and Elasticsearch, have made it much easier to deploy small to medium scale web-based search engines. However, developing a sustainable search engine that supports scholarly big data services is still challenging often because of limited human resources and financial support. Such scenarios are typical in academic settings or small businesses. Here, we showcase how four key design decisions were made by trading-off competing factors such as performance, cost, and efficiency, when developing the Next Generation CiteSeerX (NGX), the successor of CiteSeerX, which was a pioneering digital library search engine that has been serving academic communities for more than two decades. This work extends our previous work in Wu et al. (2021) and discusses design considerations of infrastructure, web applications, indexing, and document filtering. These design considerations can be generalized to other web-based search engines with a similar scale that are deployed in small business or academic settings with limited resources.","PeriodicalId":270700,"journal":{"name":"Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130387342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Topic-Mono-BERT: A Joint Retrieval-Clustering System for Retrieving Overview Passages
Sumanta Kashyapi, Laura Dietz
{"title":"Topic-Mono-BERT: A Joint Retrieval-Clustering System for Retrieving Overview Passages","authors":"Sumanta Kashyapi, Laura Dietz","doi":"10.1145/3574318.3574336","DOIUrl":"https://doi.org/10.1145/3574318.3574336","url":null,"abstract":"For most queries, the set of relevant documents spans multiple subtopics. Inspired by the neural ranking models and query-specific neural clustering models, we develop Topic-Mono-BERT which performs both tasks jointly. Based on text embeddings of BERT, our model learns a shared embedding that is optimized for both tasks. The clustering hypothesis would suggest that embeddings which place topically similar text in close proximity will also perform better on ranking tasks. Our model is trained with the Wikimarks approach to obtain training signals for relevance and subtopics on the same queries. Our task is to identify overview passages that can be used to construct a succinct answer to the query. Our empirical evaluation on two publicly available passage retrieval datasets suggests that including the clustering supervision in the ranking model leads to about improvement in identifying text passages that summarize different subtopics within a query.","PeriodicalId":270700,"journal":{"name":"Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129103848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Multilingual Dataset for Identification of Factual Claims in Indian Twitter
Subhabrata Dutta, Rudra Dhar, Prantik Guha, Arpan Murmu, Dipankar Das
{"title":"A Multilingual Dataset for Identification of Factual Claims in Indian Twitter","authors":"Subhabrata Dutta, Rudra Dhar, Prantik Guha, Arpan Murmu, Dipankar Das","doi":"10.1145/3574318.3574348","DOIUrl":"https://doi.org/10.1145/3574318.3574348","url":null,"abstract":"The need for automated fact-checking is getting prominent with every passing day as the spread of misinformation is swelling over the ever-increasing stream of online content. We focus on fine-grained labelling of factual information in tweets to facilitate better fact-checking systems capable of providing improved justifications. In this paper, we present a token-level annotation of factual claims in tweets from Indian Twitter. To deal with the multilingual variety of the Indian diaspora, we deal with tweets in English, Bengali, Hindi, and their codemixed variants. To the best of our knowledge, this dataset is first of kind, both in terms of labelling scheme as well as data sources.","PeriodicalId":270700,"journal":{"name":"Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126168004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Classification of Waste Materials using CNN Based on Transfer Learning
Sujan Poudel, Prakash Poudyal
{"title":"Classification of Waste Materials using CNN Based on Transfer Learning","authors":"Sujan Poudel, Prakash Poudyal","doi":"10.1145/3574318.3574345","DOIUrl":"https://doi.org/10.1145/3574318.3574345","url":null,"abstract":"Waste Management is important for humans as well as nature for healthy life and a clean environment. The major step for effective waste management is the segregation of waste according to its types. The advancement of technology such as hardware and artificial intelligence is used for the segregation of waste. There are several machine learning and deep learning algorithms available for image classification. Among them, Convolutional Neural Network is the most used one. The main objective of this work is to classify images of waste materials using CNN into seven categories (cardboard, glass, metal, organic, paper, plastic, and trash). Then, cardboard, organic, and paper class images are considered biodegradable waste, and other classes are considered non-biodegradable waste. The pre-trained CNN model such as InceptionV3, InceptionResNetV2, Xception, VGG19, MobileNet, ResNet50 and DenseNet201 have been trained and performed fine-tuning on the waste dataset. Among these models, the VGG19 model performed with less accuracy, whereas the InceptionV3 model performed with high learning accuracy. Overall, the obtained result is promising.","PeriodicalId":270700,"journal":{"name":"Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129447689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Triplet Loss based Siamese Networks for Automatic Short Answer Grading
Nagamani Yeruva, Sarada Venna, Hemalatha Indukuri, Mounika Marreddy
{"title":"Triplet Loss based Siamese Networks for Automatic Short Answer Grading","authors":"Nagamani Yeruva, Sarada Venna, Hemalatha Indukuri, Mounika Marreddy","doi":"10.1145/3574318.3574337","DOIUrl":"https://doi.org/10.1145/3574318.3574337","url":null,"abstract":"Grading student work is critical for assessing their understanding and providing necessary feedback. However, answer grading can become monotonous for teachers. On the standard ASAG data set, our system shows substantial improvements in classification disparity of correct and incorrect answers from a reference answer compared to baseline methods. Our supervised model (1) utilizes recent advances in semantic word embeddings and (2) implements ideas from one-shot learning methods, which are proven to work with minimal. We present experimental results from a model based on different approaches and demonstrates decent performance on standard benchmark dataset.","PeriodicalId":270700,"journal":{"name":"Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117256924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
FIRE 2022 ILSUM Track: Indian Language Summarization
Shrey Satapara, Bhavan Modha, Sandip J Modha, Parth Mehta
{"title":"FIRE 2022 ILSUM Track: Indian Language Summarization","authors":"Shrey Satapara, Bhavan Modha, Sandip J Modha, Parth Mehta","doi":"10.1145/3574318.3574328","DOIUrl":"https://doi.org/10.1145/3574318.3574328","url":null,"abstract":"This abstract provides a short overview of the first edition of the shared task on Indian Language Summarization (ILSUM) organized at the 14th Forum for Information Retrieval Evaluation (FIRE 2022). A more detailed discussion is available in the track overview paper. The objective of this shared task was to create benchmark data for text summarization in Indian languages. This edition included three languages Hindi, Gujarati, and Indian English which is an officially recognized dialect of English mainly used in the Indian subcontinent. The task saw an enthusiastic response, with registrations from over 50 teams. A total of 12 teams submitted test runs across the three languages out of which 10 teams submitted working notes. Standard ROUGE metrics were used as the evaluation metric.","PeriodicalId":270700,"journal":{"name":"Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127789607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Overview of the HASOC Subtrack at FIRE 2022: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages
Thomas Mandl, Sandip J Modha, Gautam Kishore Shahi, Hiren Madhu, Shrey Satapara, Prasenjit Majumder, Johannes Schäfer, Tharindu Ranasinghe, Marcos Zampieri, D. Nandini, A. Jaiswal
{"title":"Overview of the HASOC Subtrack at FIRE 2022: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages","authors":"Thomas Mandl, Sandip J Modha, Gautam Kishore Shahi, Hiren Madhu, Shrey Satapara, Prasenjit Majumder, Johannes Schäfer, Tharindu Ranasinghe, Marcos Zampieri, D. Nandini, A. Jaiswal","doi":"10.1145/3574318.3574326","DOIUrl":"https://doi.org/10.1145/3574318.3574326","url":null,"abstract":"In recent years, the spread of online offensive content has become of great concern, motivating researchers to develop robust systems capable of identifying such content automatically. To carry out a fair evaluation of these systems, several international shared tasks have been organized, providing the community with essential benchmark data and evaluation methods for various languages. Organized since 2019, the HASOC (Hate Speech and Offensive Content Identification) shared task is one of these initiatives. In its fourth iteration, HASOC 2022 included three tasks for English-Hindi codemix, German and Marathi. Tasks 1 and 2 were on conversational hate speech detection. The idea is to detect supporting hate speech, profanity, or other forms of offensiveness depending on the surrounding context of Twitter posts. Task 1 was offered in Hindi-English codemix and German. Task 2 was provided for Hindi-English codemix, and it was focused on further classifying the problematic tweets in conversational hate speech into standalone and contextual hate. This paper presents a brief description of tasks, data, and participation.","PeriodicalId":270700,"journal":{"name":"Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121852142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 83
Findings of shared task on Sentiment Analysis and Homophobia Detection of YouTube Comments in Code-Mixed Dravidian Languages
Subalalitha Chinnaudayar Navaneethakrishnan, Bharathi Raja Chakravarthi, Kogilavani Shanmugavadivel, Malliga Subramanian, Prasanna Kumar Kumaresan, Bharathi, Lavanya Sambath Kumar, Rahul Ponnusamy
{"title":"Findings of shared task on Sentiment Analysis and Homophobia Detection of YouTube Comments in Code-Mixed Dravidian Languages","authors":"Subalalitha Chinnaudayar Navaneethakrishnan, Bharathi Raja Chakravarthi, Kogilavani Shanmugavadivel, Malliga Subramanian, Prasanna Kumar Kumaresan, Bharathi, Lavanya Sambath Kumar, Rahul Ponnusamy","doi":"10.1145/3574318.3574347","DOIUrl":"https://doi.org/10.1145/3574318.3574347","url":null,"abstract":"We present an overview of sentiment analysis and homophobia detection of YouTube comments in code-mixed Dravidian languages in this paper. We provide the details of this task and the submitted systems for the tasks. We introduce two studies: task A for detecting sentiment analysis and task B on homophobia detection, which is organized by the FIRE 2022. A total of 95 participants registered for the shared task, 13 teams finally submitted their results for task-A a, and 10 teams submitted their results for task B. The teams explored tasks A and B using traditional machine learning and deep learning models. Most of the benchmark systems have been analyzed by participants capable of handling code-mixed scenarios in Dravidian languages.","PeriodicalId":270700,"journal":{"name":"Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation","volume":"62 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116226393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8