Semantic Relatedness: A Strategy for Plagiarism Detection in SQL Assignments

2023 6th World Conference on Computing and Communication Technologies (WCCCT) Pub Date : 2023-01-06 DOI:10.1109/WCCCT56755.2023.10052438

Chukwuka Victor Obionwu, R. Kumar, Suhas Shantharam, David Broneske, Gunter Saake

Citations: 1

Abstract

The Structured Query Language (SQL) is the de facto language for defining and manipulating data in a relational database. Its mastery is therefore important for students in computer science and related disciplines, and most universities offer several courses that enable students to acquire SQL skills. However, this objective is plagued by code plagiarism, a major problem affecting the academic community. While plagiarism in other languages is detectable, detecting copied code in SQL is difficult because most queries are relatively similar, which makes plagiarism detection strategies ineffective when the objects are SQL queries. Research in natural language processing has produced several strategies that facilitate complex evaluation of text strings. In this endeavour, we leverage semantic similarity, a method that evaluates the semantic textual similarity between text strings, together with the idea of distance between words and the likeness of their meaning, to detect plagiarised SQL queries by semantically evaluating raw student query submissions from our SQL courses, which are offered every semester. Results show that the semantic similarity strategy was able to detect code similarity, which translated to plagiarism, in a considerable number of submissions. In all, we describe in this paper our plagiarism detection strategy, its limitations, and possible means that may be effective at addressing these limitations.
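The abstract does not specify how the similarity score is computed, so the following is only an illustrative sketch of pairwise comparison between raw query submissions. It swaps in a simple bag-of-tokens cosine similarity, which is a lexical stand-in and not the semantic similarity method the authors use. The function names, the tokenizer, and the 0.9 threshold are all assumptions made for this example.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(query):
    """Lowercase a SQL string and split it into words, numbers, and symbols."""
    return re.findall(r"[a-z_][a-z0-9_]*|\d+|[^\sa-z0-9_]", query.lower())


def cosine_similarity(query_a, query_b):
    """Cosine similarity between the token-count vectors of two queries."""
    va, vb = Counter(tokenize(query_a)), Counter(tokenize(query_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty submission is similar to nothing
    return dot / (norm_a * norm_b)


def flag_similar_pairs(submissions, threshold=0.9):
    """Return (student_a, student_b, score) for every pair at or above threshold.

    `submissions` maps a (hypothetical) student id to the raw query text.
    The threshold is an arbitrary assumption, not a value from the paper.
    """
    flagged = []
    for (s1, q1), (s2, q2) in combinations(submissions.items(), 2):
        score = cosine_similarity(q1, q2)
        if score >= threshold:
            flagged.append((s1, s2, score))
    return flagged


if __name__ == "__main__":
    demo = {
        "alice": "SELECT name FROM students WHERE age > 20;",
        "bob": "select name\n  from students where age > 20;",
        "carol": "SELECT COUNT(*) FROM orders GROUP BY customer_id;",
    }
    for a, b, score in flag_similar_pairs(demo):
        print(a, b, round(score, 3))
```

As the abstract notes, correct answers to the same SQL exercise tend to look alike, so a score above any fixed threshold is only a signal for manual review and not proof of copying.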
