Author: Maja Svanberg
Venue: 2017 IEEE Blocks and Beyond Workshop (B&B)
Publication date: 2017-10-01
DOI: https://doi.org/10.1109/BLOCKS.2017.8120430
Using feature vector representations to identify similar projects in App Inventor
To understand the big picture of how users learn to program in App Inventor, we need to represent projects in a way suitable for large-scale learning analytics. Here I present several representations of projects that could be used to identify App Inventor projects that are structurally similar to one another, e.g., projects created by users following tutorials. I compare the representations based solely on how accurately they predict the correct tutorial on a labeled data set. The results suggest using both the blocks and the components of a project, applying TF-IDF to the counts of each feature, and measuring similarity with a generalized Jaccard distance. This work lays the foundation for finding clusters of similar projects, both to distinguish original from unoriginal projects and to filter out similar projects when estimating a user's skill level.
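The pipeline described in the abstract (count features, weight them with TF-IDF, compare with a generalized Jaccard distance, predict the nearest tutorial) can be illustrated with a minimal Python sketch. This is not the paper's implementation. The smoothed IDF formula, the definition of generalized Jaccard as one minus the ratio of summed minima to summed maxima, the nearest-tutorial rule, and all feature and project names are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(projects):
    """Map each project name to a sparse TF-IDF vector (dict feature -> weight).

    `projects` maps a project name to a list of feature names, one entry per
    block or component occurrence. The smoothed IDF used here is an assumption;
    the paper's exact weighting may differ.
    """
    counts = {name: Counter(feats) for name, feats in projects.items()}
    n_projects = len(counts)

    # Document frequency: in how many projects each feature appears.
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())

    idf = {f: math.log((1 + n_projects) / (1 + d)) + 1.0
           for f, d in doc_freq.items()}
    return {name: {f: tf * idf[f] for f, tf in c.items()}
            for name, c in counts.items()}


def generalized_jaccard_distance(u, v):
    """1 - sum(min(u_i, v_i)) / sum(max(u_i, v_i)) for non-negative vectors."""
    keys = set(u) | set(v)
    num = sum(min(u.get(k, 0.0), v.get(k, 0.0)) for k in keys)
    den = sum(max(u.get(k, 0.0), v.get(k, 0.0)) for k in keys)
    # Two empty vectors are treated as identical.
    return 0.0 if den == 0 else 1.0 - num / den


def predict_tutorial(project_vec, tutorial_vecs):
    """Return the name of the tutorial vector closest to `project_vec`."""
    return min(
        tutorial_vecs,
        key=lambda t: generalized_jaccard_distance(project_vec, tutorial_vecs[t]),
    )


# Hypothetical data: feature names stand in for block and component types.
projects = {
    "tutorial_a": ["Button", "Sound", "component_event", "component_method"],
    "tutorial_b": ["Canvas", "Ball", "component_event", "controls_if", "math_number"],
    "student": ["Button", "Sound", "component_event", "component_method", "text"],
}
vectors = tfidf_vectors(projects)
tutorials = {k: vectors[k] for k in ("tutorial_a", "tutorial_b")}
predicted = predict_tutorial(vectors["student"], tutorials)
print(predicted)
```

Sparse dictionaries keep the sketch dependency-free. For a large corpus, a matrix-based representation would likely be more practical.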