SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining: latest articles, page 9

A conversation with Professor Bole Shi

SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining Pub Date : 2012-05-01 DOI: 10.1145/2207243.2207261

Baile Shi

Citations: 0

A conversation with Professors Deyi Li and Jie Tang

SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining Pub Date : 2012-05-01 DOI: 10.1145/2207243.2207257

Deyi Li, Jie Tang

Abstract: Roughly speaking, Chinese KDD research has gone through three stages. In 1993 the National Science Foundation of China (NSFC) started to sponsor research on knowledge discovery and data mining. This can be considered the first stage. The major research around that time focused on “Knowledge Discovery from Database”, including sub-topics such as frequent mining and association rule mining from databases. The research was mainly conducted in academic institutes. The second stage started at the end of the 1990s, with the emergence and rapid proliferation of Web-based applications. People started to notice that the largest data source for mining is the information on the Web instead of traditional databases. At the same time the mining tasks became more diversified. In the second stage, the term “Web Mining” became popular in the field. Research labs on “knowledge engineering” and “web/internet mining” were built in different research institutes and developed rapidly. Several web search companies, such as Baidu and Sogou, also emerged in this stage. The third stage began around 2005, when online social applications and media (such as, in China, Tencent, Sina Weibo, and Renren) became a prevalent and complex force influencing our daily life. Indeed, Tencent, the largest social network in China, already has more than 700 million registered users, the same number as Facebook; Sina Weibo has attracted 250 million users in the past two years, a figure higher than Twitter. These online networks grow very fast and provide a huge amount of user-generated content, which presents great opportunities for understanding the science of these networks. Accordingly, the emphasis of the research started to switch to mining social networks.

This is a more diverse research field, attracting researchers from a wide range of academic fields, including theory and algorithms, data mining and machine learning, computer systems and networks, statistical physics and complex systems, social psychology, economics, and managerial science. Another important change in this stage is that Chinese companies are paying more and more attention to data mining research. Not only Chinese Internet companies (e.g., Tencent, Baidu, Sogou, Youdao) but also communication/hardware IT companies (e.g., China Mobile, Huawei, ZTE, Lenovo) started to build data mining research labs. There is little doubt that now is the best time for data mining in China.

Pages: 75-76

Citations: 0

Mapping question items to skills with non-negative matrix factorization

SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining Pub Date : 2012-05-01 DOI: 10.1145/2207243.2207248

M. Desmarais

Abstract: Intelligent learning environments need to assess student skills to tailor course material, provide helpful hints, and in general provide some kind of personalized interaction. To perform this assessment, question items, exercises, and tasks are presented to the student. This assessment relies on a mapping of tasks to skills. However, the process of deciding which skills are involved in a given task is tedious and challenging. Means to automate it are highly desirable, even if only partial automation that provides supportive tools can be achieved. A recent technique based on Non-negative Matrix Factorization (NMF) was shown to offer valuable results, especially because the resulting factorization allows a straightforward interpretation in terms of a Q-matrix. We investigate the factors and assumptions under which NMF can effectively derive the underlying high level skills behind assessment results. We demonstrate the use of different techniques to analyze and interpret the output of NMF. We propose a simple model to generate simulated data and to provide lower and upper bounds for quantifying skill effect. Using the simulated data, we show that, under the assumption of independent skills, the NMF technique is highly effective in deriving the Q-matrix. However, the NMF performance degrades under different ratios of variance between subject performance, item difficulty, and skill mastery. The results corroborate conclusions from previous work in that high level skills, corresponding to general topics like World History and Biology, seem to have no substantial effect on test performance, whereas other topics like Mathematics and French do.

The analysis and visualization techniques of the NMF output, along with the simulation approach presented in this paper, should be useful for future investigations using NMF for Q-matrix induction from data.

Pages: 30-36
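The idea of reading a Q-matrix off an NMF can be illustrated with a minimal sketch. It is not the paper's code: it assumes a toy item-by-student results matrix `R` (invented here), the standard multiplicative update rules for the squared-error objective, and hypothetical helper names (`nmf`, `squared_error`, `dominant_skill`). `R` is approximated as `Q` times `S`, where `Q` maps items to skills and `S` maps skills to students.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def squared_error(r, q, s):
    """Sum of squared differences between r and the product q * s."""
    approx = matmul(q, s)
    return sum((rv - av) ** 2 for rr, ar in zip(r, approx) for rv, av in zip(rr, ar))


def nmf(r, n_skills, n_iter=500, seed=0, eps=1e-9):
    """Factor r (items x students) into q (items x skills) and s (skills x students).

    Uses multiplicative updates, which keep every entry non-negative.
    """
    rng = random.Random(seed)
    n_items, n_students = len(r), len(r[0])
    q = [[rng.random() + 0.1 for _ in range(n_skills)] for _ in range(n_items)]
    s = [[rng.random() + 0.1 for _ in range(n_students)] for _ in range(n_skills)]
    for _ in range(n_iter):
        # S <- S * (Q^T R) / (Q^T Q S), element-wise
        qt = transpose(q)
        num, den = matmul(qt, r), matmul(matmul(qt, q), s)
        s = [[s[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_students)]
             for k in range(n_skills)]
        # Q <- Q * (R S^T) / (Q S S^T), element-wise
        st = transpose(s)
        num, den = matmul(r, st), matmul(q, matmul(s, st))
        q = [[q[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_skills)]
             for i in range(n_items)]
    return q, s


# Toy data (an assumption for illustration): 4 items x 6 students, 1 = success.
# Items 0-1 are meant to share one skill, items 2-3 another.
R = [
    [1, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]

Q, S = nmf(R, n_skills=2)
error = squared_error(R, Q, S)

# A crude Q-matrix reading: the skill with the largest loading for each item.
dominant_skill = [max(range(2), key=lambda k: row[k]) for row in Q]

print("reconstruction error:", round(error, 4))
print("dominant skill per item:", dominant_skill)
```

Taking the largest loading per item is only one possible way to binarize `Q`; the abstract mentions several techniques for interpreting NMF output without specifying them here.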

Citations: 65

A conversation with Dr. Haifeng Wang

SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining Pub Date : 2012-05-01 DOI: 10.1145/2207243.2207264

Haifeng Wang

Citations: 0

A conversation with Professor Zhongzhi Shi

SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining Pub Date : 2012-05-01 DOI: 10.1145/2207243.2207263

Zhongzhi Shi

Citations: 1

A conversation with Dr. Edward Y. Chang

SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining Pub Date : 2012-05-01 DOI: 10.1145/2207243.2207256

Edward Y. Chang

Abstract: 1. Please share with us your view on the history and important milestones of the Chinese KDD research and application areas. Ample evidence shows that KDD has become a major topic of interest in both research and industry in China since 2006. In academia, Professor Zhi-Hua Zhou at Nanjing University chaired a National Machine Learning workshop in 2006, inviting researchers in the greater China area to share their experience. In 2009, the first Asian Conference on Machine Learning was inaugurated in Nanjing. In industry, both Google and MSRA influenced leading Chinese Internet companies such as Tencent, Baidu, Alibaba, and subsequently Renren and Shanda, to start their large-scale KDD operations. Three KDD engineers on my team were recruited to join Baidu knowledge, the primary KDD application of these Internet companies thus far is monetization, improving their ad/offer relevance and hence revenue. Genome Institute (BGI) have made impressive progress in areas of computer vision, pattern recognition, and bio-genomics. Applications such as face, gesture, voice, handwriting, and license plate recognition have been widely deployed. In the bio-genomics area, a team at BGI reached a significant milestone in 2008 by sequencing the first Asian individual's diploid genome and published the result in Nature [1]. This sequencing effort took BGI one year to complete. Subsequently, speeding up genome sequencing has been among BGI's top R&D priorities. (One cannot imagine what one billion genomic sequences and their associated disease profiles can bring to advancing human health.) Researchers led by Ruiqiang Li from BGI and researchers from Google and universities in Canada and Hong Kong have met a couple of times to discuss large-scale data mining issues and solutions in hardware, algorithms, and data transportation. There is no doubt that KDD is thriving in China in several areas and its applications are rapidly growing, thanks to the increase of both data volume and demand for intelligent information analysis and trend prediction.

2. Please describe your expertise and contribution to KDD. In 2005, my team started working on developing parallel machine learning algorithms to mine large-scale datasets. My team were made publicly available through Apache foundation, and they have been downloaded more than 4,000 times. Several Google products also use these parallel algorithms. Prior to the large-scale machine learning work, my work with Simon Tong on using active learning to refine user query concepts published in 2001 [8] has been cited 850 times. Together with my works on …

Pages: 73-74

Citations: 0

Social network analysis and mining to support the assessment of on-line student participation

SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining Pub Date : 2012-05-01 DOI: 10.1145/2207243.2207247

Reihaneh Rabbany, M. Takaffoli, Osmar R Zaiane

Abstract: A growing number of courses are delivered using e-learning environments, and their online discussions play an important role in the collaborative learning of students. Even in courses with a small number of students, thousands of messages can be generated in a few months within these forums. Manually evaluating the participation of students in such cases is a significant challenge, considering that current e-learning environments do not provide much information regarding the structure of interactions between students. There is a recent line of research on applying social network analysis (SNA) techniques to study these interactions.

Here we propose to exploit SNA techniques, including community mining, in order to discover relevant structures in the social networks we generate from student communications and also in the information networks we produce from the content of the exchanged messages. With visualization of these discovered relevant structures and the automated identification of central and peripheral participants, an instructor is provided with better means to assess participation in the online discussions. We implemented these new ideas in a toolbox, named Meerkat-ED, which automatically discovers relevant network structures, visualizes overall snapshots of interactions between the participants in the discussion forums, and outlines the leader/peripheral students. Moreover, it creates a hierarchical summarization of the discussed topics, which gives the instructor a quick view of what is under discussion.

We believe exploiting the mining abilities of this toolbox would facilitate fair evaluation of students' participation in online courses.

Pages: 20-29
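As a rough illustration of identifying central and peripheral participants, the sketch below builds an undirected interaction network from forum replies and ranks students by degree centrality. It is not Meerkat-ED's implementation: the reply data, the student names, and the choice of degree centrality as the measure are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical forum data: (author, person replied to) pairs.
replies = [
    ("ana", "ben"),
    ("ben", "ana"),
    ("cho", "ana"),
    ("dev", "ana"),
    ("ben", "cho"),
    ("eli", "dev"),
]


def build_graph(pairs):
    """Build an undirected interaction graph as a dict of neighbor sets."""
    graph = defaultdict(set)
    for author, target in pairs:
        if author != target:  # ignore self-replies
            graph[author].add(target)
            graph[target].add(author)
    return graph


def degree_centrality(graph):
    """Fraction of the other participants each student interacted with."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


graph = build_graph(replies)
centrality = degree_centrality(graph)

# Rank from most to least central; ties are broken alphabetically.
ranked = sorted(centrality, key=lambda name: (-centrality[name], name))

for name in ranked:
    print(f"{name}: {centrality[name]:.2f}")
```

Students at the top of the ranking interact with many peers, while those at the bottom are candidates for peripheral participants. Other centrality measures or community mining would give a richer picture than this single score.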

Citations: 52

A conversation with Professor Jianzhong Li

SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining Pub Date : 2012-05-01 DOI: 10.1145/2207243.2207258

Jianzhong Li

Citations: 0

Data mining for improving textbooks

SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining Pub Date : 2012-05-01 DOI: 10.1145/2207243.2207246

R. Agrawal, Sreenivas Gollapudi, A. Kannan, K. Kenthapadi

Citations: 40

A conversation with Professor Shan Wang et al.

SIGKDD explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining Pub Date : 2012-05-01 DOI: 10.1145/2207243.2207265

Shan Wang, Cuiping Li, Hong Chen

Citations: 1