Proceedings of the 9th Annual ACM India Conference: Latest Articles

Semantic Clustering Driven Approaches to Recommender Systems
Pub Date: 2016-10-21 | DOI: 10.1145/2998476.2998487
P. Bafna, S. Shirwaikar, Dhanya Pramod
Abstract: Recommender Systems (RS) have evolved from novelties used by a few e-commerce sites into an essential component of the business tools that power e-commerce. They have been widely used for product recommendations, such as books and movies, and are also gaining ground in service recommendations, such as hotels, restaurants, and travel attractions. Collaborative filtering based on reviews and ratings is usually applied using a clustering technique. The primary step of converting textual reviews into a Feature Matrix (FM) can be greatly refined by using semantic similarity between terms. This paper presents a WordNet-based synset-grouping approach that not only reduces the dimensionality of the FM but also generates a Feature Vector (FV) for each cluster, with significantly improved cluster quality. The paper presents a three-step approach of validating reviews, grouping reviews, and making review-based recommendations using the feature vectors. Real datasets extracted from travel sites are used for the experiments.
Citations: 2
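The synset-grouping step the abstract describes can be sketched in outline. This is a minimal illustration, not the authors' method: a hand-written synonym table stands in for a WordNet similarity measure (e.g. Wu-Palmer), and the greedy strategy and threshold are assumptions.

```python
def group_terms(terms, similarity, threshold=0.8):
    """Greedy grouping: a term joins the first group whose representative
    is similar enough; otherwise it starts a new group."""
    groups = []  # each group is a list; group[0] is its representative
    for term in terms:
        for group in groups:
            if similarity(term, group[0]) >= threshold:
                group.append(term)
                break
        else:
            groups.append([term])
    return groups

# Toy similarity standing in for a WordNet measure: terms count as
# similar only if they appear in the same hand-written synonym set.
SYNONYMS = [{"hotel", "inn", "lodge"}, {"food", "meal", "cuisine"}]

def toy_similarity(a, b):
    return 1.0 if any(a in s and b in s for s in SYNONYMS) else 0.0

groups = group_terms(["hotel", "food", "inn", "meal", "lodge"], toy_similarity)
print(groups)  # [['hotel', 'inn', 'lodge'], ['food', 'meal']]
```

Each resulting group can then collapse to a single FM dimension, which is how synset grouping reduces dimensionality.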
Role of Reduced Inputs in Flag Mining
Pub Date: 2016-10-21 | DOI: 10.1145/2998476.2998485
Rishab Bansal, A. Ravindar
Abstract: Compilers typically provide a wide choice of optimization flags that can be used to improve application performance. The process of searching for the best flag combination for a given application is referred to as flag mining. Brute-force flag mining is time-consuming because it requires many runs with different combinations of flags. Flag-mining techniques based on machine learning rely on a database of application run-time measurements obtained from a large number of binaries compiled with different flag combinations. This work quantifies the impact of using reduced inputs in flag mining. Reduced inputs are much smaller than real representative inputs and cause the application to run for less than 10 percent of the original execution time; examples include the train inputs used in SPEC benchmarks and the MinneSPEC inputs. Using reduced inputs instead of full inputs would significantly cut the time and space overhead of flag mining in both brute-force and machine-learning-based methods. However, to use reduced inputs for flag mining, the behavior of an application compiled with a set of flags on reduced inputs should predict similar benefits on the full representative inputs; this can happen only if the reduced inputs accurately represent the ref inputs with respect to application performance. Our experiments show that reduced inputs correlate with full representative inputs for 5 of 7 SPEC CPU2006 benchmarks across all 11 flag combinations considered with the GCC compiler, and reduce the experimentation time of flag mining by up to 82%. We also outline the necessary conditions that reduced inputs must satisfy to qualify for use in flag mining.
Citations: 0
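The brute-force search the abstract contrasts with can be sketched as follows. The flags and the cost model are illustrative stand-ins; a real harness would compile and time the application for each combination, which is exactly the overhead that reduced inputs cut.

```python
import itertools

FLAGS = ["-O2", "-funroll-loops", "-fomit-frame-pointer"]  # illustrative GCC flags

def brute_force_mine(flags, run_time):
    """Try every subset of flags; return the fastest combination.
    run_time(combo) measures one run of the application built with combo."""
    best_combo, best_time = (), float("inf")
    for r in range(len(flags) + 1):
        for combo in itertools.combinations(flags, r):
            t = run_time(combo)
            if t < best_time:
                best_combo, best_time = combo, t
    return best_combo, best_time

# Stub cost model in place of real compile-and-run measurements:
# each flag shaves a fixed fraction off a 10-second baseline run.
SPEEDUP = {"-O2": 0.4, "-funroll-loops": 0.1, "-fomit-frame-pointer": 0.05}

def fake_run_time(combo):
    t = 10.0
    for f in combo:
        t *= 1.0 - SPEEDUP[f]
    return t

best, t = brute_force_mine(FLAGS, fake_run_time)
print(best, t)  # under this toy model, all three flags together win
```

The number of measurements grows as 2^n in the number of flags, which is why replacing full-input runs with reduced-input runs pays off so quickly.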
SkipLPA: An Efficient Label Propagation Algorithm for Community Detection in Sparse Network
Pub Date: 2016-10-21 | DOI: 10.1145/2998476.2998486
Sanjay B. Thakare, Arvind W. Kiwelekar
Abstract: The propagation phase of the label propagation algorithm is computationally intensive, and the overall performance of the algorithm depends on it. This phase determines the labels of all nodes by processing them recursively across the network. If certain nodes can be skipped rather than processing every node, the propagation phase speeds up. We propose SkipLPA, an efficient algorithm based on label propagation for discovering community structure in sparse networks. The initialization phase is split into two sub-phases: in the first, only certain nodes are initialized with unique labels; in the second, the remaining nodes receive initial labels from connected nodes and are excluded from the propagation phase. The algorithm is tested not only on benchmark networks but also on real-world networks, and efficiently recovers community structure. Its performance improves drastically without compromising the quality of the communities detected, and the number of iterations is reduced by skipping certain nodes during propagation.
Citations: 7
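For reference, the baseline label propagation algorithm that SkipLPA refines can be sketched as below. This is plain LPA, where every node gets a unique initial label and all nodes take part in propagation, not the paper's variant; the graph is a toy example.

```python
import random

def label_propagation(adj, max_iters=100, seed=0):
    """Plain LPA: each node starts with a unique label, then repeatedly
    adopts the most frequent label among its neighbours until stable."""
    rng = random.Random(seed)
    labels = {v: v for v in adj}  # unique initial labels
    for _ in range(max_iters):
        changed = False
        order = list(adj)
        rng.shuffle(order)  # asynchronous updates in random order
        for v in order:
            if not adj[v]:
                continue
            counts = {}
            for u in adj[v]:
                counts[labels[u]] = counts.get(labels[u], 0) + 1
            best = max(counts, key=counts.get)
            if counts[best] > counts.get(labels[v], 0):
                labels[v] = best
                changed = True
        if not changed:  # converged: every label is locally most frequent
            break
    return labels

# Two triangles joined by a single bridge edge (2-3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
labels = label_propagation(adj)
print(labels)
```

SkipLPA's point is that many of these per-node updates can be avoided by seeding labels from already-labelled neighbours during initialization and excluding those nodes from propagation.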
Efficient Multi-depth Querying on Provenance of Relational Queries Using Graph Database
Pub Date: 2016-10-21 | DOI: 10.1145/2998476.2998480
A. Rani, Navneet Goyal, S. Gadia
Abstract: Data provenance is the history associated with data: its origin, creation, processing, and archiving. In today's Internet era, it has gained significant importance for database analytics. Most provenance models store provenance information in relational databases for further querying and analysis. Although querying provenance in relational databases is very efficient for small datasets, it becomes inefficient as the provenance data grows and the traversal depth of the provenance query increases, mainly because of the growing number of join operations needed to search the entire provenance data. Graph databases provide an alternative to RDBMSs for storing and analyzing provenance data, as they can scale to billions of nodes while traversing thousands of relationships efficiently. In this paper, we propose efficient multi-depth querying of provenance data using graph databases. The proposed solution allows efficient querying of the provenance of current as well as historical queries. A comparison between relational and graph databases is presented for varying provenance data sizes and traversal depths. Graph databases are found to scale well with increasing depth of provenance queries, whereas in relational databases the querying time increases exponentially.
Citations: 6
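The multi-depth traversal at the heart of such queries amounts to a depth-limited walk over derived-from edges. A graph database does this natively (each extra depth is one more relationship hop rather than one more self-join); the logic can be sketched in plain Python, with illustrative tuple names.

```python
from collections import deque

def provenance_to_depth(edges, start, max_depth):
    """Breadth-first walk over derived-from edges, returning each
    ancestor with the depth at which it was first reached."""
    reached = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if reached[node] == max_depth:
            continue  # do not expand beyond the requested depth
        for parent in edges.get(node, []):
            if parent not in reached:
                reached[parent] = reached[node] + 1
                queue.append(parent)
    reached.pop(start)  # report ancestors only
    return reached

# Toy lineage: tuple t4 was derived from t2 and t3; t2 and t3 from t1.
edges = {"t4": ["t2", "t3"], "t2": ["t1"], "t3": ["t1"]}
print(provenance_to_depth(edges, "t4", 1))  # {'t2': 1, 't3': 1}
print(provenance_to_depth(edges, "t4", 2))  # {'t2': 1, 't3': 1, 't1': 2}
```

In a relational store each increment of `max_depth` typically costs another join over the whole provenance table, which is the exponential blow-up the paper measures.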
Distributed Decision Tree
Pub Date: 2016-10-21 | DOI: 10.1145/2998476.2998478
Ankit Desai, S. Chaudhary
Abstract: A decision tree is a tree-structured plan of a set of attributes to test in order to predict the output. MapReduce and Spark are programming models used for processing data on a distributed file system. In this paper, the MapReduce and Spark implementations of the decision tree are named Distributed Decision Tree (DDT) and Spark Tree (ST), respectively. Decision Tree (DT), Ensemble of Trees (BT), DDT, and ST are compared on accuracy, tree size, and the number of leaves of the generated tree(s). DDT and ST are empirically evaluated over 10 selected datasets. Using DDT, tree size is reduced by 71% and 82% compared to DT and BT, respectively; with ST, tree size is reduced by 48% and 67%. The number of leaves is reduced by 70% and 81% with respect to DT and BT using DDT, and by 45% and 65% with ST. We also evaluated DDT and ST using the Yahoo! Webscope dataset; our evaluation shows improved accuracy as well as reductions in tree size and number of leaves. Hence, DDT and ST outperform DT and BT on tree size and number of leaves while maintaining adequate classification accuracy.
Citations: 9
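The distributed part of building a decision tree is aggregating the per-attribute class counts that split selection needs across data partitions. A MapReduce-style sketch, with an illustrative toy dataset rather than the paper's implementation, looks like this:

```python
from collections import Counter
from functools import reduce

def mapper(rows):
    """Map step: per-partition (attribute_value, class) counts."""
    return Counter((row["outlook"], row["play"]) for row in rows)

def reducer(a, b):
    """Reduce step: merge partial counts from two partitions."""
    return a + b

# Two partitions of a toy weather dataset, combined as MapReduce would.
part1 = [{"outlook": "sunny", "play": "no"},
         {"outlook": "rain", "play": "yes"}]
part2 = [{"outlook": "sunny", "play": "yes"},
         {"outlook": "sunny", "play": "no"}]
counts = reduce(reducer, [mapper(part1), mapper(part2)])
print(counts[("sunny", "no")])  # 2
```

The merged counts are enough to score candidate splits (e.g. by information gain) without ever moving the raw rows off their partitions.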
Service Demand Modeling and Prediction with Single-user Performance Tests
Pub Date: 2016-10-21 | DOI: 10.1145/2998476.2998483
A. Kattepur, M. Nambiar
Abstract: Performance load tests of online transaction processing (OLTP) applications are expensive in terms of manpower, time, and cost. Alternative performance modeling and prediction tools are needed that generate accurate outputs from minimal input sample points. Service demands (the time needed to serve one request at a queuing station) are typically required as inputs by most performance models. However, because service demands vary as a function of workload, models that take a single service demand as input produce erroneous predictions. The alternative, collecting service demands at varying workloads, requires time- and resource-intensive load tests to estimate multiple sample points, which defeats the purpose of performance modeling for industrial use. In this paper, we propose a service demand model, expressed as a function of concurrency, that can be estimated from a single-user performance test. Further, we analyze multiple CPU performance metrics (cache hits/misses, branch prediction, context switches, and so on) using Principal Component Analysis (PCA) to extract a regression function of service demand over increasing workloads. We use these service demand models as input to performance prediction algorithms such as Mean Value Analysis (MVA) to accurately predict throughput at varying workloads. The service demand prediction model uses CPU hardware counters and is used in conjunction with a modified version of MVA that takes single-user service demand inputs. The predicted throughput values are within 9% of measurements procured for a variety of application/hardware configurations. Such a service demand model is a step toward reducing reliance on conventional load testing for performance assurance.
Citations: 1
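Exact MVA, which the paper feeds with predicted service demands, is a short recursion over population size. Below is the standard textbook algorithm for load-independent stations, not the paper's modified variant; the demand values are illustrative.

```python
def mva(demands, n_users, think_time=0.0):
    """Exact Mean Value Analysis for a closed queueing network.
    demands[k] is the service demand at station k (seconds/request);
    returns the predicted throughput at population n_users."""
    queue = [0.0] * len(demands)  # mean queue lengths at population 0
    throughput = 0.0
    for n in range(1, n_users + 1):
        # residence time at each station: demand inflated by queueing
        resp = [d * (1.0 + q) for d, q in zip(demands, queue)]
        total = sum(resp)
        throughput = n / (think_time + total)   # Little's law
        queue = [throughput * r for r in resp]  # queue lengths at population n
    return throughput

# Two stations (say CPU and disk) with service demands taken from a
# single-user test, as in the paper's setting; numbers are made up.
print(mva([0.10, 0.05], n_users=10))
```

Throughput from this recursion rises with the user count but can never exceed 1/max(demands), the bottleneck bound, which is why accurate per-workload demands matter so much for the prediction.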
Development of Indian Weighted Diabetic Risk Score (IWDRS) using Machine Learning Techniques for Type-2 Diabetes
Pub Date: 2016-10-21 | DOI: 10.1145/2998476.2998497
Omprakash Chandrakar, Jatinderkumar R. Saini
Abstract: Undetected pre-diabetes and late diagnosis are major problems in East Asian countries. Diabetes screening tools such as a Diabetes Risk Score (DRS) can effectively help detect and prevent the disease among pre-diabetic persons. Several risk scores for Type-2 diabetes have been proposed and are in use. The authors have observed certain issues in the available DRSs and advocate the need to address them. In this study, a novel Indian Weighted Diabetic Risk Score (IWDRS) is proposed. Machine learning techniques, namely distance-based clustering with Euclidean distance, the k-means algorithm, and discretization, are used to derive weighted risk scores for diabetes risk factors such as age, BMI, waist circumference, personal history, family history, diet, physical activity, stress, and quality of life. Result analysis shows that the proposed approach improves on existing approaches in the scientific literature.
Citations: 22
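One of the named building blocks, k-means used to discretize a continuous risk factor into bands that can then carry weighted scores, can be sketched in one dimension. The BMI values and the choice of k are illustrative, not taken from the study.

```python
def kmeans_1d(values, k, iters=50):
    """1-D k-means: one way to discretize a continuous risk factor
    (e.g. BMI) into bands for weighted scoring."""
    ordered = sorted(values)
    step = len(ordered) // k
    centers = [ordered[i * step] for i in range(k)]  # spread initial centers
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        new = [sum(c) / len(c) if c else centers[i]
               for i, c in enumerate(clusters)]
        if new == centers:  # converged
            break
        centers = new
    return sorted(centers)

bmi = [18.0, 19.5, 21.0, 27.0, 28.5, 30.0]
print(kmeans_1d(bmi, k=2))  # [19.5, 28.5]
```

Midpoints between the final centers give cut points for the discretized bands, and each band can then be assigned its weighted risk score.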
Topical Authoritative Answerer Identification on Q&A Posts using Supervised Learning in CQA Sites
Pub Date: 2016-10-21 | DOI: 10.1145/2998476.2998490
T. P. Sahu, N. K. Nagwani, Shrish Verma
Abstract: A Community Question Answering (CQA) site is an online platform that hosts information in question-and-answer form contributed by collaborating users worldwide. There are two basic types of users on CQA sites: askers, who post their queries as questions, and answerers, who provide answers to those questions. The semi-structured and growing content of CQA sites poses several challenges. Since there is no restriction on the number of answers posted to a question, a common challenge is identifying the authoritative answerers of a question in order to evaluate answer quality when selecting the best answer. In this paper, we use Latent Dirichlet Allocation (LDA), a statistical topic model, on textual data, together with statistical computation on metadata, to identify features that reflect the topical authority of an answerer. These features are represented as a vector for each answerer in the dataset under investigation, which is used to learn a classifier model. Several baseline classifier models are used to identify topically authoritative answerers on Q&A posts from two real datasets extracted from StackOverflow and AskUbuntu. The correctness and effectiveness of the classifier models are evaluated using parameters such as accuracy, precision, recall, and the kappa statistic. The experimental results show that the Random Forest classifier outperforms the other classification algorithms on every evaluation parameter.
Citations: 3
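The metadata side of the per-answerer feature vector can be sketched as below. The field names and the three features (answer count, acceptance ratio, mean score) are assumptions for illustration, not the paper's exact feature set; the topical LDA features would be concatenated alongside them.

```python
def answerer_features(posts, answerer):
    """Metadata features for one answerer: number of answers,
    acceptance ratio, and mean answer score (fields are illustrative)."""
    mine = [p for p in posts if p["answerer"] == answerer]
    n = len(mine)
    accepted = sum(1 for p in mine if p["accepted"])
    mean_score = sum(p["score"] for p in mine) / n if n else 0.0
    return [n, accepted / n if n else 0.0, mean_score]

# Toy Q&A metadata in the spirit of a StackOverflow dump.
posts = [
    {"answerer": "alice", "accepted": True, "score": 12},
    {"answerer": "alice", "accepted": False, "score": 3},
    {"answerer": "bob", "accepted": True, "score": 5},
]
print(answerer_features(posts, "alice"))  # [2, 0.5, 7.5]
```

One such vector per answerer, labelled authoritative or not, is what the baseline classifiers (Random Forest among them) are trained on.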
EEQuest: An Event Extraction and Query System
Pub Date: 2016-10-21 | DOI: 10.1145/2998476.2998482
Prerit Jain, H. Bendapudi, Shrisha Rao
Abstract: We present EEQuest, an application that extracts events from text using natural language processing (NLP) and supervised machine-learning techniques, and provides a system for querying the events extracted from a text corpus. We provide a use case for the application in which we extract business-related events from news articles. The extracted events are then categorized by the business organization or company they relate to. Finally, the events are added to a knowledge base on which a query system is built. The system can display events related to a particular organization or a group of organizations. Although we use the system to extract business-related events, the event extraction mechanism can be applied more generally to any available textual data, to extract any kind of event whose structure can answer the question: who did what, when, and where?
Citations: 0
Search Based Test Data Generation: A Multi Objective Approach using MOPSO Evolutionary Algorithm
Pub Date: 2016-10-21 | DOI: 10.1145/2998476.2998492
P. Gopi, M. Ramalingam, C. Arumugam
Abstract: Search-based test data generation plays an important role in software testing. Several search-based evolutionary algorithms are used to find optimal test data. Among them, the meta-heuristic Particle Swarm Optimization (PSO) algorithm is adopted for finding optimal test data for a given Software Under Test (SUT) because of its simplicity and fast convergence. The success of PSO as a single-objective optimizer in the literature has motivated its application to multi-objective optimization problems; hence, Multi-Objective Particle Swarm Optimization (MOPSO) is adopted for solving more than one objective. This work considers two objectives: maximizing branch coverage and minimizing test-suite size. A benchmark program is used for the experimental analysis with the MOPSO algorithm, performed using the MOTestGen tool. The extracted results portray the convergence and coverage performance in producing optimal test data as the population size increases.
Citations: 3
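Single-objective PSO, the basis that MOPSO extends with Pareto archiving, can be sketched as below. The branch-distance-style fitness, the inertia and acceleration constants, and the search bounds are all illustrative assumptions, not details from the paper.

```python
import random

def pso(fitness, dim, bounds, n_particles=20, iters=100, seed=1):
    """Minimal single-objective PSO (minimization): each particle tracks
    its personal best; the swarm tracks a global best."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration (common defaults)
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy branch-distance fitness: drive the input toward satisfying x == 42,
# standing in for covering one branch of the SUT.
best, val = pso(lambda p: abs(p[0] - 42.0), dim=1, bounds=(0.0, 100.0))
print(best, val)
```

MOPSO replaces the single global best with an archive of non-dominated solutions, so that coverage and suite size can be traded off along a Pareto front rather than collapsed into one fitness value.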