SMUC '10: Latest Papers

Extracting emotion topics from blog sentences: use of voting from multi-engine supervised classifiers
SMUC '10 Pub Date : 2010-10-30 DOI: 10.1145/1871985.1872004
Dipankar Das, Sivaji Bandyopadhyay
Abstract: This paper presents a supervised multi-engine classifier approach followed by voting to identify emotion topic(s) from English blog sentences. Manual annotation of the English blog sentences in the training set has shown a satisfactory agreement with kappa (κ) measure of 0.85 and MASI (Measure of Agreement on Set-valued Items) measure of 0.82 for emotion topic spans. The baseline system based on object related dependency relations includes the topic oriented thematic roles present in the verb based syntactic frame of the sentences. In contrast, the supervised approach consists of three classifiers, Conditional Random Field (CRF), Support Vector Machine (SVM) and a Fuzzy Classifier (FC). The important features are incorporated based on the ablation study of all features and Information Gain Based Pruning (IGBP) on the development set. One or more emotion topics associated with focused target span are identified based on the majority voting of the classifiers. The supervised multi-engine classifier system has been evaluated with average F-scores of 70.51% and 90.44% for emotion topic and target span identification respectively on 500 gold standard test sentences and has outperformed the baseline system.
Citations: 17
A weighted tag similarity measure based on a collaborative weight model
SMUC '10 Pub Date : 2010-10-30 DOI: 10.1145/1871985.1871999
Gokavarapu Srinivas, Niket Tandon, Vasudeva Varma
Abstract: The problem of measuring semantic relatedness between social tags remains largely open. Given the structure of social bookmarking systems, similarity measures need to be addressed from a social bookmarking systems perspective. We address the fundamental problem of weight model for tags over which every similarity measure is based. We propose a weight model for tagging systems that considers the user dimension unlike existing measures based on tag frequency. Visual analysis of tag clouds depicts that the proposed model provides intuitively better scores for weights than tag frequency. We also propose weighted similarity model that is conceptually different from the contemporary frequency based similarity measures. Based on the weighted similarity model, we present weighted variations of several existing measures like Dice and Cosine similarity measures. We evaluate the proposed similarity model using Spearman's correlation coefficient, with WordNet as the gold standard. Our method achieves 20% improvement over the traditional similarity measures like dice and cosine similarity and also over the most recent tag similarity measures like mutual information with distributional aggregation. Finally, we show the practical effectiveness of the proposed weighted similarity measures by performing search over tagged documents using Social SimRank over a large real world dataset.
Citations: 26
How to interpret the helpfulness of online product reviews: bridging the needs between customers and designers
SMUC '10 Pub Date : 2010-10-30 DOI: 10.1145/1871985.1872000
Jian Jin, Ying Liu
Abstract: Helpful reviews are the valuable voice of the customer which benefit both consumers and product designers. On e-commerce websites, consumers are usually encouraged to rate whether a review is helpful or not. As consumers are not obligated to vote reviews, usually only a small proportion of product reviews eventually receive a voting. Also, existing evaluation methods that only use the review voting ratio from customers as the helpfulness are often not consistent with the designers' rating on reviews in interpreting customer needs and preferences. Thus, in this paper, the focus is on how to automatically build the connection between online customer's voting and designer's rating and predict the customer reviews' helpfulness based on the review content. We start the study by building a mapping to express product designers' rating using online helpfulness voting. Further, we propose to utilize regression algorithm to predict the online review's helpfulness with the help of several categories of features extracted from review content. Our experimental study, using a large amount of review data crawled from Amazon and real ratings from product designers confirms the effectiveness of our proposal and shows some very promising results.
Citations: 30
Classifying latent user attributes in Twitter
SMUC '10 Pub Date : 2010-10-30 DOI: 10.1145/1871985.1871993
D. Rao, David Yarowsky, Abhishek Shreevats, Manaswi Gupta
Abstract: Social media outlets such as Twitter have become an important forum for peer interaction. Thus the ability to classify latent user attributes, including gender, age, regional origin, and political orientation solely from Twitter user language or similar highly informal content has important applications in advertising, personalization, and recommendation. This paper includes a novel investigation of stacked-SVM-based classification algorithms over a rich set of original features, applied to classifying these four user attributes. It also includes extensive analysis of features and approaches that are effective and not effective in classifying user attributes in Twitter-style informal written genres as distinct from the other primarily spoken genres previously studied in the user-property classification literature. Our models, singly and in ensemble, significantly outperform baseline models in all cases. A detailed analysis of model components and features provides an often entertaining insight into distinctive language-usage variation across gender, age, regional origin and political orientation in modern informal communication.
Citations: 686
Characterization of the Twitter @replies network: are user ties social or topical?
SMUC '10 Pub Date : 2010-10-30 DOI: 10.1145/1871985.1871996
Daniel Sousa, L. Sarmento, E. M. Rodrigues
Abstract: In recent years, social media services have become a global phenomenon on the Internet. The popularity of these services provides an opportunity to study the characteristics of online social networks and the communities that emerge in them. This paper presents an analysis of the users' interactions in the implicit network derived from tweet replies of a specific dataset obtained from a popular micro-blogging service, Twitter. We analyze the influence of the topics of the tweet messages on the interaction among users, to determine if the social aspect prevails over the topic in the moment of interaction. Thus, the main goal of this paper is to investigate if people selectively choose whom to reply to based on the topic or, otherwise, if they reply to anyone about anything. We found that the social aspect predominantly conditions users' interactions. For users with larger and denser ego-centric networks, we observed a slight tendency for separating their connections depending on the topics discussed.
Citations: 86
A formal study of classification techniques on entity discovery and their application to opinion mining
SMUC '10 Pub Date : 2010-10-30 DOI: 10.1145/1871985.1871992
Shadi Banitaan, Saeed Salem, Wei Jin, Ibrahim Aljarah
Abstract: Entity discovery has become an important topic of study in recent years due to its wide range of applications. In this paper, we focus on examining the effectiveness of various classification techniques on entity discovery and their application to the opinion mining task. The initial and most important step in opinion mining is to identify and extract highly specific product related and opinion related entities from product reviews. We formulate this problem as a classification task and present a comprehensive study of classification techniques on identifying entities of interest. The impacts of linguistic features such as part-of-speech (POS), and context features such as surrounding contextual clues of words on the classification performance are carefully evaluated. The experimental results show that good classification performance is closely related to the use of classification techniques, linguistic features, and context features. The evaluation is presented based on processing the online product reviews from Amazon.
Citations: 6
Entity-relationship queries over Wikipedia
SMUC '10 Pub Date : 2010-10-30 DOI: 10.1145/1871985.1871991
Xiaonan Li, Chengkai Li, Cong Yu
Abstract: Wikipedia is the largest user-generated knowledge base. We propose a structured query mechanism, entity-relationship query, for searching entities in Wikipedia corpus by their properties and inter-relationships. An entity-relationship query consists of arbitrary number of predicates on desired entities. The semantics of each predicate is specified with keywords. Entity-relationship query searches entities directly over text rather than pre-extracted structured data stores. This characteristic brings two benefits: (1) Query semantics can be intuitively expressed by keywords; (2) It avoids information loss that happens during extraction. We present a ranking framework for general entity-relationship queries and a position-based Bounded Cumulative Model for accurate ranking of query answers. Experiments on INEX benchmark queries and our own crafted queries show the effectiveness and accuracy of our ranking method.
Citations: 23
Mining social tags to predict mashup patterns
SMUC '10 Pub Date : 2010-10-30 DOI: 10.1145/1871985.1871998
Khaled Goarany, Gregory Kulczycki, M. Brian Blake
Abstract: In the past few years, tagging has gained large momentum as a user-driven approach for categorizing and indexing content on the Web. Mashups have recently joined the list of Web resources targeted for social tagging. In the context of the social Web, a mashup is a lightweight technique for integrating applications and data over the Web. Crafting new mashups is largely a subjective process motivated by the users' initial inspiration. In this paper, we propose a tag-based approach for predicting mashup patterns, thus deriving inspiration for potential new mashups from the community's consensus. Our approach applies association rule mining techniques to discover relationships between APIs and mashups based on their annotated tags. We also advocate the importance of the mined relationships as a valuable source for recommending mashup candidates while mitigating for common problems in recommender systems. We evaluate our methodology through experimentation using real-life dataset. Our results show that our approach achieves high prediction accuracy and outperforms a direct string matching approach that lacks the mining information.
Citations: 26
Cross-media impact on Twitter in Japan
SMUC '10 Pub Date : 2010-10-30 DOI: 10.1145/1871985.1872003
Sayaka Akioka, N. Kato, Y. Muraoka, H. Yamana
Abstract: Twitter, a microblogging service, is now grabbing attention of people as a new channel. For deep understanding of this new service, this paper reports the characteristics of Twitter users in Japan, and the impact of media such as publications, and TV programs on Twitter community. To the best of our knowledge, this paper is the first to analyze mutual impact between Twitter, and other media quantitatively. In order for the analyses, we crawled user profiles whose language setting is Japanese, and conducted several analysis with well-known methodologies as conventional work did. We confirmed the characteristics of the collected user profiles. We observed the distributions of the number of friends, and the number of follows both follow power-law, and there exists the correlation between the number of friends, and the number of follows. Besides the collected user profiles, we also utilized closed caption data of TV programs in Japan, and other information on media picked up Twitter. We run a batch of matching these data outside Twitter with the collected user profiles, and concluded Twitter has been already widely spread among Japanese people, however, media have still huge impact on the growth of Twitter users. We also conjectured the impact is not one-sided, however, is mutual influence between Twitter, and other media.
Citations: 11
Spam detection with a content-based random-walk algorithm
SMUC '10 Pub Date : 2010-10-30 DOI: 10.1145/1871985.1871994
F. Javier Ortega, C. Macdonald, J. A. Troyano, Fermín L. Cruz
Abstract: In this work we tackle the problem of the spam detection on the Web. Spam web pages have become a problem for Web search engines, due to the negative effects that this phenomenon can cause in their retrieval results. Our approach is based on a random-walk algorithm that obtains a ranking of pages according to their relevance and their spam likelihood. We introduce the novelty of taking into account the content of the web pages to characterize the web graph and to obtain an a-priori estimation of the spam likelihood of the web pages. Our graph-based algorithm computes two scores for each node in the graph. Intuitively, these values represent how bad or good (spam-like or not) is a web page, according to its textual content and the relations in the graph. Our experiments show that our proposed technique outperforms other link-based techniques for spam detection.
Citations: 9