{"title":"Manifold Learning and Recognition of Human Activity Using Body-Area Sensors","authors":"Mi Zhang, A. Sawchuk","doi":"10.1109/ICMLA.2011.92","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.92","url":null,"abstract":"Manifold learning is an important technique for effective nonlinear dimensionality reduction in machine learning. In this paper, we present a manifold-based framework for human activity recognition using wearable motion sensors. In our framework, we use locally linear embedding (LLE) to capture the intrinsic structure and build nonlinear manifolds for each activity. A nearest-neighbor interpolation technique is then applied to learn the mapping function from the input space to the manifold space. Finally, activity recognition is performed by comparing trajectories of different activity manifolds in the manifold space. Experimental results validate the effectiveness of our framework and demonstrate that manifold learning is promising for the task of human activity recognition using wearable motion sensors.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129385111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relational Classifiers in a Non-relational World: Using Homophily to Create Relations","authors":"Sofus A. Macskassy","doi":"10.1109/ICMLA.2011.122","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.122","url":null,"abstract":"Research in the past decade on statistical relational learning (SRL) has shown the power of the underlying network of relations in relational data. Even models built using only relations often perform comparably to models built using sophisticated relational learning methods. However, many data sets -- such as those in the UCI machine learning repository -- contain no relations. In fact, many data sets either do not contain relations or have relations which are not helpful to a specific classification task. The question we investigate in this paper is whether it is possible to construct relations such that relational inference results in better classification performance than non-relational inference. Using simple similarity-based rules to create relations and weighting the strength of these relations using homophily on instance labels, we test whether relational inference techniques are applicable -- in other words, do they perform comparably to standard machine learning algorithms. We show, in an experimental study on 31 UCI benchmark data sets, that relational inference wins more than any of the 6 classifiers we compare against, including a transductive SVM, and that it wins the majority of the time when compared against any one of them.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131753252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extraction of Basic Patterns of Household Energy Consumption","authors":"Haoyang Shen, H. Hino, Noboru Murata, S. Wakao","doi":"10.1109/ICMLA.2011.68","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.68","url":null,"abstract":"Solar power, wind power, and co-generation (combined heat and power) systems are possible candidates for household power generation. These systems have their advantages and disadvantages. To propose the optimal combination of the power generation systems, the extraction of basic patterns of energy consumption of the house is required. In this study, energy consumption patterns are modeled by mixtures of Gaussian distributions. Then, using the symmetrized Kullback-Leibler divergence as a distance measure of the distributions, the basic pattern of energy consumption is extracted by means of hierarchical clustering. By an experiment using the Annex 42 dataset, it is shown that the proposed method is able to extract typical energy consumption patterns.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133867325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recognition of Segmented Online Arabic Handwritten Characters of the ADAB Database","authors":"S. Abdelazeem, Hany Ahmed","doi":"10.1109/ICMLA.2011.120","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.120","url":null,"abstract":"The aim of this work is to fill a void in the literature of Arabic handwriting recognition by studying the performance of different feature extraction methods on online segmented Arabic characters. The contribution of this paper is to introduce a large database of segmented online handwritten Arabic characters and report the performance of various feature extraction techniques on the segmented characters to serve as a benchmark for any future work on the problem of online Arabic characters recognition.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"19 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125689542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Detection of Excessive Glycemic Variability for Diabetes Management","authors":"Matthew Wiley, Razvan C. Bunescu, C. Marling, J. Shubrook, F. Schwartz","doi":"10.1109/ICMLA.2011.39","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.39","url":null,"abstract":"Glycemic variability, or fluctuation in blood glucose levels, is a significant factor in diabetes management. Excessive glycemic variability contributes to oxidative stress, which has been linked to the development of long-term diabetic complications. An automated screen for excessive glycemic variability, based on the readings from continuous glucose monitoring (CGM) systems, would enable early identification of at-risk patients. In this paper, we present an automatic approach for learning variability models that can routinely detect excessive glycemic variability when applied to CGM data. Naive Bayes (NB), Multilayer Perceptron (MP), and Support Vector Machine (SVM) models are trained and evaluated on a dataset of CGM plots that have been manually annotated with respect to glycemic variability by two diabetes experts. In order to alleviate the impact of noise, the CGM plots are smoothed using cubic splines. Automatic feature selection is then performed on a rich set of pattern recognition features. Empirical evaluation shows that the top performing model obtains a state-of-the-art accuracy of 93.8%, substantially outperforming a previous NB model.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131461622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Optimization of Logistic Regression by Direct Use of Conjugate Gradient","authors":"Kenji Watanabe, Takumi Kobayashi, N. Otsu","doi":"10.1109/ICMLA.2011.63","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.63","url":null,"abstract":"In classification problems, logistic regression (LR) is used to estimate posterior probabilities. The objective function of LR is usually minimized by a Newton-Raphson method such as iteratively reweighted least squares (IRLS). There, the inverse Hessian matrix must be calculated in each iteration step. Thus, the computational cost of optimizing LR increases significantly as the input data becomes large. To reduce the computational cost, we propose a novel optimization method for LR that directly uses the non-linear conjugate gradient (CG) method. The proposed method iteratively minimizes the objective function of LR without calculating the Hessian matrix. Furthermore, to reduce the number of iterations efficiently, the step size in the non-linear CG iteration is optimized, avoiding an ad hoc line search, and initial values are set by ordinary linear regression analysis. In the experimental results, our method performs about 200 times faster than the other methods for a large scale dataset.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132989758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning with Guaranteed Label Quality","authors":"Eileen A. Ni, C. Ling","doi":"10.1109/ICMLA.2011.88","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.88","url":null,"abstract":"In supervised learning, label quality is crucial for learning performance. However, noise is ubiquitous in labels provided by oracles in active learning. To rule out its negative influence, multiple-oracles have been proposed. However, unrealistic assumptions (such as the evenly distributed noise level of oracles) have been made to restrict the learning algorithms for real-world applications. In this paper, we propose a learning algorithm, c-certainty, to guarantee the label quality, and allow the noise level of oracles to be example-dependent. Furthermore, we develop an effective learning algorithm which is able to select the more accurate oracles to query. The experiment results show that the learning strategy developed in this paper outperforms other learning algorithms significantly.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133150667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Time Window Size to Track Concept Drift","authors":"M. S. Mouchaweh, J. Zaytoon, P. Billaudel","doi":"10.1109/ICMLA.2011.26","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.26","url":null,"abstract":"This paper proposes an approach to track concept drift in order to improve the classifier performance. This approach uses an adaptive time window size in order to detect a drift according to its dynamics (slow/moderate/fast). The goal is to update the classifier using a sufficient number of patterns related to environment changes. Since the classifier may misclassify drifted patterns with its old parameters, an expert is asked to provide the true class label for these patterns. This approach is used to detect, at an early stage, a leak in the steam generator of Prototype Fast Reactor nuclear power generators.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117263076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating Distributional Semantic and Feature Selection for Extracting Relationships from Biological Text","authors":"Ehsan Emadzadeh, Siddhartha R. Jonnalagadda, Graciela Gonzalez","doi":"10.1109/ICMLA.2011.65","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.65","url":null,"abstract":"The constant flow of biomolecular findings being published each day challenges our ability to develop methods to automatically extract the knowledge expressed in text to potentially influence new discoveries. Finding relations between the biological entities (e.g. proteins and genes) in text is a challenging task. To facilitate the extraction process, a relation can be decomposed into a trigger and the complementary arguments (e.g. theme, site). Several approaches have been proposed based on machine learning which generally use a common set of features for all trigger types. Here we evaluate the impact of applying a feature selection method for trigger classification. Our proposed method uses a greedy feature selection algorithm to find an optimal set of attributes for each trigger type. We show that using the customized set of features can improve classification results significantly (up to 53.96% in f-measure). In addition, we evaluated different settings for including semantic features in the classifiers. We found that using semantic features can improve classification results and found the best setting for each trigger type.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"33 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125097275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The ROC-Boost Design Algorithm for Asymmetric Classification","authors":"Guido Cesare, R. Manduchi","doi":"10.1109/ICMLA.2011.142","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.142","url":null,"abstract":"In many situations (e.g., cascaded classification), it is desirable to design a classifier with precise constraints on its detection rate or on its false positive rate. We introduce ROC-Boost, a modification of the AdaBoost design algorithm that produces asymmetric classifiers with guaranteed detection rate and low false positive rates. Tested in a visual text detection task, ROC-Boost was shown to perform competitively against other popular algorithms.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130680748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}