{"title":"Predicting Binding Sites in the Mouse Genome","authors":"Yi Sun, M. Robinson, R. Adams, N. Davey, A. Rust","doi":"10.1109/ICMLA.2007.28","DOIUrl":"https://doi.org/10.1109/ICMLA.2007.28","url":null,"abstract":"The identification of cis-regulatory binding sites in DNA in multicellular eukaryotes is a particularly difficult problem in computational biology. To obtain a full understanding of the complex machinery embodied in genetic regulatory networks it is necessary to know both the identity of the regulatory transcription factors together with the location of their binding sites in the genome. We show that using an SVM together with data sampling, to integrate the results of individual algorithms specialised for the prediction of binding site locations, can produce significant improvements upon the original algorithms applied to the mouse genome. These results make more tractable the expensive experimental procedure of actually verifying the predictions.","PeriodicalId":448863,"journal":{"name":"Sixth International Conference on Machine Learning and Applications (ICMLA 2007)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115475558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recognition of ultrasonic multi-echo sequences for autonomous symbolic indoor tracking","authors":"André Stuhlsatz","doi":"10.1109/ICMLA.2007.30","DOIUrl":"https://doi.org/10.1109/ICMLA.2007.30","url":null,"abstract":"This paper presents an autonomous symbolic indoor tracking system for ubiquitous computing applications. The proposed approach is based upon the assumption that topologically discriminable information can be assigned explicitly to different spaces of a given indoor environment. On that assumption, continuous time-of-flight (ToF) measurements of echo-bursts obtained from four orthogonally and coplanarly mounted ultrasonic transducers are used to learn a stochastic room model. While the individual acoustic representation of space is captured using Gaussian mixture densities, the stochastic variabilities in the moving direction of a person are modeled by hidden-Markov-models (HMMs). Experiments within a six room environment resulted in a room recognition rate of 92.21% and a room sequence recognition rate of 66.00% without any pre-fixed devices.","PeriodicalId":448863,"journal":{"name":"Sixth International Conference on Machine Learning and Applications (ICMLA 2007)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127497001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web based machine learning for language identification and translation","authors":"Ş. Sağiroğlu, U. Yavanoglu, Esra Nergis Guven","doi":"10.1109/ICMLA.2007.27","DOIUrl":"https://doi.org/10.1109/ICMLA.2007.27","url":null,"abstract":"Language identification is an important task for Web information retrieval services. This paper presents the implementation of a platform for language identification in multi-lingual documents on the Web. The platform consists of five modules to achieve the tasks automatically. Furthermore, artificial neural networks were used for the identification of languages in multi-lingual documents. Results for six languages including Turkish, French, Italian, Danish and German are presented. The major benefit of the approach is that the ANN based language identification system could meet the expectations in real-time language identification accuracy with the help of a developed system. Experiments have shown that the system achieves the tasks with high accuracy in discriminating different languages and converting them to other languages on Web pages.","PeriodicalId":448863,"journal":{"name":"Sixth International Conference on Machine Learning and Applications (ICMLA 2007)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124942886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Itemset-Driven Cluster-Oriented Approach to Extract Compact and Meaningful Sets of Association Rules","authors":"C. H. Yamamoto, Maria Cristina Ferreira de Oliveira, M. L. Fujimoto, S. O. Rezende","doi":"10.1109/ICMLA.2007.45","DOIUrl":"https://doi.org/10.1109/ICMLA.2007.45","url":null,"abstract":"Extracting association rules from large datasets typically results in a huge amount of rules. An approach to tackle this problem is to filter the resulting rule set, which reduces the rules, at the cost of also eliminating potentially interesting ones. In exploring a new dataset in search of relevant associations, it may be more useful for miners to have an overview of the space of rules obtainable from the dataset, rather than getting an arbitrary set satisfying high values for given interest measures. We describe a rule extraction approach that favors rule diversity, allowing miners to gain an overview of the rule space while reducing semantic redundancy within the rule set. This approach adopts an itemset-driven rule generation coupled with a cluster-based filtering process. The set of rules so obtained provides a starting point for a user-driven exploration of it.","PeriodicalId":448863,"journal":{"name":"Sixth International Conference on Machine Learning and Applications (ICMLA 2007)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123789043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting building contamination using machine learning","authors":"Shawn Martin, S. McKenna","doi":"10.1109/ICMLA.2007.12","DOIUrl":"https://doi.org/10.1109/ICMLA.2007.12","url":null,"abstract":"Potential events involving biological or chemical contamination of buildings are of major concern in the area of homeland security. Tools are needed to provide rapid, on-site predictions of contaminant levels given only approximate measurements in limited locations throughout a building. In principle, such tools could use calculations based on physical process models to provide accurate predictions. In practice, however, physical process models are too complex and computationally costly to be used in a real-time scenario. In this paper, we investigate the feasibility of using machine learning to provide easily computed but approximate models that would be applicable in the field. We develop a machine learning method based on support vector machine regression and classification. We apply our method to problems of estimating contamination levels and contaminant source location.","PeriodicalId":448863,"journal":{"name":"Sixth International Conference on Machine Learning and Applications (ICMLA 2007)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126844120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Maximum Likelihood Quantization of Genomic Features Using Dynamic Programming","authors":"Mingzhou Song, R. Haralick, S. Boissinot","doi":"10.1109/ICMLA.2007.36","DOIUrl":"https://doi.org/10.1109/ICMLA.2007.36","url":null,"abstract":"Dynamic programming is introduced to quantize a continuous random variable into a discrete random variable. Quantization is often useful before statistical analysis or reconstruction of large network models among multiple random variables. The quantization, through dynamic programming, finds the optimal discrete representation of the original probability density function of a random variable by maximizing the likelihood for the observed data. This algorithm is highly applicable to study genomic features such as the recombination rate across the chromosomes and the statistical properties of non-coding elements such as LINE1. In particular, the recombination rate obtained by quantization is studied for LINE1 elements that are grouped also using quantization by length. The exact and density-preserving quantization approach provides an alternative superior to the inexact and distance-based k-means clustering algorithm for discretization of a single variable.","PeriodicalId":448863,"journal":{"name":"Sixth International Conference on Machine Learning and Applications (ICMLA 2007)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122641866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feature extraction using random matrix theory approach","authors":"V. Rojkova, M. Kantardzic","doi":"10.1109/ICMLA.2007.95","DOIUrl":"https://doi.org/10.1109/ICMLA.2007.95","url":null,"abstract":"Feature extraction involves simplifying the amount of resources required to describe a large set of data accurately. In this paper, we propose to broaden the feature extraction algorithms with Random Matrix Theory methodology. Testing the cross-correlation matrix of variables against the null hypothesis of random correlations, we can derive characteristic parameters of the system, such as boundaries of eigenvalue spectra of random correlations, distribution of eigenvalues and eigenvectors of random correlations, inverse participation ratio and stability of eigenvectors of non-random correlations. We demonstrate the usefulness of these parameters for network traffic application, in particular, for network congestion control and for detection of any changes in the stable traffic dynamics.","PeriodicalId":448863,"journal":{"name":"Sixth International Conference on Machine Learning and Applications (ICMLA 2007)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122586842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An agent based system for california electricity market: a perspective of myopic machine learning","authors":"T. Sueyoshi, G. R. Tadiparthi","doi":"10.1109/ICMLA.2007.83","DOIUrl":"https://doi.org/10.1109/ICMLA.2007.83","url":null,"abstract":"In recent years, an agent based system is widely adopted to model a deregulated electricity market. [1] and [2] have developed a multi-agent intelligent simulator (MAIS) to model the structure of US wholesale market. The methodological practicality was confirmed with a simulation study and a real data set from PJM electricity market. In our proposed artificial wholesale market, the agents are equipped with limited reinforcement learning capabilities. We validate the agent based model with the help of six data sets from the California electricity market. The performance of the MAIS is compared with other well-known methods, using a real data set on power trading related to the California electricity (2000-2001).","PeriodicalId":448863,"journal":{"name":"Sixth International Conference on Machine Learning and Applications (ICMLA 2007)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129135691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalized Sequence Signatures through Symbolic Clustering","authors":"D. Dorr, A. Denton","doi":"10.1109/ICMLA.2007.41","DOIUrl":"https://doi.org/10.1109/ICMLA.2007.41","url":null,"abstract":"Traditionally sequence motifs and domains, also called signatures, are defined such that insertions, deletions and mismatched regions are small compared with matched regions. We introduce an algorithm for the identification of generalized sequence signatures that can be composed of windows distributed throughout the sequence. We use an approach that is based on clustering analysis of recurring subsequences, to which we refer as symbols, of a predefined length. Symbols are not required to be located in close proximity to each other. The clustering algorithm groups sequences so as to maximize the number of shared symbols among sequences. We evaluate our signatures in comparison to those obtained from the InterPro database, and show that our approach has benefits for deriving sequence annotations compared with InterPro's signatures.","PeriodicalId":448863,"journal":{"name":"Sixth International Conference on Machine Learning and Applications (ICMLA 2007)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134116001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new time series prediction algorithm based on moving average of nth-order difference","authors":"Yang Lan, D. Neagu","doi":"10.1109/ICMLA.2007.7","DOIUrl":"https://doi.org/10.1109/ICMLA.2007.7","url":null,"abstract":"As a typical research topic, time series analysis and prediction face a continuously rising interest and have been widely applied in various domains. Current approaches focus on a large number of data collections, using mathematics, statistics and artificial intelligence methods, to process and make a prediction on the next most probable value. This paper proposes a new algorithm using moving average of nth-order difference to predict the next term for pseudo-periodical time series. We use artificial neural networks (ANNs) and range evaluation for error in a hybrid model to extend our prediction method further. The algorithm performances are reported on case studies on monthly average sunspot number data set and earthquake data set.","PeriodicalId":448863,"journal":{"name":"Sixth International Conference on Machine Learning and Applications (ICMLA 2007)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127608988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}