{"title":"Algorithms for Fast Large Scale Data Mining Using Logistic Regression","authors":"Omid Rouhani-Kalleh","doi":"10.1109/CIDM.2007.368867","DOIUrl":"https://doi.org/10.1109/CIDM.2007.368867","url":null,"abstract":"This paper proposes two new efficient algorithms to train logistic regression classifiers using very large data sets. Our algorithms will lower the upper bound time complexity that the existing algorithm in the literature has and our experiments confirm that our proposed algorithms significantly improve the execution time. For our data sets, which come from Microsoft's Web logs, the execution time was reduced up to 353 times as compared to the algorithm often referenced in the literature. The improvement will be even greater for larger data sets","PeriodicalId":423707,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Data Mining","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126161042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Designing a Fuzzy Rule Based System to Estimate Depth of Anesthesia","authors":"V. Esmaeili, A. Assareh, M. Shamsollahi, M. Moradi, N. Arefian","doi":"10.1109/CIDM.2007.368942","DOIUrl":"https://doi.org/10.1109/CIDM.2007.368942","url":null,"abstract":"Estimating the depth of anesthesia (DOA) is still a challenging area in anesthesia research. The objective of this study was to design a fuzzy rule based system which integrates electroencephalogram (EEG) features to quantitatively estimate the DOA. The proposed method is based on the analysis of single-channel EEG using frequency and time domain features as well as Shannon entropy measure. The fuzzy classifier is trained with features obtained from four subsets of data comprising well-defined anesthesia states: awake, moderate, general anesthesia, and isoelectric. The classifier extracts efficient fuzzy if-then rules and the DOA index is derived between 100 (full awake) to 0 (isoelectric) using fuzzy inference engine. To validate the proposed method, a clinical study was conducted on 22 patients to construct 4 subsets of reference states and also to compare the results with CSM monitor (Danmeter, Denmark), which has revealed satisfactory correlation with clinical assessments","PeriodicalId":423707,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Data Mining","volume":"28 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125687576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Quantitative Method for Analyzing Scan Path Data Obtained by Eye Tracker","authors":"H. Takeuchi, Yoshiko Habuchi","doi":"10.1109/CIDM.2007.368885","DOIUrl":"https://doi.org/10.1109/CIDM.2007.368885","url":null,"abstract":"Scan path is one of the most important metrics measured by eye tracking systems. This paper describes a new method for analyzing scan-path data based on the string-edit method that is popular for correcting human errors made at the input stage. We defined several cost functions for the substitution costs in the string-edit method, and applied the method to the scan-path data we had collected in a series of experiments for studying Web browsing behavior. We demonstrate the usefulness of our method and discuss the appropriate cost functions for the eye-tracking data.","PeriodicalId":423707,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Data Mining","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128565029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finding Similarity Relations in Presence of Taxonomic Relations in Ontology Learning Systems","authors":"A. Vazifedoost, F. Oroumchian, M. Rahgozar","doi":"10.1109/CIDM.2007.368875","DOIUrl":"https://doi.org/10.1109/CIDM.2007.368875","url":null,"abstract":"Ontology learning tries to find ontological relations, by an automatic process. Similarity relationships are one of non-taxonomic relations which may be included in ontology. Our idea is that in presence of taxonomic relations we are able to extract more useful non-taxonomic similarity relations. In this paper we investigate the specifications of an implemented system for extracting these relations by means of new context extraction method which uses taxonomic relations","PeriodicalId":423707,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Data Mining","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121501873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of Multi-Singing Karaoke System and its Application in Customer Finance-aided Service on Internet","authors":"Jian-Hong Wang, Shih-Chuan Feng, Jen-Yi Pan","doi":"10.1109/CIDM.2007.368953","DOIUrl":"https://doi.org/10.1109/CIDM.2007.368953","url":null,"abstract":"Financial engineering nowadays is heavily depending on the real-time investment decision-support system while providing the suggestions to investors for rebalancing their portfolios. Because of the rapid growth of interactive technologies in Internet, on-line customer finance-aided service (CFAS) is playing an important role to help discussing, exchanging valuable information or enjoying entertainment each other simultaneously based on the platform of multi-singing karaoke system. Today's online singing software allows only one-microphone performance. Duo singers can only take turns and use the same microphone. Also, they cannot hear each other simultaneously. This is one big disadvantage of online multi-singing software, hence brings the problems to the quality of CFAS. Therefore, the main purpose of this study is to create a system that allows several singers to sing in different places simultaneously and hear the other singers' voices at the same time. This system catches and analyzes the singers' signals from remote computer sources through Media Server. Also by network broadcasting, this system transmits the signals from the inviter's end to the invitee's end, which is deeply improving the quality of CFAS","PeriodicalId":423707,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Data Mining","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121543171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying Anatomical Phrases in Clinical Reports by Shallow Semantic Parsing Methods","authors":"Vijayaraghavan Bashyam, R. Taira","doi":"10.1109/CIDM.2007.368874","DOIUrl":"https://doi.org/10.1109/CIDM.2007.368874","url":null,"abstract":"Natural language processing (NLP) is being applied for several information extraction tasks in the biomedical domain. The unique nature of clinical information requires the need for developing an NLP system designed specifically for the clinical domain. We describe a method to identify semantically coherent phrases within clinical reports. This is an important step towards full syntactic parsing within a clinical NLP system. We use this semantic phrase chunker to identify anatomical phrases within radiology reports related to the genitourinary domain. A discriminative classifier based on support vector machines was used to classify words into one of five phrase classification categories. Training of the classifier was performed using 1000 hand-tagged sentences from a corpus of genitourinary radiology reports. Features used by the classifier include n-grams, syntactic tags and semantic labels. Evaluation was conducted on a blind test set of 250 sentences from the same domain. The system achieved overall performance scores of 0.87 (precision), 0.91 (recall) and 0.89 (balanced f-score). Anatomical phrase extraction can be rapidly and accurately accomplished","PeriodicalId":423707,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124539932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ADMIRAL: A Data Mining Based Financial Trading System","authors":"Gil Rachlin, Mark Last, Dima Alberg, A. Kandel","doi":"10.1109/CIDM.2007.368947","DOIUrl":"https://doi.org/10.1109/CIDM.2007.368947","url":null,"abstract":"This paper presents a novel framework for predicting stock trends and making financial trading decisions based on a combination of data and text mining techniques. The prediction models of the proposed system are based on the textual content of time-stamped Web documents in addition to traditional numerical time series data, which is also available from the Web. The financial trading system based on the model predictions (ADMIRAL) is using three different trading strategies. In this paper, the ADMIRAL system is simulated and evaluated on real-world series of news stories and stocks data using the C4.5 decision tree induction algorithm. The main performance measures are the predictive accuracy of the induced models and, more importantly, the profitability of each trading strategy using these predictions","PeriodicalId":423707,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Data Mining","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127976003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatial Data Mining for Optimized Selection of Facility Locations in Field-based Services","authors":"A. Zarnani, M. Rahgozar, C. Lucas, F. Taghiyareh","doi":"10.1109/CIDM.2007.368949","DOIUrl":"https://doi.org/10.1109/CIDM.2007.368949","url":null,"abstract":"Spatial data mining has been developed as the effective technique in many applications that involve large amounts of geo-spatial data. Many organizations provide field-based services such as delivery, field-services and emergency to their customers. Considering the geographical distribution of the customer request points, the location of facilities will have noticeable impact on the overall efficiency of the company's operations. The closer the facilities are to the customers, the sooner and cheaper will be the service provision transaction. In this paper, we empirically study the role of spatial clustering methods in such context. We have implemented and tuned some of the main spatial clustering algorithms to discover the best locations for facility establishment. A new spatial clustering algorithm is proposed that does not require the number of facilities as input. The new algorithm will determine the optimal number of facilities along with their locations based on the business context trade-offs. Many experiments are conducted to study the performance of the studied algorithms on real world and synthetic data sets. The results reveal valuable distinctions between the different methods and confirm the higher efficiency of the proposed algorithm.","PeriodicalId":423707,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Data Mining","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134295703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining Subspace Correlations","authors":"R. Harpaz, R. Haralick","doi":"10.1109/CIDM.2007.368893","DOIUrl":"https://doi.org/10.1109/CIDM.2007.368893","url":null,"abstract":"In recent applications of clustering such as gene expression microarray analysis, collaborative filtering, and Web mining, object similarity is no longer measured by physical distance, but rather by the behavior patterns objects manifest or the magnitude of correlations they induce. Current state of the art algorithms aiming at this type of clustering typically postulate specific cluster models that are able to capture only specific behavior patterns or correlations, and omit the possibility that other information carrying patterns or correlations may coexist in the data. We cast the problem of searching for pattern clusters or clusters that induce large correlations in some subset of features into the problem of searching for groups of points embedded in lines. The advantage of this approach is that it allows the clustering of different patterns or correlations simultaneously. It also allows the clustering of patterns and correlations that are overlooked by existing methods. A formal stochastic line cluster model is presented and its connection to correlation is established. Based on this model, an algorithm that uses feature selection to search for line clusters embedded in subspaces of the data is presented","PeriodicalId":423707,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Data Mining","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127795724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exchange Rates Forecasting Using a Hybrid Fuzzy and Neural Network Model","authors":"An-Pin Chen, Hsio-Yi Lin","doi":"10.1109/CIDM.2007.368952","DOIUrl":"https://doi.org/10.1109/CIDM.2007.368952","url":null,"abstract":"Artificial neural networks (ANNs) are promising approaches for financial time series prediction and have been widely applied to handle finance problems because of its nonlinear structures. However, ANNs have some limitations in evaluating the output nodes as a result of single-point values. This study proposed a hybrid model, called fuzzy BPN, consisting of backpropagation neural network (BPN) and fuzzy membership function for taking advantage of nonlinear features and interval values instead of the shortcoming of single-point estimation. In addition, the experimental processing can demonstrate the feasibility of applying the hybrid model-fuzzy BPN and the empirical results show that fuzzy BPN provides a useful alternative to exchange rate forecasting","PeriodicalId":423707,"journal":{"name":"2007 IEEE Symposium on Computational Intelligence and Data Mining","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131382693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}