{"title":"Forecasting popularity of news article by title analyzing with BN-LSTM network","authors":"Anton Voronov, Yao Shen, Pritom Kumar Mondal","doi":"10.1145/3335656.3335679","DOIUrl":"https://doi.org/10.1145/3335656.3335679","url":null,"abstract":"In recent years, predicting the popularity of articles in the news has become a more urgent task for authors, online resources and advertisers. In the order of this task, we propose a new method based on the Online Deep Neural network with Bottleneck compression, what predicts the article popularity with only its headline. The proposed methodology evaluated on the Chinese and Russian language-based datasets with over than 800 000 samples in total. We describe the challenges and solutions related to the popularity prediction and the headline analysis. We show that the provided method can reach acceptable results even with different languages, news source popularity dynamics.","PeriodicalId":396772,"journal":{"name":"Proceedings of the 2019 International Conference on Data Mining and Machine Learning","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128984577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hypothesis and Application of Complex Truncated Stochastic Investment-Revenue Model","authors":"Li Ye, Chen Yiyan","doi":"10.1145/3335656.3335687","DOIUrl":"https://doi.org/10.1145/3335656.3335687","url":null,"abstract":"Investment is an economic behavior, which is accompanied by risk and randomness. The purpose of investment is to obtain the corresponding returns and maximize the returns. This paper establishes a complex truncated stochastic investment-revenue model based on mathematical statistics. The purpose is to better express the relationship between investment and revenue with full consideration of risk. By introducing five stochastic variables: capital, value-added interest rate, investment cycle, inflation rate and liquidation cycle, a series of hypotheses for the model are put forward, which further reveals the relationship between investment and revenue.","PeriodicalId":396772,"journal":{"name":"Proceedings of the 2019 International Conference on Data Mining and Machine Learning","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124016353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An improved FCM clustering algorithm based on cosine similarity","authors":"Minxuan Li","doi":"10.1145/3335656.3335693","DOIUrl":"https://doi.org/10.1145/3335656.3335693","url":null,"abstract":"Based on the traditional Fuzzy C-means (FCM) clustering algorithm, this study adds cosine similarity as a correction factor and optimizes the FCM algorithm by optimizing the membership degree of the objective function. The results show that the matrix estimation error obtained by the improved algorithm is smaller and the precision is higher, which can reduce the normalized mean square error by about 20.67%, and the angular deviation is reduced by about 8° on average.","PeriodicalId":396772,"journal":{"name":"Proceedings of the 2019 International Conference on Data Mining and Machine Learning","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117062066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of Mathematical Model in Pension: System Reform for Retired Employee in Enterprise","authors":"Haihui Wu","doi":"10.1145/3335656.3335681","DOIUrl":"https://doi.org/10.1145/3335656.3335681","url":null,"abstract":"In this paper, the reform of the pensions system for retired workers in Chinese enterprises is studied by means of mathematical modeling. Firstly, the paper analyzes the average wage of workers in Shandong Province since 1978, which is rising as the time goes by using the statistical data and the Matlab software, combined with relevant knowledge of economics. The Malthus and Logistic models were successively adopted as empirical formulas for regression analysis, so as to obtain a more accurate prediction model. And the annual average wage of workers from 2019 to 2040 is predicted by using this model. Then, taking a certain enterprise as an example, a mathematical model is set up to calculate the pension substitution rate and the gap of the pension fund of the enterprise staff in all kinds of situations, and to give the age when the payment of the pension is in balance with the earnings. Finally, some effective measures for the government are given by using the model in order to achieve the target of 58.5% of the basic old-age insurance in the future and maintain the balance of income and expenditure of the pension insurance fund.","PeriodicalId":396772,"journal":{"name":"Proceedings of the 2019 International Conference on Data Mining and Machine Learning","volume":"442 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127605295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of PCA in the Porosity Evaluation of a Coal Reservoir","authors":"Jingang Wu, Guang Zhang","doi":"10.1145/3335656.3335695","DOIUrl":"https://doi.org/10.1145/3335656.3335695","url":null,"abstract":"In this paper, we study the feasibility of applying the principal component analysis (PCA) for porosity evaluation of a coal Reservoir. The geological characteristics of the No. 3 coal reservoir in Qinshui Basin (Shanxi Province, China) were analyzed at first. On this basis, vitrinite reflectance, coal macrolithotype, ash content, macro-fissure density, micro-fissure density, and coal structure were adopted as the index variables for evaluating the porosity. Three principal components were extracted by reducing dimensions of the six primitive variables. Then, a scoring model of the principal components was constructed to calculate the comprehensive scores for porosity of the coal samples. In addition, Qinshui Basin was partitioned into four regions: a (Yangquan), b (Jingfang-Wangzhuang), c (Changcun), and d (Jincheng), which were listed in a descending order as a, c, b, and d with reducing porosities. Explanation of the principal components and geological theoretical analysis revealed that the geotectonic movements and evolutions, and sedimentary environment are main controlling factors for studying the porosity based zoning of the research region. The results prove that it is feasible to apply PCA to evaluate the porosity of coal reservoirs.","PeriodicalId":396772,"journal":{"name":"Proceedings of the 2019 International Conference on Data Mining and Machine Learning","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134431850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Domain Generation Algorithms Based on Cascade Foresets","authors":"Xianrong Song, Haoliang Sun, Hong Zhang, Meng Zhang","doi":"10.1145/3335656.3335690","DOIUrl":"https://doi.org/10.1145/3335656.3335690","url":null,"abstract":"In a botnet, the attacker often uses domain fluxing to hide communication channel. Domain Generation Algorithm (DGA) is a kind of technique which is usually utilized by domain fluxing. Each bot uses a DGA to generate plenty of domain names and one of them will be registered by the botmaster as the domain name mapping to the C&C server. The security personnel will be in trouble with a large number of DGA domain names. In this paper, we propose a DGA detection approach called Minos. This approach detects DGA-generated domain names by analyzing the word-formation of domain names with the cascaded forest algorithm. The key insight of Minos is that domain names are made up of syllables or acronyms for easy reading, and n-grams can generally represent both of them. Our experimental results show that Minos can accurately identify domain names generated by DGAs with a precision of 93.8% and a recall rate of 93.5%.","PeriodicalId":396772,"journal":{"name":"Proceedings of the 2019 International Conference on Data Mining and Machine Learning","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131323440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Automatic Image Parameter setting and Segmentation Method","authors":"Kedir Kamu Sirur, Ye Peng, Zhang Qinchuan","doi":"10.1145/3335656.3335697","DOIUrl":"https://doi.org/10.1145/3335656.3335697","url":null,"abstract":"There are a lot of works done to automatically set parameters and segment images based on Pulse Coupled Neural Networks (PCNN). In this study we propose an automatic parameters setting and segmentation method based on Intersecting Cortical Mode (ICM) which enables to overcome the basic limitation of PCNN based methods. We used the ICM as base and developed an enhanced automatic method which can withstand effects of multiple background and illumination during segmentation. Characteristics pixel values of the input image are used to deduce corresponding segmentation parameters. The experiment is done on Aerial Image Segmentation Dataset and Database of Human Segmented Natural Images. Our method outperformed for subjective and objective evaluations, also shown consistent assignment of parameter values. Also the proposed method is able to reduce the segmentation time by half and overcome the limitations of the existing automatic models.","PeriodicalId":396772,"journal":{"name":"Proceedings of the 2019 International Conference on Data Mining and Machine Learning","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131349025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Relation Extraction Based On Bidirectional Long Short-Term Memory Networks","authors":"Jia Chen, Liang Liu, Jiali Xu, Bei Hui","doi":"10.1145/3335656.3335694","DOIUrl":"https://doi.org/10.1145/3335656.3335694","url":null,"abstract":"Relation extraction is an important task in the field of natural language processing (NLP). Most of the present methods extract each relation in isolation, without considering the hierarchical semantic information between relations. A novel loss function to optimize model of relation extraction based on hierarchical relation has been proposed in this paper. The experimental results show that the proposed model outperforms most of the present methods.","PeriodicalId":396772,"journal":{"name":"Proceedings of the 2019 International Conference on Data Mining and Machine Learning","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124082818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on the Influence of National System along \"the Belt and Road Initiative\" on Investment Area Selection Based on Mult-Motivation Analysis of Intelligent Modeling and Information Processing","authors":"Liu Ying, Zhang Renping, Rizwan Ali","doi":"10.1145/3335656.3335689","DOIUrl":"https://doi.org/10.1145/3335656.3335689","url":null,"abstract":"Through the multi-task analysis of the intelligent modeling and information processing of the business data of each country over the years, it is possible to distinguish which countries have better business environment. From the perspective of investment motives, this paper studies the influence of national institutional mechanism on the choice of overseas investment regions of Chinese enterprises. In addition, Intelligent Modeling analysis of 65 host countries along \"the Belt and Road\" and 275 observations in the past 5 years have been conducted in the paper. Base on the information processing, we found that, compared with those enterprises seeking market and strategic assets, Chinese resource-oriented companies always tend to ignore institutional factors in site selection. However, due to the limitations of China's market economy in the development process and the internal problems of those companies, domestic market-oriented and strategic asset-oriented enterprises often pay less attention to certain institutional factors.","PeriodicalId":396772,"journal":{"name":"Proceedings of the 2019 International Conference on Data Mining and Machine Learning","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128663244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on offline behavior similarity of consumers based on Spatio-temporal data set mining","authors":"Zhang Renping, Liu Ying, Rizwan Ali","doi":"10.1145/3335656.3335684","DOIUrl":"https://doi.org/10.1145/3335656.3335684","url":null,"abstract":"The Spatio-temporal data set can be used in business research, For example, The user's geolocation check-in data (POI) in social media can be used to trace back the user's behavior track, however, the analysis of the similarity of LBSN users is not involved in the user's geographical location track. As a result, a density clustering method based on partition hierarchy and different neighborhood radius by users' geographical location is proposed to help explore similar measurement based on Spatio-temporal data set mining. The method observes the number of times a user visits each cluster region at different spatial location scales, and then calculates the similarity of users at each level by taking advantage of vector space model (VSM). Finally, users' similarity in Spatio-temporal(geospatial) behavior is obtained by superimposing user similarity values at different levels with different weights. The experimental results based on real user data of a large-scale social networking site in China show that the proposed method can effectively identify those users when they visit similar geographical locations.","PeriodicalId":396772,"journal":{"name":"Proceedings of the 2019 International Conference on Data Mining and Machine Learning","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121061014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}