Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis: Latest Articles

ATP-OIE: An Autonomous Open Information Extraction Method
J. M. Rodríguez, H. Merlino, Patricia Pesado
{"title":"ATP-OIE: An Autonomous Open Information Extraction Method","authors":"J. M. Rodríguez, H. Merlino, Patricia Pesado","doi":"10.1145/3388142.3388166","DOIUrl":"https://doi.org/10.1145/3388142.3388166","url":null,"abstract":"This paper describes an innovative Open Information Extraction method known as ATP-OIE. It utilizes extraction patterns to find semantic relations. These patterns are generated automatically from examples, so it has greater autonomy than methods based on fixed rules. ATP-OIE can also summon other methods, ReVerb and ClausIE, if it is unable to find valid semantic relations in a sentence, thus improving its recall. In these cases, it is capable of generating new extraction patterns online, which improves its autonomy. It also implements different mechanisms to prevent common errors in the extraction of semantic relations. Lastly, ATP-OIE was compared with other state-of-the-art methods on a well-known text database, Reuters-21578, obtaining higher precision than the other methods.","PeriodicalId":409298,"journal":{"name":"Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117258837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Adaptation of RF and CNN on Spark
Y. Kou, Zhi Hong, Yun Tian, S. Wang
{"title":"Adaptation of RF and CNN on Spark","authors":"Y. Kou, Zhi Hong, Yun Tian, S. Wang","doi":"10.1145/3388142.3388157","DOIUrl":"https://doi.org/10.1145/3388142.3388157","url":null,"abstract":"Biological images are used in many applications, most of which are important in the medical field. For example, MRI scans and CT scans result in high resolution images that are critical for the diagnosis of cancers and other organ malfunctions. Nowadays, high resolution ultrasound images can provide details to examine blood vessel blockage. Another type of biological image shows mixed patterns of proteins in Human Protein Atlas microscope images. Due to the enormous amount of image data available even in a single medical organization, Machine Learning and Deep Learning technology have been used to assist in the image data analysis. Spark is a computing framework that has been shown to speed up data analysis dramatically. However, Spark Scala doesn't fully support Deep Learning algorithms. In this paper, we present a case study of adapting the Random Forest (RF) and Convolutional Neural Network (CNN) to the Spark Scala framework. These algorithms were applied to multi-class, multi-label classification on a biological dataset from Kagglers. The experimental results show that both RF and CNN can be implemented with Spark Scala and achieve extremely high throughput performance.","PeriodicalId":409298,"journal":{"name":"Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123421911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Wellhead Compressor Failure Prediction Using Attention-based Bidirectional LSTMs with Data Reduction Techniques
Wirasak Chomphu, B. Kijsirikul
{"title":"Wellhead Compressor Failure Prediction Using Attention-based Bidirectional LSTMs with Data Reduction Techniques","authors":"Wirasak Chomphu, B. Kijsirikul","doi":"10.1145/3388142.3388154","DOIUrl":"https://doi.org/10.1145/3388142.3388154","url":null,"abstract":"In the offshore oil and gas industry, petroleum in each well of a remote wellhead platform (WHP) is extracted naturally from the ground to the sales delivery point. However, when the oil pressure drops or the well is nearly depleted, the flow rate up to the WHP declines. Installing a Wellhead Compressor (WC) on the WHP is the solution [9]. The WC acts locally on the selected wells and reduces back pressure, thereby substantially enhancing the efficiency of oil and gas recovery [21]. The WC sensors transmit data back to the historian time series database, and intelligent alarm systems are utilized as a critical tool to minimize unscheduled downtime which adversely affects production reliability, as well as monitoring time and cost burden of operating engineers. In this paper, an Attention-Based Bidirectional Long Short-Term Memory (ABD-LSTM) model is presented for WC failure prediction. We also propose feature extraction and data reduction techniques as complementary methods to improve the effectiveness of the training process in a large-scale dataset. We evaluate our model performance based on real WC sensor data. Compared to other Machine Learning (ML) algorithms, our proposed methodology is more powerful and accurate. Our proposed ABD-LSTM achieved an optimal F1 score of 85.28%.","PeriodicalId":409298,"journal":{"name":"Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122540344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Ideology Detection of Personalized Political News Coverage: A New Dataset
Khudran Alzhrani
{"title":"Ideology Detection of Personalized Political News Coverage: A New Dataset","authors":"Khudran Alzhrani","doi":"10.1145/3388142.3388149","DOIUrl":"https://doi.org/10.1145/3388142.3388149","url":null,"abstract":"Word choice, writing style, story cherry-picking, and many other factors play a role in framing news articles to fit the targeted audience or to align with the authors' beliefs. Hence, reporting facts alone is not evidence of bias-free journalism. Since the 2016 United States presidential elections, researchers have focused on the media's influence on the election results. News media attention has shifted from political parties to candidates. The news media shapes public perception of political candidates through news personalization. Despite its criticality, we are not aware of any studies which have examined news personalization from the machine learning or deep neural network perspective. In addition, some candidates accuse the media of favoritism which jeopardizes their chances of winning elections. Multiple methods were introduced to place news sources on one side of the political spectrum or the other, yet the mainstream media claims to be unbiased. Therefore, to avoid inaccurate assumptions, only news sources that have clearly stated their political affiliation are included in this research. In this paper, we constructed two datasets out of news articles written about the last two U.S. presidents with respect to news websites' political affiliation. Multiple intelligent models were developed to automatically predict the political affiliation of unseen personalized articles. The main objective of these models is to detect the political ideology of personalized news articles. Although the newly constructed datasets are highly imbalanced, the performance of the intelligent models is reasonably good. The results of the intelligent models are reported with a comparative analysis.","PeriodicalId":409298,"journal":{"name":"Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130914446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Text mining for incoming tasks based on the urgency/importance factors and task classification using machine learning tools
Y. Alshehri
{"title":"Text mining for incoming tasks based on the urgency/importance factors and task classification using machine learning tools","authors":"Y. Alshehri","doi":"10.1145/3388142.3388153","DOIUrl":"https://doi.org/10.1145/3388142.3388153","url":null,"abstract":"In workplaces, there is a massive amount of unstructured data from different sources. In this paper, we present a case study that explains how, through communications between employees, we can help to prioritize task requests to increase the efficiency of work for both technical and non-technical workers. This involves managing daily incoming tasks based on their level of urgency and importance. To allow all workers to utilize the urgency-importance matrix as a time-management tool, we need to automate this tool. The textual content of incoming tasks is analyzed, and metrics related to urgency and importance are extracted. A third factor (i.e., the response variable) is defined based on the two input variables (urgency and importance). Then, machine learning is applied to the data to predict the class of incoming tasks based on the desired data outcome. We used ordinal regression, neural networks, and decision tree algorithms to predict the four levels of task priority. We measure the performance of all classifiers using recall, precision, and F-score. All classifiers score higher than 89% on all measures.","PeriodicalId":409298,"journal":{"name":"Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130647346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
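The abstract above defines a response variable with four priority levels from two inputs, urgency and importance. A minimal sketch of one way such a mapping could look is given here. The function name, the 0-to-1 score range, the 0.5 threshold, and the ordering of the two middle levels are all assumptions for illustration; the paper's actual definition is not given in the abstract.

```python
def priority_level(urgency: float, importance: float, threshold: float = 0.5) -> int:
    """Map two scores in [0, 1] to a priority level from 1 (highest) to 4 (lowest).

    Hypothetical rule in the style of an urgency-importance matrix; the
    threshold and the level ordering are assumptions, not the paper's method.
    """
    urgent = urgency >= threshold
    important = importance >= threshold
    if urgent and important:
        return 1  # urgent and important
    if important:
        return 2  # important but not urgent
    if urgent:
        return 3  # urgent but not important
    return 4      # neither urgent nor important
```

A label produced this way could serve as the target class that a classifier is then trained to predict from the task text.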
Automated Cyberbullying Detection in Social Media Using an SVM Activated Stacked Convolution LSTM Network
Thor Aleksander Buan, Raghavendra Ramachandra
{"title":"Automated Cyberbullying Detection in Social Media Using an SVM Activated Stacked Convolution LSTM Network","authors":"Thor Aleksander Buan, Raghavendra Ramachandra","doi":"10.1145/3388142.3388147","DOIUrl":"https://doi.org/10.1145/3388142.3388147","url":null,"abstract":"Cyberbullying is becoming a huge problem on social media platforms. New statistics show that more than a fourth of Norwegian kids report that they have been cyberbullied once or more during the last year. In recent years, it has become popular to utilize Neural Networks in order to automate the detection of cyberbullying. These Neural Networks are often based on using Long-Short-Term-Memory layers solely or in combination with other types of layers. In this thesis we present a new Neural Network design that can be used to detect traces of cyberbullying in textual media. The design is based on existing designs that combine the power of Convolutional layers with Long-Short-Term-Memory layers. In addition, our design features the usage of stacked core layers, which our research shows to increase the performance of the Neural Network. The design also features a new kind of activation mechanism, which is referred to as \"Support-Vector-Machine like activation\". The \"Support-Vector-Machine like activation\" is achieved by applying L2 weight regularization and utilizing a linear activation function in the activation layer together with using a Hinge loss function. Our experiments show that both the stacking of the layers and the \"Support-Vector-Machine like activation\" increase the performance of the Neural Network over traditional State-Of-The-Art designs.","PeriodicalId":409298,"journal":{"name":"Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134533783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
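The abstract above names three ingredients of its "Support-Vector-Machine like activation": L2 weight regularization, a linear activation, and a hinge loss. The sketch below shows how those three pieces combine into one objective for a single linear output unit. It is an illustration only, not the authors' network: the function name, the `l2` coefficient, and the use of labels in {-1, +1} are assumptions.

```python
def svm_like_loss(weights, bias, features, labels, l2=0.01):
    """Mean hinge loss on a linear output, plus an L2 penalty on the weights.

    `features` is a list of input vectors and `labels` a list of -1/+1 targets.
    Hypothetical helper for illustration; the l2 value is an assumption.
    """
    total = 0.0
    for x, y in zip(features, labels):
        # Linear activation: the raw weighted sum is the output, no squashing.
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        # Hinge loss: zero once the example is on the correct side with margin >= 1.
        total += max(0.0, 1.0 - y * score)
    # L2 weight regularization, as in a soft-margin linear SVM objective.
    penalty = l2 * sum(w * w for w in weights)
    return total / len(features) + penalty
```

Minimizing this quantity for a final layer is the same objective a soft-margin linear SVM optimizes, which is presumably why the abstract calls the combination SVM-like.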
Using Monte Carlo Simulation to Predict Captive Insurance Solvency
Lu Xiong, Don Hong
{"title":"Using Monte Carlo Simulation to Predict Captive Insurance Solvency","authors":"Lu Xiong, Don Hong","doi":"10.1145/3388142.3388171","DOIUrl":"https://doi.org/10.1145/3388142.3388171","url":null,"abstract":"The solvency of captive insurance is the key financial metric captive managers care about. We built a solvency prediction model for a captive insurance fund using Monte Carlo simulation with the fund's historical losses, current financial data and setups. This model can predict the solvency score of the current captive fund using the fund survival probability as a measurement of solvency. If the simulated future solvency ratios break the upper or lower bounds, we count it as an insolvent case; otherwise, it is counted as a solvent (or survival) case. After large scale simulation, we can approximate the future survival probability, i.e. the solvency score, of the current captive fund. The predicted income statements, balance sheets and financial ratios are also generated. We use a heat-map to visualize the solvency score at each retention level so that it can support captive insurance managers in making their decisions. This model is implemented in Excel VBA macros and MATLAB.","PeriodicalId":409298,"journal":{"name":"Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131353181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
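The counting rule in the abstract above (a simulated path is insolvent if its solvency ratio leaves the bounds, and the solvency score is the fraction of paths that survive) can be sketched in a few lines. The paper's implementation is in Excel VBA and MATLAB; the Python below is a stand-in for illustration. The function name, its parameters, and the idea of passing the yearly ratio change as a callable `step` are assumptions, since the abstract does not describe the loss model.

```python
import random


def survival_probability(start_ratio, lower, upper, years, n_runs, step, seed=0):
    """Estimate the share of simulated paths whose ratio stays within [lower, upper].

    `step(rng)` is a hypothetical callable returning one year's change in the
    solvency ratio; a real model would derive it from losses and financials.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    survived = 0
    for _ in range(n_runs):
        ratio = start_ratio
        solvent = True
        for _ in range(years):
            ratio += step(rng)
            if ratio < lower or ratio > upper:
                solvent = False  # the path broke a bound: count it as insolvent
                break
        if solvent:
            survived += 1
    return survived / n_runs
```

Calling it with, for example, `step=lambda rng: rng.gauss(0.0, 0.3)` draws normally distributed yearly changes; repeating the call for several retention levels would give the grid of scores that a heat-map could display.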
A Hybrid Model of Clustering and Neural Network Using Weather Conditions for Energy Management in Buildings
Bishnu Nepal, M. Yamaha
{"title":"A Hybrid Model of Clustering and Neural Network Using Weather Conditions for Energy Management in Buildings","authors":"Bishnu Nepal, M. Yamaha","doi":"10.1145/3388142.3388172","DOIUrl":"https://doi.org/10.1145/3388142.3388172","url":null,"abstract":"For the conservation of energy in buildings, it is essential to understand the energy consumption pattern and to base load-reduction efforts on the analyzed result. In this research, we propose a method for forecasting the electricity load of university buildings using a hybrid model that combines a clustering technique and a neural network, using weather conditions. The novel approach discussed in this paper clusters one whole year of data, including the forecasting day, using K-means clustering and uses the result as an input parameter to a neural network for forecasting the electricity peak load of university buildings. The hybrid model is shown to improve forecasting performance over the neural network alone. We also developed a graphical visualization platform for the analyzed result using an interactive web application framework called Shiny. Using the Shiny application and forecasting the electricity peak load with appreciable accuracy several hours before peak hours can make the management authorities aware of the energy situation and provide sufficient time to devise a strategy for peak load reduction. This method can also be applied in demand response to reduce electricity bills by avoiding electricity usage during hours with high electricity rates.","PeriodicalId":409298,"journal":{"name":"Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115368722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Research on Automatic Generation Method of Scenario Based on Panosim
Zhang Lu, Zhibin Du, Xianglei Zhu
{"title":"Research on Automatic Generation Method of Scenario Based on Panosim","authors":"Zhang Lu, Zhibin Du, Xianglei Zhu","doi":"10.1145/3388142.3388165","DOIUrl":"https://doi.org/10.1145/3388142.3388165","url":null,"abstract":"With the development of science and technology, L3 intelligent vehicles are gradually entering the mass production phase. Traditional testing tools and methods can hardly meet the requirements of self-driving vehicles for multiple dimensions, high standards and big data. The scenario-based simulation test method has great technical advantages in terms of test efficiency, verification cost and versatility, and is an important means of test verification for automatic driving. However, it has shortcomings such as a long scenario construction period and high repetitiveness. This paper builds on secondary development of the automatic driving simulation software Panosim and presents automatic scenario input and rapid parameter adjustment through digital twin technology. In addition, the natural driving scenario database of China Automotive Technology and Research Center is used for verification. The results show that this method can improve the efficiency and accuracy of scenario construction, and greatly shorten the cycle of simulation test.","PeriodicalId":409298,"journal":{"name":"Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122609241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
How Different Genders Use Profanity on Twitter?
S. Wong, P. Teh, Chi-Bin Cheng
{"title":"How Different Genders Use Profanity on Twitter?","authors":"S. Wong, P. Teh, Chi-Bin Cheng","doi":"10.1145/3388142.3388145","DOIUrl":"https://doi.org/10.1145/3388142.3388145","url":null,"abstract":"Social media is often the go-to place where people discuss their opinions and share their feelings. As some platforms provide more anonymity than others, users have taken advantage of that privilege: by sitting behind the screen, they have used profanity in ways that can create a toxic environment. Although not all profanities are used to offend people, it is undeniable that anonymity has allowed social media users to express themselves more freely, increasing the likelihood of swearing. In this study, the use of profanity by different gender classes is compiled, and the findings show that different genders often employ swear words from different hate categories, e.g. males tend to use more terms from the \"disability\" hate group. Classification models have been developed to predict the gender of tweet authors, and the results show that profanity could be used to uncover the gender of anonymous users. This shows the possibility that profiling of cyberbullies can be done from the aspect of gender based on profanity usage.","PeriodicalId":409298,"journal":{"name":"Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116444465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5