{"title":"Tweet Sentiment Extraction Using Byte Level Pretrained Language Model","authors":"Haowei Liu, Enhao Tan","doi":"10.1145/3529836.3529941","DOIUrl":"https://doi.org/10.1145/3529836.3529941","url":null,"abstract":"Research on sentiment analysis has developed rapidly in recent years, and Twitter sentiment analysis is one of the most popular topics. Besides classifying the sentiment, it is also important to find the phrases or words in the text that are decisive for the classified sentiment category. In this paper, we propose and develop byte-level pretrained RoBERTa models designed to extract phrases from tweet data with sentiment labels. We compare the RoBERTa model and its variants, including RoBERTa-base, RoBERTa-large, XLM-RoBERTa-base, and RoBERTa-large-mnli. We build the model from RoBERTa and a CNN, then train it on the given tweet text and sentiment labels so that the deciding phrases of sentiments can be predicted. Our results show that RoBERTa-base obtains a Jaccard score of 0.712 with a total training time of 240 minutes, which is the best performance among all the models.","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122236581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conditional Generative Adversarial Networks for Hyperspectral Image Classification","authors":"Nianze Wu, Bozhi Hao, Jiahao Ma, Tianhong Gao, Yancong Deng","doi":"10.1145/3529836.3529859","DOIUrl":"https://doi.org/10.1145/3529836.3529859","url":null,"abstract":"Though Hyperspectral Image (HSI) classification has been extensively investigated over recent decades, it is still a challenging task, especially when the labeled samples are extremely limited. In this paper, we overcome the obstacle by using Conditional Generative Adversarial Networks (CGAN) to generate a trainable data set with complete spectral and spatial information. By comparing generated images of different shapes and classification maps for Indian Pines, the most suitable data are selected and used to train common neural network models. Second, three common and recent neural network methods used for HSI classification, including two-dimensional Convolution (Conv2D), three-dimensional Convolution (Conv3D), and Hybrid spectral CNN (Hybrid SN), are proposed. After repeated experiments and cross-validation, we have found that the proposed method, which enhances the original data, lets the model achieve better and more robust performance for HSI classification than the complete original data set, especially when the labeled data is limited.","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130468232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimization and Reconstruction for EPMA Image Compressed Sensing Based on Chaotic Measurement Matrix","authors":"Li Zhang, Jun Zhang, Anan Jin","doi":"10.1145/3529836.3529846","DOIUrl":"https://doi.org/10.1145/3529836.3529846","url":null,"abstract":"Image reconstruction is an important part of today's image processing. The quality and efficiency of image reconstruction are among the research hotspots in today's image processing field. Image reconstruction using compressed sensing can greatly reduce the sampling rate and break the constraint of the traditional Nyquist sampling law, which is a significant breakthrough for image reconstruction. The quality of the measurement matrix in compressed sensing has a great influence on the effect of the reconstruction. Therefore, the construction of the measurement matrix is an important direction of current research on compressed sensing. This paper uses the deterministic Monte Carlo pseudo-random number sampling method to construct a chaotic measurement matrix for compressed sensing. This measurement matrix can solve the uncertainty of the random matrix. The experimental results show that this method achieves better reconstruction performance on the EPMA image than other measurement matrices and achieves super-resolution recovery.","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132136006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chinese NER with High-Level Features in Specific Domain","authors":"Mengna Nie, Lianglun Cheng, Haiming Ye, Weiwen Zhang","doi":"10.1145/3529836.3529937","DOIUrl":"https://doi.org/10.1145/3529836.3529937","url":null,"abstract":"In recent years, the character-word lattice structure has achieved good performance in Chinese named entity recognition (NER). However, in some specific domains, such as the marine industry, there are many specialized words that are hard to segment and utilize. Facing this challenge, it is necessary to employ a method that better identifies domain-specific entities with advanced features. In this paper, we develop a new method based on multivariate data embedding which further extracts higher-level character features in the embedding layer. Specifically, we extract higher-level character features by CNN and integrate them with the lattice representation to obtain an enhanced embedding representation. Our model exploits character information that better captures the morphological and semantic information of characters to support NER. Experimental results on our marine industry dataset demonstrate the superiority of our approach. In particular, it outperforms most previous models, and the ablation study validates the effect of the advanced feature extraction.","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122200646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Improved Calibration Method for the Structured Light System Based on Self-correction of Reprojection Error","authors":"Beilei Li, Peng Han, Li Peng, Kaiqing Luo, Dongmei Liu, Jian-jua Qiu","doi":"10.1145/3529836.3529905","DOIUrl":"https://doi.org/10.1145/3529836.3529905","url":null,"abstract":"Structured light 3D reconstruction is widely applied in various fields. During the calibration process of structured light 3D reconstruction, it is very important to raise the calibration accuracy on the parameters of the structured light system. In this paper, we propose a method based on re-projection error self-correction to obtain more accurate corner positions by screening the re-projection error values of DMD images. Experimental results show that this method can improve the calibration accuracy by 64.17%. We also propose an effective standard for the placement of calibration plate, which is of great significance to reduce the number of iterations of the program. According to a series of experiments based on the above standard, the number of iterations of the proposed re-projection error self-correction method is no more than 5 times. It proves that the proposed self-correction method and placement standard are feasible, the calibration process of structured light 3D reconstruction is optimized, and its calibration efficiency is improved.","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116955515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SRT: Shape Reconstruction Transformer for 3D Reconstruction of Point Cloud from 2D MRI","authors":"Bowen Hu, Yanyan Shen, Guocheng Wu, Shuqiang Wang","doi":"10.1145/3529836.3529902","DOIUrl":"https://doi.org/10.1145/3529836.3529902","url":null,"abstract":"There has been some work on 3D reconstruction of organ shapes based on medical images in minimally invasive surgeries. This work aims to help lift visualization limitations for procedures with poor visual environments. However, extant models are often based on deep convolutional neural networks and complex, hard-to-train generative adversarial networks; their problems with stability and real-time performance hinder the further development of the technique. In this paper, we propose the Shape Reconstruction Transformer (SRT), based on the self-attention mechanism and an up-down-up generative structure, to design fast and accurate 3D brain reconstruction models using fully connected layer networks only. Point clouds are used as the 3D representation of the model. Considering the specificity of the surgical scene, the input to the model is limited to a single 2D image. Qualitative demonstrations and quantitative experiments based on multiple metrics show the generative capability of the proposed model and demonstrate its advantages over other state-of-the-art models.","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124009572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A method of Evaluation for Small and Medium-sized Enterprises","authors":"Zhuang Kui, Xie Yu, W. Wei, Yan Chun Gang","doi":"10.1145/3529836.3529901","DOIUrl":"https://doi.org/10.1145/3529836.3529901","url":null,"abstract":"Small and medium-sized enterprises (SMEs) have characteristics of small scale of development, poor anti-risk ability, and imperfect management. Timely and accurate evaluation of these enterprises is of great significance to corporate management, market supervision departments and social investors. Existing evaluation methods are mostly based on the internal financial information of mature enterprises, which are not suitable for small and medium-sized enterprises which have not yet achieved revenue capabilities, and have imperfect financial indicators. In this paper, we propose an evaluation model of the development trend of SMEs based on the enterprise knowledge graph. We obtain the information of these enterprises on the financial websites and then use entity recognition and relationship techniques to explore major events and relationships between enterprises, and build enterprise knowledge graph. We construct the feature sets of these enterprises, and use cluster analysis to classify and evaluate the enterprises. A graph-based neural network method is proposed to capture the deeper influence on the development trend among enterprises. The proposed method evaluates the development trend of SMEs from a new perspective.","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129842549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Artificial Intelligence Model to Predict the Virality of Press Articles","authors":"Yesid L. Lopez, D. Grimaldi, Sebastian Garcia, Jonatan Ordoez, Carlos Carrasco-Farré, Andres A. Aristizabal","doi":"10.1145/3529836.3529953","DOIUrl":"https://doi.org/10.1145/3529836.3529953","url":null,"abstract":"Currently, many people share news, links, or videos without being aware of the impact they can have on people's decisions or ways of acting. A clear example, recently experienced in Colombia, is the national strike which happened at the time of this research. Due to these unexpected circumstances, Colombians experienced the influence news has on decision making that can affect the country, not only economically but also politically and socially. It showed how news can generate fear in people, or even misinform, as is the case with fake news. For these reasons, it is key to determine the relevance a story can have. Predicting the impact will allow us to pay more attention to news that can affect people more, avoiding misinformation and fake news. However, the problem is that there is no way of predicting the impact that a press article can have. Therefore, the aim of this work is to implement a machine learning model that allows us to predict, with the best possible accuracy, the virality of online press articles (defining virality as the number of clicks that an article receives when it is opened). In order to achieve this goal, we followed the CRISP-DM methodology, which focuses on machine learning projects. The best result was obtained by the model whose core architecture was based on BERT, a pre-trained model, which, for a pair of press article headlines, predicted whether the first headline would be more viral than the second one. The evaluation was carried out by comparing the number of clicks for a pair of articles. From a practitioner's point of view, digital marketers can use our results to select the best words for their online marketing campaigns. From a theoretical point of view, our results present an innovative natural language processing approach based on one of the best-of-breed neural network models (BERT).","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129896923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lads: Deep Survival Analysis for Churn Prediction Analysis in the Contract User Domain","authors":"Feng Xu, Hao Zhang, Juan Zheng, Tingxuan Zhao, X. Wang, Zhiquan Zeng","doi":"10.1145/3529836.3529853","DOIUrl":"https://doi.org/10.1145/3529836.3529853","url":null,"abstract":"Survival analysis methods are currently used in the fields of medicine, economics, biology and engineering, and focus on the relationship between covariates and the timing of events. Because survival analysis makes use of survival time, it is better suited to the user domain than other machine learning models. To mine information about users' time-series characteristics and to improve both the ability of neural networks to process information and the efficiency of network training with large data sets, the Lads (LSTM Attention Deep Survival) model was designed based on deep survival analysis. It combines Long Short-Term Memory networks and Attention mechanisms to predict the occurrence of events of interest in the contract user domain. The LSTM acts as a feature extractor and performs pre-processing of time-series characteristics information, while the Attention mechanism mainly enhances the interpretability of the model. The final experimental results show that the Lads survival analysis model is a better predictor in the contract user domain than survival analysis methods such as CPH (Cox Proportional Hazard Model) and DeepSurv.","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127949124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NTP-VFL - A New Scheme for Non-3rd Party Vertical Federated Learning","authors":"Di Zhao, Ming Yao, Wanwan Wang, Hao He, Xin Jin","doi":"10.1145/3529836.3529841","DOIUrl":"https://doi.org/10.1145/3529836.3529841","url":null,"abstract":"Vertical Federated Learning (FL) handles decentralized, vertically partitioned data about common entities. While most existing privacy-preserving federated learning algorithms require a third party (TP) as an intermediary data accessor to coordinate model training, we propose a new privacy-preserving scheme named NTP-VFL (Non-3rd Party Vertical Federated Learning). Utilizing Paillier homomorphic encryption, our algorithm allows for multi-party model training and guarantees clients’ privacy against honest-but-curious adversaries. To the best of our knowledge, this is the first non-TP method that solves multi-party computation problems in Logistic Regression tasks. Our theoretical analysis and extensive experiments show outstanding performance, with an average increase in efficiency of about 25% over baselines that use the traditional federated learning approach.","PeriodicalId":285191,"journal":{"name":"2022 14th International Conference on Machine Learning and Computing (ICMLC)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134517982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}