Xiaohong W. Gao, X. Wen, Dong Li, Weiping Liu, Jichun Xiong, Bin Xu, Juan Liu, Heng Zhang, Xuefeng Liu
{"title":"Evaluation of GAN Architectures For Visualisation of HPV Viruses From Microscopic Images","authors":"Xiaohong W. Gao, X. Wen, Dong Li, Weiping Liu, Jichun Xiong, Bin Xu, Juan Liu, Heng Zhang, Xuefeng Liu","doi":"10.1109/ICMLA52953.2021.00137","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00137","url":null,"abstract":"Human papillomavirus (HPV) remains a leading cause of virus-induced cancers and has a typical size of 52 to 55nm in diameter. Hence conventional light microscopy, which usually sustains a resolution of ~100nm per pixel, falls short of detecting it. This study explores four state-of-the-art generative adversarial networks (GANs) for visualising HPV. The evaluation is achieved by counting the HPV clusters that are correctly identified as well as drug-treated cultured cells, i.e. no HPVs. The average sensitivity and specificity are 78.81%, 76.37%, 76.62% and 84.71% for CycleGAN, Pix2pix, ESRGAN and Pix2pixHD respectively. For ESRGAN, the training takes place by matching pairs between low and high resolution (x4) images. For the other three networks, the translation is performed from original raw images to their coloured maps that have undergone Gaussian filtering in order to discern HPV clusters visually. Pix2pixHD appears to perform the best.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"9 1","pages":"829-833"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73403651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lerina Aversano, M. Bernardi, Marta Cimitile, R. Pecori
{"title":"Anomaly Detection of actual IoT traffic flows through Deep Learning","authors":"Lerina Aversano, M. Bernardi, Marta Cimitile, R. Pecori","doi":"10.1109/ICMLA52953.2021.00275","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00275","url":null,"abstract":"The detection and classification of Internet traffic has been studied in depth over the last twenty years, but it is still an open research issue as it pertains to the Internet of Things (IoT), mainly because real IoT traffic datasets are not very widespread. With this paper, we make public an integrated dataset, made of actual IoT network flows, built using six different network sources, which could represent a research reference for further investigations. Furthermore, we exploited it to optimize the hyper-parameters of a deep neural network and evaluate its performance for both distinguishing normal and abnormal traffic and discriminating different types of attacks, achieving very good results.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"233 1","pages":"1736-1741"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77494507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
F. Zagatti, L. C. Silva, Lucas Nildaimon Dos Santos Silva, B. S. Sette, Helena de Medeiros Caseli, D. Lucrédio, D. F. Silva
{"title":"MetaPrep: Data preparation pipelines recommendation via meta-learning","authors":"F. Zagatti, L. C. Silva, Lucas Nildaimon Dos Santos Silva, B. S. Sette, Helena de Medeiros Caseli, D. Lucrédio, D. F. Silva","doi":"10.1109/ICMLA52953.2021.00194","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00194","url":null,"abstract":"Data preparation is a mandatory phase in the machine learning pipeline. The goal of data preparation is to convert noisy and disordered data into refined data that can be used by the algorithms. However, data preparation is time-consuming and requires specialized knowledge about the data and algorithms. Therefore, automating data preparation is essential to decrease the effort made by data scientists to develop satisfactory models. Despite its relevance, current AutoML platforms disregard or make simple hardcoded data preparation pipelines. Trying to fill this gap, we present a meta-learning-based recommendation system for data preparation. Our system recommends five pipelines, ranked by their relevance, making it useful for users with varying degrees of experience. Using the top-1 pipeline we demonstrated that our proposal allows a better performance of an AutoML system. Furthermore, the accuracy rates of our method were comparable to those achieved by a reinforcement-learning-based algorithm with the same goal, but it was up to two orders of magnitude faster. Moreover, we tested our method in a real-world application and evaluated its benefits and limitations in this scenario.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"26 1","pages":"1197-1202"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81413978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Elastic distributed training with fast convergence and efficient resource utilization","authors":"Guojing Cong","doi":"10.1109/ICMLA52953.2021.00160","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00160","url":null,"abstract":"Distributed learning is now routinely conducted on cloud as well as dedicated clusters. Training with elastic resources brings new challenges and design choices. Prior studies focus on runtime performance and assume a static algorithmic behavior. In this work, by analyzing the impact of resource scaling on convergence, we introduce schedules for synchronous stochastic gradient descent that proactively adapt the number of learners to reduce training time and improve convergence. Our approach no longer assumes a constant number of processors throughout training. In our experiment, distributed stochastic gradient descent with dynamic schedules and reduction momentum achieves better convergence and significant speedups over prior static ones. Numerous distributed training jobs running on cloud may benefit from our approach.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"29 1","pages":"972-979"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84373472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Text Mining Approach To Predict Non-Adherence","authors":"Yufan Wang, Mahsa Mohaghegh","doi":"10.1109/ICMLA52953.2021.00236","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00236","url":null,"abstract":"Companies operating patient support programs for chronic diseases have been dedicated to enhancing treatment adherence by utilizing data from various interventions of the programs. The purpose of this paper is to examine whether the textual patient notes recorded by program coordinators can be beneficial to predict non-adherence and provide useful insights. In this paper we show work in processing and analyzing over 20,000 patient notes corresponding to 1313 Psoriasis patients using statistical analysis and several NLP methods, such as term representation, sentiment analysis and topic modelling. To build predictive models, Support Vector Machine (SVM), Random Forest (RF) and Logistic Regression (LR) are tested with different feature subsets. The best performing model is SVM with 93% accuracy and 91% recall of non-adherent. Additionally, we also present patterns to differentiate non-adherent and adherent patients in terms of completion efficiency of call objectives and uncontactable problem. Accordingly, high-risk patients can be targeted to take interventions.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"12 1","pages":"1468-1471"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85866814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eric Hans Messias Da Silva, J. Laterza, Marcos Paulo Pereira Da Silva, M. Ladeira
{"title":"A proposal to identify stakeholders from news for the institutional relationship management activities of an institution based on Named Entity Recognition using BERT","authors":"Eric Hans Messias Da Silva, J. Laterza, Marcos Paulo Pereira Da Silva, M. Ladeira","doi":"10.1109/ICMLA52953.2021.00251","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00251","url":null,"abstract":"For an organization’s institutional relationship activities, it is strategic that there is an efficient process of identification and characterization of stakeholders based on available information. Given the increasing volume of data currently available, this strategic process has commonly been supported by information technology solutions, with high potential for the use of data mining techniques such as textual analysis and natural language processing (NLP). In this work we analyzed the possibility of using a mechanism of Named Entity Recognition (NER) based on the use of Bidirectional Encoder Representations from Transformers (BERT) with Conditional Random Field (CRF), which in the future can be used as the stakeholder identification solution as a replacement of the rule-based identification. We applied the proposed solution to a news dataset to evaluate its performance. The experiment results showed us that pre-trained Portuguese models performed better than Multilingual ones by a good margin of at least 3.43 percentage points on Test Dataset. We also added a post-processing Prediction Masking to correct invalid tagging scheme transitions to improve Micro F1 Score in both datasets, ranging from 0.38 percentage points to 1.29 percentage points of improvement. Thus, we achieved the objective of improving stakeholder detection by proposing a NER model that far surpasses the naive rules-based approach of the current application, which consisted of an exact text match of stakeholders based on a dictionary built manually.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"633 1","pages":"1569-1575"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77083273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hashtags: an essential aspect of topic modeling of city events through social media.","authors":"Mikhail V. Kovalchuk, D. Nasonov","doi":"10.1109/ICMLA52953.2021.00255","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00255","url":null,"abstract":"Today, the city is full of digital information, which can be extremely useful in various applications. Instagram, Facebook, VKontakte, and other popular social networks contain a vast amount of valuable data. This information reflects individual stories of people and the background of the city, its events, and current activities in different areas and places of attraction. City events have essential attributes like the time of occurrence, geographical coverage, audience, and often expressed interests or topics. Knowing the subject of events, you can solve a whole range of tasks, from individual recommendation systems for leisure activities for citizens and tourists to providing services in the field of food (food trucks) and transport (taxis). To determine the topic (subject) of events, it is necessary to solve two crucial tasks: to identify the events themselves from a variety of city posts and to develop an approach based on modern natural language processing methods for identifying event topics. To determine the events, we suggest an improved version of an algorithm we had previously developed that integrates time window and area coverage strategy. However, the focus of the work is on the analysis of different approaches to identifying topics, considering the heterogeneity of posts, both in semantic meaning and in size and structure. The focus of this paper is the importance of using post hashtags in various variations to set up more accurate models. In addition, the analysis of features for different language groups was carried out.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"1 1","pages":"1594-1599"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82039740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Influence of Training Data on the Invertability of Neural Networks for Handwritten Digit Recognition","authors":"Antonia Adler, Michaela Geierhos, Eleanor Hobley","doi":"10.1109/ICMLA52953.2021.00122","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00122","url":null,"abstract":"Model inversion attacks aim to extract details of training data from a trained model, potentially revealing sensitive information about a person’s identity. To abide by personal privacy protection requirements, it is important to understand the mechanisms that increase the privacy of training data. In this work, we systematically investigated the impact of the training data on a model’s susceptibility to model inversion attacks for models trained on the task of hand-written digit recognition with the openly available MNIST dataset. Using an optimization-based inversion approach, we studied the impacts of the quantity and diversity of training data, and the number and selection of classes on the susceptibility of models to inversion. Our model inversion attack strategy was less successful for models with a larger amount of training data and greater training data diversity. Moreover, atypical training records provided additional protection against model inversion. We discovered that not every class was equally susceptible to model inversion attacks and that the inversion results of one class were changed when models were trained with a different selection of classes. However, we did not detect a clear relationship between the number of classes and a model’s susceptibility to inversion. Our study shows that the inversion susceptibility of a model depends on the training data: not only the data used to train the class that is inverted, but also the data used to train the other classes.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"61 1","pages":"730-737"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84195234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Augmented Image Captioning Model: Incorporating Hierarchical Image Information","authors":"Nathan Funckes, Erin Carrier, Greg Wolffe","doi":"10.1109/ICMLA52953.2021.00257","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00257","url":null,"abstract":"Despite published accessibility standards, many websites remain non-compliant, containing images lacking accompanying textual descriptions. This leaves visually-impaired individuals unable to fully enjoy the rich wonders of the web. To help address this inequity, our research seeks to improve the ability of autonomous systems to generate accurate, relevant image descriptions. Our model enhances training efficacy by incorporating the use of category labels, high-level object superclasses, which are derivable using modern object-detection models. We show that this simple augmentation to an existing architecture results in a statistically significant improvement in caption quality.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"34 1","pages":"1608-1614"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87834203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Paul Audin, I. Jorge, T. Mesbahi, Ahmed Samet, F. D. Beuvron, R. Boné
{"title":"Auto-encoder LSTM for Li-ion SOH prediction: a comparative study on various benchmark datasets","authors":"Paul Audin, I. Jorge, T. Mesbahi, Ahmed Samet, F. D. Beuvron, R. Boné","doi":"10.1109/ICMLA52953.2021.00246","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00246","url":null,"abstract":"Lithium-ion batteries are used in most battery powered devices. Today’s research on Lithium-ion batteries mainly focuses on better energy management strategies and predictive maintenance. In this paper, a new approach based on auto-encoders and long short-term memory neural networks applied to usage data (voltage, current, temperature) is used to make a State of Health prediction. Encouraging results are obtained when conducting tests on various battery ageing datasets published by Sandia National Laboratories, the Massachusetts Institute of Technology and NASA’s Prognostics Center of Excellence.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"5 1","pages":"1529-1536"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87645903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}