{"title":"Author Identification using Traditional Machine Learning Models","authors":"Ojaswi Binnani","doi":"10.5121/csit.2022.121402","DOIUrl":"https://doi.org/10.5121/csit.2022.121402","url":null,"abstract":"The Internet offers many useful resources, with bountiful information at our fingertips. However, it can also be misused for cybercrime, fake emails, content theft, plagiarism, etc. In many cases, the text is written anonymously, and it is important to accurately identify the author to bring the criminal to justice. Author identification addresses this task: from a set of suspect authors, the writer of a given text is determined. We aim to create a computationally simple model that finds the author of a given text and does not require as much data as deep learning methods. This paper focuses on the use of various stylometric and word-based features, as well as different machine learning models, to create a classifier that gives the best accuracy. We find that the XGBoost algorithm performs this task with good accuracy.","PeriodicalId":244453,"journal":{"name":"Computer Science and Information Technology Trends","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128998761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"System for Assistance in Diagnosis of Diseases Pulmonary","authors":"Gustavo Chichanoski, Maria Bernadete de Morais França","doi":"10.5121/csit.2022.121408","DOIUrl":"https://doi.org/10.5121/csit.2022.121408","url":null,"abstract":"Covid-19 is caused by the SARS-CoV-2 virus, and most infected people experience a mild to moderate respiratory crisis. To assist in diagnosing and triaging patients, this work developed a Covid-19 classification system based on chest radiology images. For this purpose, the neural network models ResNet50V2, ResNet101V2, DenseNet121, DenseNet169, DenseNet201, InceptionResnetV2, VGG-16, and VGG-19 were used, and their precision, accuracy, recall, and specificity were compared. The images were segmented by a U-Net network, and packets of the lung image were generated, which served as input for the different classification models. Finally, the probabilistic Grad-CAM was generated to assist in the interpretation of the results of the neural networks. The segmentation obtained a Jaccard similarity of 94.30%, while for the classification, precision, specificity, accuracy, and recall were evaluated and compared with the reference literature. DenseNet121 obtained an accuracy of 99.28%, while ResNet50V2 presented a specificity of 99.72%, both for Covid-19.","PeriodicalId":244453,"journal":{"name":"Computer Science and Information Technology Trends","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129246742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Giant Components in Texts Generated by a Stream","authors":"Achraf Lassoued","doi":"10.5121/csit.2022.121401","DOIUrl":"https://doi.org/10.5121/csit.2022.121401","url":null,"abstract":"Given a text stream, we associate a stream of edges in a graph G and study its large clusters by analysing the giant components of random subgraphs, obtained by sampling some edges with different distributions. For a stream of Tweets, we show that the large giant components of uniform sampled edges of the Twitter graph reflect the large clusters of G. For a stream of text, the uniform sampling is inefficient but the weighted sampling where the weight is proportional to the Word2vec similarity provides good results. Nodes of high degree of the giant components define the central words and central sentences of the text.","PeriodicalId":244453,"journal":{"name":"Computer Science and Information Technology Trends","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133754907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Optimized Method for Massive Sensitive Data Classification in an Industry Environment","authors":"Qi Zhong, Shichang Gao, Bo Yi","doi":"10.5121/csit.2022.121405","DOIUrl":"https://doi.org/10.5121/csit.2022.121405","url":null,"abstract":"In the era of big data, data carries higher potential value. However, this also brings new challenges to data security, especially for sensitive data in an industrial environment. With the development of the industrial internet, enterprises are increasingly interconnected, and slight carelessness may lead to the leakage of sensitive data, causing inestimable losses. Hence, sensitive data classification is required as a safeguard against such situations. This paper presents a sensitive data classification method based on an improved ID3 decision tree algorithm. First, we introduce the idea of attribute weighting to optimize the basic structure of traditional ID3. Second, we use the weighted information gain to select nodes during tree construction, which mitigates the multi-value bias of the traditional algorithm. Experimental results show that we can achieve a branching accuracy of up to 97.38%.","PeriodicalId":244453,"journal":{"name":"Computer Science and Information Technology Trends","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122023643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bookshelf – A Document Categorization for Library using Text Mining","authors":"Carlo Petalver, Roderick Bandalan, G. V. Gabison","doi":"10.5121/csit.2022.121403","DOIUrl":"https://doi.org/10.5121/csit.2022.121403","url":null,"abstract":"Categorizing books and other archaic paper sources to a course reference or syllabus is a challenge in library science. The traditional way of categorization is manually done by professionals and the process of seeking and retrieving information can be frustrating. It needs intellectual tasks and conceptual analysis of a human effort to recognize similarities of items in determining the subject to the correct category. Unlike the traditional categorization process, the author implemented the concept of automatic document categorization for libraries using text mining. The project involves the creation of a web app and mobile app. This can be accomplished through the use of a supervised machine learning classification model using the Support Vector Machine algorithm that can predict the given category of data from the book or other archaic paper sources to the course syllabus they belong to.","PeriodicalId":244453,"journal":{"name":"Computer Science and Information Technology Trends","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124418990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of Communicating Stream X-Machine Tool for Modeling and Generating Test Cases for Automated Teller Machine","authors":"Bashir Adewale Sanusi, Emmanuel Kayode Akinshola Ogunshile, M. Aydin, Stephen Olatunde Olabiyisi, Mayowa Oyedepo Oyediran","doi":"10.5121/csit.2022.121407","DOIUrl":"https://doi.org/10.5121/csit.2022.121407","url":null,"abstract":"This paper builds on the existing formal method called Stream X-Machine by optimizing the theory and applying it in practice to a large-scale system. This optimized formal approach, called Communicating Stream X-Machine (CSXM), is applied to software testing of a distributed system based on its formal specifications, and the paper points out the advantages and limits of using existing formal methods at this level. However, despite the tremendous work that has been done in the software testing research area, finding the origin of bugs or defects in software is still costly and time-consuming. This paper shows that this current state-of-the-art challenge is due to the lack of a formal specification of what exactly a software system is supposed to do. In this paper, CSXM principles were used to develop an Automated Teller Machine (ATM) from a formal specification whose outputs conform with the implementation. Moreover, the Remote Method Invocation (RMI) network interface in Java was used to provide communication between the stand-alone systems, i.e., the client (ATM) and server (Bank) in the context of this paper. The results of this paper help software developers and researchers take early action on bugs or defects discovered by software testing.","PeriodicalId":244453,"journal":{"name":"Computer Science and Information Technology Trends","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127732269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prognosis of Indian Stock Price through Machine Learning Algorithm and Sentiment Analysis","authors":"Harshwardhan Patil, Rahul Patil","doi":"10.5121/csit.2022.121409","DOIUrl":"https://doi.org/10.5121/csit.2022.121409","url":null,"abstract":"Stock price forecasting is a difficult task due to the markets' flexible and highly volatile nature. This work looks into the machine learning field with a combined sentiment-based and quantitative strategy. Increasing computational capabilities, software-based statistical forecasting tools, and novel prediction methods are all combined. In this study, the next-day closing prices of the Indian equities SBIN and Tatamotors are used to compare the long short-term memory (LSTM), Random Forest, and linear regression algorithms. The models are assessed using the RMSE (root-mean-square error) indicator. The model with the lowest value of this indicator gives the most accurate closing price prediction.","PeriodicalId":244453,"journal":{"name":"Computer Science and Information Technology Trends","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125489643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Progress on Protein Structure Prediction using Various Soft Computing Techniques","authors":"Niharika Chaudhary, S. Saini","doi":"10.5121/csit.2022.121410","DOIUrl":"https://doi.org/10.5121/csit.2022.121410","url":null,"abstract":"In molecular and computational biology, predicting the three-dimensional structure of a protein from its amino acid sequence has long been an outstanding goal. Soft computing techniques for solving protein structure prediction problems have been gaining the attention of researchers because of their capacity to accommodate imprecision and uncertainty in vast and complicated search spaces. This paper provides a comprehensive overview of recent protein structure prediction efforts and progress using various soft computing techniques. This paper summarises key research in the field of protein structure prediction that has been published in the recent decade. Despite significant research efforts in recent decades, there is still a lot of room for improvement in this field.","PeriodicalId":244453,"journal":{"name":"Computer Science and Information Technology Trends","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121316315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Anchor Density Minimization for Localization in Wireless Sensor Network (WSN)","authors":"N. Zaarour, N. Hakem, Nahi Kandil","doi":"10.5121/csit.2021.112201","DOIUrl":"https://doi.org/10.5121/csit.2021.112201","url":null,"abstract":"In wireless sensor networks (WSN), high-accuracy localization is crucial both for WSN management and for numerous location-based applications. Only a subset of nodes in a WSN is deployed as anchor nodes, whose locations are known a priori and are used to localize unknown sensor nodes. The accuracy of the estimated position depends on the number of anchor nodes. Increasing the number or ratio of anchors increases the localization accuracy. However, it severely constrains the flexibility of WSN deployment while impacting costs and energy. This paper aims to drastically reduce the number or ratio of anchors in WSN deployment while ensuring a good trade-off for localization accuracy. Hence, this work presents an approach to decrease the number of anchor nodes without compromising localization accuracy. Assuming a random string WSN topology, the results in terms of anchor rates and localization accuracy are presented and show a significant reduction in anchor deployment rates from 32% to 2%.","PeriodicalId":244453,"journal":{"name":"Computer Science and Information Technology Trends","volume":"165 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121702514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emotions in Virtual Reality","authors":"Darlene Barker, H. Levkowitz","doi":"10.5121/csit.2021.112204","DOIUrl":"https://doi.org/10.5121/csit.2021.112204","url":null,"abstract":"One of the first senses we learn about at birth is touch, and the one sense that can deepen our experience of many situations is touch. In this paper we propose the use of emotions, including touch, within virtual reality (VR) to create a simulated closeness that currently can only be achieved with in-person interactions and communications. With the simulation of nonverbal cues, we can enhance a conversation or interaction in VR. Haptic devices deliver the simulation of touch between users via sensors, and machine learning performs emotion recognition based on the data collected, all working towards simulated closeness in communication despite distance or being in VR. We present a direction for further research on how to simulate in-person communication within VR with the use of emotion recognition and touch to achieve a close-to-real interaction.","PeriodicalId":244453,"journal":{"name":"Computer Science and Information Technology Trends","volume":"198 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131979003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}