{"title":"A Sinhala and Tamil Extension to Generic Environment for Context-aware Correction","authors":"Lakshikka Sithamparanathan, T. Uthayasanker","doi":"10.1109/NITC48475.2019.9114399","DOIUrl":"https://doi.org/10.1109/NITC48475.2019.9114399","url":null,"abstract":"There are several types of research available on spell checkers for European languages and Indian languages. However, low resourced languages like Tamil & Sinhala have limited research in this problem space, maybe, because of its highly inflectional and morphologically rich nature. There is no fully functional context-aware spell-checking system, especially as an open source. A Generic Environment for context-aware spell correction approach is extended for resource-scarce languages: Sinhala and Tamil in this paper. Experimental results show that our system detects the error in spelling well and provides the most suitable suggestions for correcting the misspelled words with a minimum of 85% accuracy for Tamil and 70% for the Sinhala Language. This is the first ever context-aware spell corrector for the Sinhala language. Compared to prior Tamil language context-aware spell correctors this leaps in 1) modularized architecture and 2) increased coverage and accuracy. Moreover, this study produced a Tamil and Sinhala spell correction benchmark dataset. Both the dataset and the tools are available for public use.","PeriodicalId":386923,"journal":{"name":"2019 National Information Technology Conference (NITC)","volume":"36 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131328803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Demonstration and Validation of an Advanced Symptom Checker","authors":"A. Perera","doi":"10.1109/NITC48475.2019.9114500","DOIUrl":"https://doi.org/10.1109/NITC48475.2019.9114500","url":null,"abstract":"Symptom checkers are becoming very popular. Yet a recent review on them concluded that the performance of most of these are suboptimal despite much hype. Taking into account lessons learned from the development and deployment of some of the most current popular symptom checkers available in the Internet, an advanced symptom checker was designed and developed with the following features. They were - to diagnose the common problems encountered in the medical self-care in the community, to provide evidence based treatment mostly with over the counter drugs and to provide timely and appropriate referral to other specialty physicians. The proposed solution will be referred to as Computer Assisted Evaluation of Symptoms based symptom checker (CAMEOS - CHECKER).","PeriodicalId":386923,"journal":{"name":"2019 National Information Technology Conference (NITC)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129193932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measuring Software Integration Effort: Identifying Factors Affecting Integration of Software Systems","authors":"H. Edirisinghe, S. Thelijjagoda, D. Nawinna","doi":"10.1109/NITC48475.2019.9114511","DOIUrl":"https://doi.org/10.1109/NITC48475.2019.9114511","url":null,"abstract":"This paper investigates the key challenges affecting the software integration success. An exploratory factor analysis was carried out to statistically derive significant factors that impact system integration. The analysis was done based on ten aspects of system integration challenges derived from literature and interviews. The study was conducted in Sri Lanka using qualitative and quantitative tools. Ten aspects of integration challenges were divided into three groups, namely Pre Integration challenges; Ongoing integration challenges; and Post integration challenges. The resulting measurement model could be used by software vendors to assess the effort required for software integration in projects.","PeriodicalId":386923,"journal":{"name":"2019 National Information Technology Conference (NITC)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132406400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Tokenizer for Sinhala Language","authors":"S. Y. Senanayake, K. Kariyawasam, P. Haddela","doi":"10.1109/NITC48475.2019.9114420","DOIUrl":"https://doi.org/10.1109/NITC48475.2019.9114420","url":null,"abstract":"Tokenization process plays a prominent role in natural language processing (NLP) applications. It chops the content into the smallest meaningful units. However, there is a limited number of tokenization approaches for Sinhala language. Standard analyzer in apache software library and natural language toolkit (NLTK) are the main existing approaches to tokenize Sinhala language content. Since these are language independent, there are some limitations when it applies to Sinhala. Our proposed Sinhala tokenizer is mainly focusing on punctuation-based tokenization. It precisely tokenizes the content by identifying the use case of punctuation mark. In our research, we have proved that our punctuation-based tokenization approach outperforms the word tokenization in existing approaches.","PeriodicalId":386923,"journal":{"name":"2019 National Information Technology Conference (NITC)","volume":"460 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130605991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Stopword Removal for Sinhala Language","authors":"A.A.V.A Jayaweera, Y.N Senanayake, P. Haddela","doi":"10.1109/NITC48475.2019.9114476","DOIUrl":"https://doi.org/10.1109/NITC48475.2019.9114476","url":null,"abstract":"In the modern era of information retrieval, text summarization, text analytics, extraction of redundant (noise) words that contain a little information with low or no semantic meaning must be filtered out. Such words are known as stopwords. There are more than 40 languages which have identified their language specific stopwords. Most researchers use various techniques to identify their language specific stopword lists. But most of them try to define a magical cut-off point to the list, which they identify without any proof. In this research, the focus is to prove that the cut-off point depends on the source data and the machine learning algorithm, which will be proved by using Newton's iteration method of root finding algorithm. To achieve this, the research focuses on creating a stopword list for Sinhala language using the term frequency-based method by processing more than 90000 Sinhala documents. This paper presents the results received and new datasets prepared for text preprocessing.","PeriodicalId":386923,"journal":{"name":"2019 National Information Technology Conference (NITC)","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128607929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context Rich Hybrid Navigation Using WiFi and Geomagnetic Sensors in Smartphones and Map Generation Using Lidar","authors":"A. Jayakody, S. Lokuliyana, V.N.N. Weerawardene, K.D.P.S. Somathilake, A.M.D.D.U. Ishara","doi":"10.1109/NITC48475.2019.9114502","DOIUrl":"https://doi.org/10.1109/NITC48475.2019.9114502","url":null,"abstract":"Navigation systems perform a huge role in traveling component of life. Most importantly it helps people to get to places even in foreign or unfamiliar environments. This research introduces a way of mapping environments with less effort, which shows that mapping an indoor environment is an easy task that could be performed by any tech savvy individual. This has been done by examining several projects and researches conducted by various personnel and organizations including NASA. It has become clear that the technology ‘LIDAR,’ is clearly feasible for the requirement of indoor map generation. A software was later built to accommodate the device which is built using LIDAR and to give the user a better experience in map generation. The software helps to overcome the limitations that are imposed by the device. The overall product with the device and software integrated provides an ideal low-budget solution for the users. The proposed system service features three highly desirable properties, namely accuracy, scalability, and crowdsourcing. IPS is implemented with a set of crowdsourcing-supportive mechanisms to handle the collective amount of raw data, filter incorrect user contributions and exploit Wi-Fi data from diverse mobile devices. Furthermore, it uses a big-data architecture for efficient storage and retrieval of localization and mapping data. In this research, the service relies on the sensitive data collected by smartphones (Wi-Fi signal strength and geomagnetic measurements) to deliver reliable indoor geolocation information.","PeriodicalId":386923,"journal":{"name":"2019 National Information Technology Conference (NITC)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114634216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Real-Time Cardiac Arrhythmia Classifier","authors":"Himeshika Abayaratne, Shalindri Perera, Erandi De Silva, Pramadhi Atapattu, M. Wijesundara","doi":"10.1109/NITC48475.2019.9114464","DOIUrl":"https://doi.org/10.1109/NITC48475.2019.9114464","url":null,"abstract":"Cardiovascular diseases (CVD) have increased drastically among Non-Communicable diseases, which have peaked over the past recent years. In 2018, around 17.9 million which is an estimated 31% of the people have died worldwide due to CVDs. A novel machine learning algorithm for continuous monitoring, identification and classification of cardiac arrhythmias from Electrocardiogram (ECG) data is presented here. The proposed solution has two stages where the first stage is a rule based cardiac abnormality identification which has an individual 97.55% ± 0.3% of accuracy (Acc) for a dataset of 705,000 and the second stage is a Neural Network (NN) based classification model which is trained and tested to identify 15 different classes recommended by ANSI/AAMI standard [1], and has 97.1% of individual accuracy for MIT-BIH Arrhythmia dataset [2] of 96265 beat samples. The combined real-time cardiac arrhythmia classifier is parallelized with CUDA in order to utilize the GPU and increase the execution speed by 4.86 times.","PeriodicalId":386923,"journal":{"name":"2019 National Information Technology Conference (NITC)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126062708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Categorical Classification Approach for Identifying Multi-SIM Users from Call Detail Records","authors":"Charith Soysa, Savindi Karunathilaka, Amali Matharaarachchi, Himashi Rodrigo, Uthayasanker Thayasivam","doi":"10.1109/NITC48475.2019.9114444","DOIUrl":"https://doi.org/10.1109/NITC48475.2019.9114444","url":null,"abstract":"In this paper, we present a categorical classification approach for identifying multi-SIM users from Call Detail Records. Multi-SIM user classification is an unexplored domain in research literature and remains a challenging problem due to the diversity in telecom user population. This paper presents a subpopulation-based classification approach which incorporates this variety into the model, which is able to identify multi-SIM usage with higher precision and recall. A comparison of our approach to other baseline approaches (Gaussian Naive Bayes, Bernoulli Naive Bayes & Linear SVC) shows the effectiveness of subsample modelling for detecting multi-SIM usage. Additionally, we present an empirical study with which we quantify the contribution of oversampling and feature selection for multi-SIM detection. Further, using feature importance, we are able to identify possible rationales behind multi-SIM usage.","PeriodicalId":386923,"journal":{"name":"2019 National Information Technology Conference (NITC)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124602617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fraud Detection Solution for Monetary Transactions with Autoencoders","authors":"Lakshika Sammani Chandradeva, I. Jayasooriya, A. Aponso","doi":"10.1109/NITC48475.2019.9114519","DOIUrl":"https://doi.org/10.1109/NITC48475.2019.9114519","url":null,"abstract":"Fraud has turned into a trillion-dollar industry which may lead to risk of financial loss as well as the loss of customers' and stakeholders' confidence on financial organizations. Nowadays, online transactions, mobile wallets and payment card transactions are becoming more popular within society. With the growth of such cashless transactions, the number of fraudulent activities in the world is also increasing. According to the current global economic context, efforts being made to detect and prevent frauds are also increasing. Having an effective financial transaction fraud detection system could save trillions of dollars from fraudulent activities. Supervised machine learning based fraud detection solution is the trending mechanism used in fraud detection solutions. Nevertheless, such supervised machine learning based solutions need a labelled dataset in order to train the machine learning model. The reason for the existence of current fraudulent actions is that labelled datasets are hard to find in real-world environments, and if such labelled datasets are available, thereafter such fraud detection solutions would detect fraudulent patterns based on the fraudulent patterns of the fraudulent events in the training labelled dataset. Therefore, there is an extensive business requirement of having a fraud detection solution which can be trained using a raw financial transaction dataset, in other words using an unlabelled dataset which is commonly available in financial transaction systems in order to detect accurate fraudulent events. Test results obtained for the synthetically generated dataset shows that autoencoder is able to detect fraudulent transaction events with 83% of AUC score which represents high capability of binary classification as fraudulent or genuine transactions.","PeriodicalId":386923,"journal":{"name":"2019 National Information Technology Conference (NITC)","volume":"IM-25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126609818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"UniOntBot: Semantic Natural Language Generation based API approach for Chatbot Communication","authors":"Lakindu Gunasekara, Kaneeka. Vidanage","doi":"10.1109/NITC48475.2019.9114440","DOIUrl":"https://doi.org/10.1109/NITC48475.2019.9114440","url":null,"abstract":"Natural Language Generation is a sub task of the natural language processing where the machine represents texts in human understandable language. Although there have been various researches carried out for Natural Language Generation since the 1970s, there are only few available work carried out with semantic technologies compared to linguistic surface oriented structures. This paper presents the available work which combine semantic technologies with Natural Language Generation, and it identifies opportunities and drawbacks of such systems. The research is carried out on how to use semantic natural language generation with chatbots with lower computational cost and on the ability to reuse it for similar domains with less coding on natural language generation component for small scale level domains. Additionally, a new architecture for chatbot using an ontology and a new domain ontology with natural language resource bind are contributed by the researchers.","PeriodicalId":386923,"journal":{"name":"2019 National Information Technology Conference (NITC)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122231505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}