2017 International Conference on Asian Language Processing (IALP): latest papers

Filipino and English clickbait detection using a long short term memory recurrent neural network
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300597
Philogene Kyle Dimpas, Royce Vincent Po, M. J. Sabellano
Abstract: Filipinos are very active social media users, which makes them prime targets for posts, blogs, and news items that earn revenue from clicks. Such content usually relies on tempting headlines to draw users into clicking. In the Philippines, where fake news is rampant, spreading false news through clickbait headlines can cause considerable damage and confusion. This research gathers Filipino and English headlines (English being one of the official languages of the Philippines) and determines whether each one is clickbait. The neural network architecture is based on a bidirectional long short-term memory (BiLSTM) network, with Word2Vec providing word embeddings learned from the corpora. In the experiments, the model reached 91.5% accuracy.
Citations: 12
Adapting monolingual resources for code-mixed Hindi-English speech recognition
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300583
Ayushi Pandey, B. M. L. Srivastava, S. Gangashetty
Abstract: The paper presents an automatic speech recognition (ASR) system for code-mixed read speech in Hindi-English, built by extrapolating from monolingual training resources. A monolingual Hindi acoustic model, combined with code-mixed speech data, is used to train a neural-network-based speech recognition framework. The test corpus follows a similar structure, containing both monolingual and code-mixed speech. The shared phonetic transcription, captured in WX notation, is exploited to harness the commonality between the pooled phone sets of Hindi and English. Experiments were conducted with two formulations of a trigram language model. (1) In the first experiment, the language model contains no out-of-vocabulary words, because the test utterances are included in its training data; the word error rate is 10.63%. (2) In the second experiment, the test utterances are excluded from the language model's training data; the word error rate is 41.66%.
Citations: 13
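The abstract above reports its results as word error rates (WER). As a minimal sketch of how WER is conventionally computed (word-level edit distance divided by the number of reference words), the function below may help when reading those figures. It is an illustration only, not the authors' scoring tool, and the function name `word_error_rate` is a hypothetical choice.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.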
Semantic-frame representation for event detection on Twitter
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300594
Yanxia Qin, Yue Zhang, Min Zhang, De-Kui Zheng
Abstract: Unsupervised methods for detecting news events from tweet streams cluster feature representations by their burstiness and filter out the more newsworthy clusters as outputs. Words, segments, and tweets have been used as event feature representations, with segments being state-of-the-art due to their balance of expressive power and non-sparsity. However, segments do not convey structural event information, which makes the output clusters difficult to understand. We investigate the use of semantic frame elements instead of segments as event features, observing not only better readability but also improvements in both precision and recall, thanks to the noise-filtering effect of frame extraction.
Citations: 3
Joint bi-affine parsing and semantic role labeling
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300612
Peng Shi, Yue Zhang
Abstract: We propose a simple encoder-decoder model for joint learning of dependency parsing and semantic role labeling (SRL). Experiments on the CoNLL-2009 datasets show that our model is competitive with the state-of-the-art ensemble model on the SRL task and significantly outperforms state-of-the-art joint models on joint evaluation metrics. The results show that, with implicit encoding, syntactic information can further improve a state-of-the-art semantic role labeler.
Citations: 5
Joint learning of contextal and global features for named entity disambiguation
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300533
Bo Ma, Tonghai Jiang, Yating Yang, Xi Zhou, Lei Wang
Abstract: Named entity disambiguation (NED) is an important stage in natural language processing (NLP) that automatically resolves mentions to entities in a given knowledge base (KB) such as Wikipedia. NED is a complex and challenging problem because of the inherent ambiguity between real-world mentions and the entities they refer to. Most existing studies use hand-crafted features to represent mentions, context, and entities, which is labor intensive. In this paper, we address this problem by presenting a new NED model that combines local, context, and global evidence. By leveraging learned mixed dense word-level and topic-level representations together with a graph-based disambiguation approach, the model captures context and global features well. Experiments on the AIDA dataset show that the proposed model obtains state-of-the-art results.
Citations: 0
Correcting misuse of Japanese visually similar characters
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300545
Youichiro Ogawa, Kazuhide Yamamoto
Abstract: We present a language-model-based method for correcting the misuse of visually similar Japanese characters (Kanji). Although methods for error correction in Japanese learners' writing have been proposed, the misuse of visually similar Kanji has not yet been explored. We collected pairs or groups of visually similar Kanji and created a similar-Kanji set. Candidate sentences are then generated by replacing the misused Kanji with similar Kanji drawn from the set, and the candidate with the highest language model probability is selected. The experimental results suggest that the method performs well in many cases of misuse. In addition, using a morphological analyzer, we developed an unknown-word filter that excludes candidates containing unknown words during candidate generation. We found that this filter is effective in preventing erroneous corrections.
Citations: 1
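The candidate-generation-and-rescoring idea in the abstract above can be sketched roughly as follows. This is a toy illustration under loud assumptions: the similar-Kanji groups in `SIMILAR_KANJI` are made up for the example, the language model is a tiny add-one-smoothed character bigram model rather than whatever model the authors used, every character position is tried with a single replacement, and the unknown-word filter is omitted. All function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical similar-Kanji groups; the paper's actual set is not given here.
SIMILAR_KANJI = [{"末", "未"}, {"土", "士"}, {"人", "入"}]


def train_bigram_lm(corpus):
    """Return a log-probability function from an add-one-smoothed character bigram model."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        chars = ["<s>"] + list(sentence) + ["</s>"]
        unigrams.update(chars[:-1])
        bigrams.update(zip(chars[:-1], chars[1:]))
    vocab_size = len(set(unigrams) | {"</s>"})

    def log_prob(sentence):
        chars = ["<s>"] + list(sentence) + ["</s>"]
        return sum(
            math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
            for a, b in zip(chars[:-1], chars[1:])
        )

    return log_prob


def candidates(sentence):
    """Yield the sentence itself plus every single-character swap within a similar group."""
    yield sentence
    for i, ch in enumerate(sentence):
        for group in SIMILAR_KANJI:
            if ch in group:
                for alt in group - {ch}:
                    yield sentence[:i] + alt + sentence[i + 1:]


def correct(sentence, log_prob):
    """Pick the candidate sentence with the highest language model score."""
    return max(candidates(sentence), key=log_prob)
```

For instance, with `lm = train_bigram_lm(["週末に行く", "未来の話"])`, calling `correct("週未に行く", lm)` prefers the candidate whose bigrams were seen in the toy corpus. Keeping the unchanged sentence among the candidates means a correct input is left alone when it already scores highest.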
Isolated digit Filipino speech recognition through spectrogram image classification: Towards application in a disaster preparedness participatory toolkit
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300539
Julie Ann A. Salido, Nathaniel Oco, R. Roxas, Emmanuel Malaay, Michael Simora, R. J. Cabatic
Abstract: In this paper, we present our work on isolated digit speech recognition by classifying spectrogram images, for use in a disaster preparedness participatory toolkit. To achieve higher inclusivity, we included a voice component to reach a wider range of respondents, especially those with low literacy and those who are vision impaired. We approach speech recognition by classifying spectrogram images, a deviation from the usual approaches, which normally work on acoustic coefficients and features. As our initial test bed, we focused on the Filipino language, a member of the Malayo-Polynesian language family and the national language of the Philippines. Our data covers 4,297 utterances of the Filipino digits 0 to 9 collected from 262 speakers, divided into 3 parts: 70% for training, 20% for testing, and 10% for validation. We applied the short-time Fourier transform to our training data and used convolutional neural networks in MATLAB to classify the spectrogram images. The lowest accuracy rate during our tests was 93.02%. Analysis of the results shows that background noise is the cause of the misclassified utterances, which are discussed further in the paper. While the results are promising, the work can be extended to include closely related languages.
Citations: 0
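The 70/20/10 partition mentioned in the abstract above can be sketched as below. The abstract does not say how the split was drawn (for example, randomly or by speaker), so the seeded random shuffle here is purely an assumption, and `split_70_20_10` is a hypothetical helper name.

```python
import random


def split_70_20_10(items, seed=0):
    """Shuffle items and split them into 70% training, 20% testing, 10% validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n = len(shuffled)
    n_train = n * 70 // 100
    n_test = n * 20 // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder, roughly 10%
    return train, test, validation
```

With 4,297 items this integer arithmetic yields 3,007 training, 859 testing, and 431 validation items; a speaker-disjoint split would need the grouping done per speaker instead.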
Named entity transliteration with sequence-to-sequence neural network
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300621
Zhongwei Li, Chng Eng Siong, Haizhou Li
Abstract: Named entities are often rare words, and their transliteration across languages has been a challenging task. In this paper, we study a novel technique that segments a named entity into a sequence of sub-words or characters. We propose to learn the transliteration mechanism using a sequence-to-sequence neural network. Applying the proposed technique to personal name transliteration on an LDC dataset, we show an improvement of more than 10 BLEU points over the competing statistical method on the same corpus.
Citations: 7
Analyzing word embeddings and improving POS tagger of Tigrinya
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300559
Yemane Tedla, Kazuhide Yamamoto
Abstract: In this paper, we analyze word embeddings for a morphologically rich language, Tigrinya. Tigrinya is a Semitic language spoken natively in Eritrea and Ethiopia by over seven million people. The unique and complex morphology of Semitic languages, which include Arabic, Amharic, and Hebrew, is commonly known as "root and template pattern" morphology. This morphology generates a large number of inflected forms that often cause out-of-vocabulary (OOV) problems in language processing. The problem is more challenging for low-resource languages such as Tigrinya, which have very few annotated resources. Word embedding methods, given a large raw text corpus, form semantic and syntactic vector representations of words. Therefore, we construct a new text corpus and investigate the optimal settings for generating word vectors for Tigrinya. We also use word embeddings to improve the performance of a Tigrinya part-of-speech tagger created from a small tagged corpus.
Citations: 7
Qualitative data analysis of disaster risk reduction suggestions assisted by topic modeling and word2vec
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300601
Ken Gorro, J. Ancheta, Kris Capao, Nathaniel Oco, R. Roxas, M. J. Sabellano, Brandie Nonnecke, Shrestha Mohanty, Camille Crittenden, Ken Goldberg
Abstract: In this study, we examine suggestions for disaster risk reduction strategies provided by residents of selected disaster-prone areas in the Philippines. The study uses 976 suggestions on how the residents' barangay can help them better prepare for a disaster. These were collected through Malasakit, an e-participation platform designed by the University of California, Berkeley and National University (Philippines) to engage community participation in gathering qualitative and quantitative data. Analyses were conducted through biterm topic modeling (BTM) and word embedding using gensim. For better accuracy, data preprocessing was performed to remove irrelevant or noisy data. Based on the BTM result, we identified the following important codes: preparedness, disaster, awareness, community, help, seminars, kanal (canal), linisin (clean), drainage, garbage, basura (garbage). Analyses of the topic models show that disaster preparedness is an integral part of disaster risk reduction, through improving solid waste management and providing seminars for public awareness and evacuation preparation. A word intrusion test was conducted in which BTM scored 55.71%, which implies strong cohesion of the words with their topics. For word embedding, we drilled down on the following words: community, preparedness, emergency, barangay (village), help, kanal (drainage), basura (garbage), awareness, seminars, information. The word2vec results have a cosine similarity score of 0.902, which implies strong relatedness of each word. The result shows that the participants give importance to community preparedness for emergencies, helping the barangay in clean-up drives, and awareness through seminars and information dissemination.
Citations: 18
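The abstract above summarizes its word2vec results with a cosine similarity score. As a reading aid, here is a minimal sketch of how cosine similarity between two word vectors is computed. The function name is hypothetical, any vectors passed to it in examples are invented, and the snippet does not reproduce how the authors aggregated their reported score.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)
```

Values near 1 indicate vectors pointing in nearly the same direction, 0 indicates orthogonal vectors, and the magnitude of each vector does not affect the result.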