Atomic Data and Nuclear Data Tables: Latest Articles

Classification of Cocoa Pod Maturity Using Similarity Tools on an Image Database: Comparison of Feature Extractors and Color Spaces
IF 1.8, Zone 3, Physics and Astrophysics
Atomic Data and Nuclear Data Tables Pub Date : 2023-05-30 DOI: 10.3390/data8060099
Kacoutchy Jean Ayikpa, Diarra Mamadou, P. Gouton, Kablan Jérôme Adou
Abstract: Côte d’Ivoire, the world’s largest cocoa producer, faces the challenge of quality production. Immature or overripe pods cannot produce quality cocoa beans, resulting in losses and an unprofitable harvest. To help farmer cooperatives determine the maturity of cocoa pods in time, our study evaluates the use of automation tools based on similarity measures. Although standard techniques, such as visual inspection and weighing, are commonly used to identify the maturity of cocoa pods, the use of automation tools based on similarity measures can improve the efficiency and accuracy of this process. We set up a database of cocoa pod images and used two feature extractors: one based on convolutional neural networks (CNN), in particular, MobileNet, and the other based on texture analysis using a gray-level co-occurrence matrix (GLCM). We evaluated the impact of different color spaces and feature extraction methods on our database. We used mathematical similarity measurement tools, such as the Euclidean distance, correlation distance, and chi-square distance, to classify cocoa pod images. Our experiments showed that the chi-square distance measurement offered the best accuracy, with a score of 99.61%, when we used GLCM as a feature extractor and the Lab color space. Using automation tools based on similarity measures can improve the efficiency and accuracy of cocoa pod maturity determination. The results of our experiments prove that the chi-square distance is the most appropriate measure of similarity for this task.
Citations: 0
Exploring the Evolution of Sentiment in Spanish Pandemic Tweets: A Data Analysis Based on a Fine-Tuned BERT Architecture
IF 1.8, Zone 3, Physics and Astrophysics
Atomic Data and Nuclear Data Tables Pub Date : 2023-05-29 DOI: 10.3390/data8060096
Carlos Miranda, G. Sanchez-Torres, Dixon Salcedo Morillo
Abstract: The COVID-19 pandemic has had a significant impact on various aspects of society, including economic, health, political, and work-related domains. The pandemic has also had an emotional effect on individuals, reflected in their opinions and comments on social media platforms, such as Twitter. This study explores the evolution of sentiment in Spanish pandemic tweets through a data analysis based on a fine-tuned BERT architecture. A total of six million tweets were collected using web scraping techniques, and pre-processing was applied to filter and clean the data. The fine-tuned BERT architecture was utilized to perform sentiment analysis, which allowed for a deep-learning approach to sentiment classification. The analysis results were graphically represented based on search criteria, such as “COVID-19” and “coronavirus”. This study reveals sentiment trends, significant concerns, relationship with announced news, public reactions, and information dissemination, among other aspects. These findings provide insight into the emotional impact of the COVID-19 pandemic on individuals and the corresponding impact on social media platforms.
Citations: 0
A Dataset of Scalp EEG Recordings of Alzheimer's Disease, Frontotemporal Dementia and Healthy Subjects from Routine EEG
IF 1.8, Zone 3, Physics and Astrophysics
Atomic Data and Nuclear Data Tables Pub Date : 2023-05-27 DOI: 10.3390/data8060095
Andreas Miltiadous, Katerina D. Tzimourta, Theodora Afrantou, P. Ioannidis, N. Grigoriadis, D. Tsalikakis, P. Angelidis, M. Tsipouras, E. Glavas, N. Giannakeas, A. Tzallas
Abstract: Recently, there has been a growing research interest in utilizing the electroencephalogram (EEG) as a non-invasive diagnostic tool for neurodegenerative diseases. This article provides a detailed description of a resting-state EEG dataset of individuals with Alzheimer’s disease and frontotemporal dementia, and healthy controls. The dataset was collected using a clinical EEG system with 19 scalp electrodes while participants were in a resting state with their eyes closed. The data collection process included rigorous quality control measures to ensure data accuracy and consistency. The dataset contains recordings of 36 Alzheimer’s patients, 23 frontotemporal dementia patients, and 29 healthy age-matched subjects. For each subject, the Mini-Mental State Examination score is reported. A monopolar montage was used to collect the signals. Both raw and preprocessed EEG recordings are included in the standard BIDS format. For the preprocessed signals, established methods such as artifact subspace reconstruction and independent component analysis were employed for denoising. The dataset has significant reuse potential since machine learning studies of Alzheimer’s EEG are increasing in popularity and there is a lack of publicly available EEG datasets. The resting-state EEG data can be used to explore alterations in brain activity and connectivity in these conditions, and to develop new diagnostic and treatment approaches. Additionally, the dataset can be used to compare EEG characteristics between different types of dementia, which could provide insights into the underlying mechanisms of these conditions.
Citations: 6
Electron scattering cross sections for the ground and excited states of tin
IF 1.8, Zone 3, Physics and Astrophysics
Atomic Data and Nuclear Data Tables Pub Date : 2023-05-26 DOI: 10.1016/j.adt.2023.101586
Haadi Umer, Yuri Ralchenko, Igor Bray, Dmitry V. Fursa
Vol. 154, Article 101586
Abstract: A comprehensive set of cross sections for electron scattering from the ground and first four excited states of tin has been calculated using the Relativistic Convergent Close-Coupling method. Elastic scattering, momentum transfer, total scattering, and total-inelastic scattering cross sections have been produced for the $5p^2\,{}^3P_{0,1,2}$, ${}^1D_2$ and ${}^1S_0$ states of atomic tin over a projectile energy range of 0.1 eV to 1000 eV. Over the same projectile energy range, state-resolved cross sections for excitations to the $5p^2$, $5p6s$, $5p5d$ and $5p6p$ manifolds from the ground and first four excited states of tin are presented. Total single-ionisation cross sections have been calculated which account for the direct ionisation of electrons in the valence $5p$ and closed $5s$ shells, as well as indirect contributions from excitation auto-ionisation. These ionisation cross sections are presented for projectile energies up to 1000 eV. Maxwellian rate coefficients have been calculated for all studied transitions over electron temperatures ranging from 0.5 eV to 200 eV and fitted with simple formulas. The fit coefficients are tabulated for use in modelling applications.
Citations: 1
MicroRNA Profiling of Fresh Lung Adenocarcinoma and Adjacent Normal Tissues from Ten Korean Patients Using miRNA-Seq
IF 1.8, Zone 3, Physics and Astrophysics
Atomic Data and Nuclear Data Tables Pub Date : 2023-05-25 DOI: 10.3390/data8060094
Jihye Park, S. Na, Jung-Sook Yoon, Seoree Kim, S. Chun, Jae Jun Kim, Young-Du Kim, Young‐Ho Ahn, Keunsoo Kang, Y. Ko
Abstract: MicroRNA transcriptomes from fresh tumors and the adjacent normal tissues were profiled in 10 Korean patients diagnosed with lung adenocarcinoma using a next-generation sequencing (NGS) technique called miRNA-seq. The sequencing quality was assessed using FastQC, and low-quality or adapter-contaminated portions of the reads were removed using Trim Galore. Quality-assured reads were analyzed using miRDeep2 and Bowtie. The abundance of known miRNAs was estimated using the reads per million (RPM) normalization method. Subsequently, using DESeq2 and Wx, we identified differentially expressed miRNAs and potential miRNA biomarkers for lung adenocarcinoma tissues compared to adjacent normal tissues, respectively. We defined reliable miRNA biomarkers for lung adenocarcinoma as those detected by both methods. The miRNA-seq data are available in the Gene Expression Omnibus (GEO) database under accession number GSE196633, and all processed data can be accessed via the Mendeley data website.
Citations: 0
Target Screening of Chemicals of Emerging Concern (CECs) in Surface Waters of the Swedish West Coast
IF 1.8, Zone 3, Physics and Astrophysics
Atomic Data and Nuclear Data Tables Pub Date : 2023-05-25 DOI: 10.3390/data8060093
P. Inostroza, Eric Carmona, Å. Arrhenius, M. Krauss, W. Brack, T. Backhaus
Abstract: The aquatic environment faces increasing threats from a variety of unregulated organic chemicals originating from human activities, collectively known as chemicals of emerging concern (CECs). These include pharmaceuticals, personal-care products, pesticides, surfactants, industrial chemicals, and their transformation products. CECs enter aquatic environments through various sources, including effluents from wastewater treatment plants, industrial facilities, runoff from agricultural and residential areas, as well as accidental spills. Data on the occurrence of CECs in the marine environment are scarce, and more information is needed to assess the chemical and ecological status of water bodies, and to prioritize toxic chemicals for further studies or risk assessment. In this study, we describe a monitoring campaign targeting CECs in surface waters on the Swedish west coast using, for the first time, an on-site large volume solid phase extraction (LVSPE) device. We detected up to 80 and 227 CECs in marine sites and the wastewater treatment plant (WWTP) effluent, respectively. The dataset will contribute to defining pollution fingerprints and assessing the chemical status of marine and freshwater systems affected by industrial hubs, agricultural areas, and the discharge of urban wastewater.
Citations: 3
Low-Dose Radiation-Induced Transcriptomic Changes in Diabetic Aortic Endothelial Cells
IF 1.8, Zone 3, Physics and Astrophysics
Atomic Data and Nuclear Data Tables Pub Date : 2023-05-18 DOI: 10.3390/data8050092
Jihye Park, Kyuho Kang, Y. Son, Kwang Seok Kim, Keunsoo Kang, Hae-June Lee
Abstract: Low-dose radiation refers to exposure to ionizing radiation at levels that are generally considered safe and not expected to cause immediate health effects. However, the effects of low-dose radiation are still not fully understood, and research in this area is ongoing. In this study, we investigated the alterations in gene expression profiles of human aortic endothelial cells (HAECs) and diabetic human aortic endothelial cells (T2D-HAECs) derived from patients with type 2 diabetes. To this end, we used RNA-seq to profile the transcriptomes of cells exposed to varying doses of low-dose radiation (0.1 Gy, 0.5 Gy, and 2.0 Gy) and compared them to a control group with no radiation exposure. Differentially expressed genes and enriched pathways were identified using the DESeq2 and gene set enrichment analysis (GSEA) methods, respectively. The data generated in this study are publicly available through the Gene Expression Omnibus (GEO) database with the accession number GSE228572. This study provides a valuable resource for examining the effects of low-dose radiation on HAECs and T2D-HAECs, thereby contributing to a better understanding of the potential human health risks associated with low-dose radiation exposure.
Citations: 0
A Set of Geophysical Fields for Modeling of the Lithosphere Structure and Dynamics in the Russian Arctic Zone
IF 1.8, Zone 3, Physics and Astrophysics
Atomic Data and Nuclear Data Tables Pub Date : 2023-05-14 DOI: 10.3390/data8050091
A. Soloviev, A. Petrunin, Sofia Gvozdik, R. Sidorov
Abstract: This paper presents a set of various geological and geophysical data for the Arctic zone, including some detailed models for the eastern part of the Russian Arctic zone. This hard-to-access territory has a complex geological structure, which is poorly studied by direct geophysical methods. Therefore, these data can be used in an integrative analysis for different purposes. The data comprise the gravity field, heat flow, and various seismic tomography models. The gravity field data include several reductions calculated during our preceding studies, which are more appropriate for the study of the Earth’s interior than the initial free air anomalies. Specifically, these are the Bouguer, isostatic, and decompensative gravity anomalies. A surface heat flow map included in the dataset is based on a joint inversion of multiple geophysical data constrained by the observations from the International Heat Flow Commission catalog. Available seismic tomography models were analyzed to select the best one for further investigation. We provide the models for the sedimentary cover and the Moho depth, which are significantly improved compared to the existing ones. The database provides a basis for qualitative and quantitative analysis of the region.
Citations: 0
An Efficient Deep Learning for Thai Sentiment Analysis
IF 1.8, Zone 3, Physics and Astrophysics
Atomic Data and Nuclear Data Tables Pub Date : 2023-05-13 DOI: 10.3390/data8050090
Nattawat Khamphakdee, Pusadee Seresangtakul
Abstract: The number of reviews from customers on travel websites and platforms is quickly increasing. They provide people with the ability to write reviews about their experience with respect to service quality, location, room, and cleanliness, thereby helping others before booking hotels. Many people fail to consider hotel bookings because the numerous reviews take a long time to read, and many are in a non-native language. Thus, hotel businesses need an efficient process to analyze and categorize the polarity of reviews as positive, negative, or neutral. In particular, low-resource languages such as Thai have greater limitations in terms of resources to classify sentiment polarity. In this paper, a sentiment analysis method is proposed for Thai sentiment classification in the hotel domain. Firstly, the Word2Vec technique (the continuous bag-of-words (CBOW) and skip-gram approaches) was applied to create word embeddings of different vector dimensions. Secondly, each word embedding model was combined with deep learning (DL) models to observe the impact of each word vector dimension result. We compared the performance of nine DL models (CNN, LSTM, Bi-LSTM, GRU, Bi-GRU, CNN-LSTM, CNN-BiLSTM, CNN-GRU, and CNN-BiGRU) with different numbers of layers to evaluate their performance in polarity classification. The dataset was classified using the FastText and BERT pre-trained models to carry out the sentiment polarity classification. Finally, our experimental results show that the WangchanBERTa model slightly improved the accuracy, producing a value of 0.9225, and the skip-gram and CNN model combination outperformed other DL models, reaching an accuracy of 0.9170. From the experiments, we found that the word vector dimensions, hyperparameter values, and the number of layers of the DL models affected the performance of sentiment classification. Our research provides guidance for setting suitable hyperparameter values to improve the accuracy of sentiment classification for the Thai language in the hotel domain.
Citations: 3
A Comprehensive Dataset of Spelling Errors and Users' Corrections in Croatian Language
IF 1.8, Zone 3, Physics and Astrophysics
Atomic Data and Nuclear Data Tables Pub Date : 2023-05-12 DOI: 10.3390/data8050089
G. Gledec, M. Horvat, M. Mikuc, B. Blašković
Abstract: This paper presents a unique and extensive dataset containing over 33 million entries with pairs in the form “spelling error → correction” from ispravi.me, the most popular Croatian online spellchecking service, collected since 2008. The dataset, compiled from the contribution of nearly 900,000 users, is a valuable resource for researchers and developers in the field of natural language processing (NLP), improving spellcheck accuracy, and language learning applications. The dataset may be used to accomplish several goals: (1) improving spellchecking accuracy by incorporating common user corrections and reducing false positives and negatives; (2) helping language learners identify common errors and learn correct spelling through targeted feedback; (3) analyzing data trends and patterns to uncover the most common spelling errors and their underlying causes; (4) identifying and evaluating factors that influence typing input; (5) improving NLP applications such as text recognition and machine translation. Tasks specific to the Croatian language include the creation of a letter-level confusion matrix and the refinement of word suggestions based on historical usage of the service. This comprehensive dataset provides researchers and practitioners with a wealth of information, opening the path for advancements in spellchecking, language learning, and NLP applications in the Croatian language.
Citations: 0