MethodsX: Latest Articles

Establishing a quantitative link between crystal violet absorbance and biomass in biofilms
IF 1.9
MethodsX Pub Date: 2025-09-16 DOI: 10.1016/j.mex.2025.103630
Sheida Stephens, Radhakrishnan Mahadevan, D. Grant Allen
Volume 15, Article 103630
Abstract: Quantifying biomass is essential for studying biofilms in biomanufacturing, yet widely used assays such as crystal violet (CV) often yield results that are difficult to compare across laboratories because they rely on absorbance as a subjective proxy for biomass. To address this, we present a standardized method that calibrates CV absorbance against cellular optical density (OD) and dry cell weight (DCW) using centrifuged planktonic cultures. By establishing a three-way correlation among OD, DCW, and CV absorbance, this simple method allows for quantitative, reproducible, and comparable biomass measurements. Validation across Escherichia coli strains and Rhodopseudomonas palustris showed strong linearity, particularly when using 10% acetic acid as the solvent. Seasonal and instrument-based variability were evaluated, supporting the method's robustness and broad utility. This method enables researchers to normalize CV data, improving inter-laboratory comparability and supporting more accurate assessment of biofilm productivity.
Citations: 0
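The calibration idea in the crystal violet entry above (relating CV absorbance linearly to dry cell weight) can be sketched as an ordinary least-squares fit. This is only an illustration, not the authors' procedure: the function names `fit_line` and `absorbance_to_biomass` and all data points are hypothetical, chosen so that the line is exact.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def absorbance_to_biomass(absorbance, slope, intercept):
    """Invert the calibration line to estimate biomass from a CV reading."""
    return (absorbance - intercept) / slope


# Hypothetical calibration points (not from the paper):
# dry cell weight in mg versus measured CV absorbance.
dcw_mg = [0.0, 0.5, 1.0, 1.5, 2.0]
cv_abs = [0.05, 0.30, 0.55, 0.80, 1.05]

slope, intercept = fit_line(dcw_mg, cv_abs)
estimate = absorbance_to_biomass(0.675, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, estimate={estimate:.3f} mg")
```

With a fitted line in hand, any later CV reading taken under the same conditions can be converted to a biomass estimate, which is what makes readings comparable across runs.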
SilhouetteScoreinR: Beyond traditional network layouts by leveraging local cohesion and nearest neighbor separation
IF 1.9
MethodsX Pub Date: 2025-09-11 DOI: 10.1016/j.mex.2025.103622
Hua-Ying Chuang, Willy Chou
Volume 15, Article 103622
Abstract: The silhouette score (SS) quantifies how well each entity fits its assigned cluster by contrasting within-cluster cohesion with separation from the nearest other cluster. Although common in other fields, SS is rarely used in bibliometrics. Using 2,252 MethodsX articles (2020–2024), we show how SS evaluates clustering quality in co-word networks and author collaborations, independent of the chosen algorithm. We provide R scripts to compute SS for explicit (geographic/known coordinates) and implicit (PCA/UMAP) layouts and introduce a two-axis visualization that plots publication count against SS. The framework highlights coherent clusters (high SS) and flags boundary or misassigned entities (low/negative SS) that standard network plots can obscure. This improves interpretability at term, cluster, and corpus levels and supports more defensible decisions about labels, membership, and follow-up analysis. Code is released for replication and reuse; sensitivity to distance metrics and data regimes is discussed to guide application across bibliometrics and related domains.
• Silhouette Scores Reveal Outliers: Silhouette scores not only validate cluster cohesion but also uncover meaningful outliers, insights often missed in traditional network layouts.
• Novel Visualization Approach: Combining silhouette scores with publication counts enables a more nuanced visualization of co-word and collaboration networks.
• Applied to Bibliometrics: This study applies silhouette analysis to 2,252 MethodsX articles, offering new tools for evaluating clustering quality in bibliometric research.
Citations: 0
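For readers unfamiliar with the silhouette score mentioned in the entry above, the standard definition is s = (b - a) / max(a, b), where a is a point's mean distance to the other members of its own cluster and b is its mean distance to the nearest other cluster. The sketch below is a plain Python illustration of that definition, not the authors' R scripts; the coordinates and labels are hypothetical, and Euclidean distance is an assumption.

```python
import math


def silhouette_scores(points, labels):
    """Return one silhouette value per point (Euclidean distance).

    Assumes at least two distinct cluster labels are present.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical 2-D layout: the last point sits next to cluster 1
# but is labelled as cluster 0, so it should get a negative score.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (4, 4.5)]
labels = [0, 0, 1, 1, 0]
scores = silhouette_scores(points, labels)
for point, label, score in zip(points, labels, scores):
    print(point, label, round(score, 3))
```

Points with a negative value are closer on average to another cluster than to their own, which is the "boundary or misassigned" signal the abstract describes.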
Protocol for data collection on the language of poor students in Malaysia
IF 1.9
MethodsX Pub Date: 2025-09-11 DOI: 10.1016/j.mex.2025.103616
Wan Athirah Adilah Wan Halim, Muhamad Fadzllah Zaini, Mazura Mastura Muhammad, Mohd Haniff Md.Tahir, Nur Farahkhanna Mohd Rusli, Nurshafawati Ahmad Sani, Habibah Ismail, Suriati Zakaria, Md. Zahril Nizam Md. Yusoff, Norliza Jamaluddin, Dahlia Janan, Darwalis Sazan
Volume 15, Article 103616
Abstract: Poverty is defined as the lack of financial resources to meet basic needs, including insufficient goods, poor physical health, inadequate food and clothing, the absence of proper housing, and the lack of employment that provides a stable income. This deprivation affects the educational development of students from impoverished backgrounds. To identify and analyze data on extremely poor students, a protocol was developed: the Protocol for Data Collection on the Language of Poor Students in Malaysia. It is divided into three main phases and nine steps to facilitate the data collection process. Collecting data from this vulnerable group requires adherence to specific ethical guidelines, which encompass both educational and human ethics. Compliance with these ethical standards ensures a clearer understanding of the data collected. The data gathered is multi-sourced, involving language (oral and written), economics (income, etc.), social factors (gender, etc.), and geography (location). All these elements contribute to achieving SDG 4 (Quality Education) and support SDG 1 (No Poverty).
Key points of the methodology:
• A protocol was designed to collect language data from extremely poor students in Malaysia.
• The data collection process adheres to specific ethical guidelines, including educational and human ethics, to ensure the respectful and responsible gathering of sensitive data from vulnerable groups.
• The collected data includes various sources such as language (oral and written), socioeconomic factors (income), social factors (gender), and geographical location, allowing for a comprehensive analysis of the factors influencing extreme poverty in educational contexts.
Citations: 0
Towards safer environments: A YOLO and MediaPipe-based human fall detection system with alert automation
IF 1.9
MethodsX Pub Date: 2025-09-11 DOI: 10.1016/j.mex.2025.103623
Virag Pradip Kothari, Priti S. Chakurkar
Volume 15, Article 103623
Abstract: Detecting human falls is essential to ensuring public safety, particularly in public areas like transit terminals. This study provides a precise and effective real-time fall detection system by utilising pose estimation and deep learning-based object detection approaches. The system detects falls in dynamic circumstances by combining MediaPipe's pose estimation for in-depth body posture analysis with the YOLOv8 model for human recognition. The paper provides a novel method that improves the system's scalability and robustness in real-world scenarios by utilising position landmarks and activity identification algorithms. To enable accurate fall detection and reduce false positives, the system also uses anomaly detection techniques. As soon as a fall is detected, the system uses Twilio to send real-time warnings, share video footage of the incident, and alert the appropriate authorities. The system is a strong option for enhancing safety in sizable public areas because of its effectiveness, scalability, and privacy-preserving features. Metrics such as accuracy, precision, recall, and F1-score are used in the study to assess the system's performance and show its usefulness. The system outperformed existing fall detection approaches, achieving 96.06% accuracy and 100% recall, confirming its robustness in real-world scenarios.
Citations: 0
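The evaluation metrics named in the fall detection entry above (accuracy, precision, recall, F1-score) all derive from the four confusion-matrix counts. The sketch below shows the standard formulas; the counts are hypothetical and are not the study's data, and the function name `classification_metrics` is illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts (not from the paper): every real fall is caught (fn = 0),
# but four non-fall clips are flagged as falls (fp = 4).
metrics = classification_metrics(tp=50, fp=4, fn=0, tn=46)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that 100% recall only says no falls were missed; any remaining error then shows up as false positives, which lowers precision and accuracy.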
LFF-POS: A linguistic fusion method to handle out-of-vocabulary words in low-resource part-of-speech tagging
IF 1.9
MethodsX Pub Date: 2025-09-10 DOI: 10.1016/j.mex.2025.103615
Muhammad Alfian, Umi Laili Yuhana, Daniel Siahaan, Harum Munazharoh, Eric Pardede
Volume 15, Article 103615
Abstract: Accurate part-of-speech (POS) tagging is needed for classroom learning evaluation in order to improve the quality of education. However, accurate POS tagging is hampered by the limited amount of training data and the high proportion of out-of-vocabulary (OOV) tokens. We present LFF-POS, a linguistic feature fusion method that overcomes these limitations for Indonesian. The procedure consists of five sequential steps: (1) tokenizing raw text; (2) extracting three complementary features; (3) merging the resulting vectors; (4) applying self-attention; and (5) training a BiLSTM sequence labeler. By combining the three features, LFF-POS improves tagging accuracy without relying on an external lexicon. Experimental results show that the combined features improve the proposed model's ability to handle OOV words and achieve higher POS tagging accuracy than the baseline and existing methods.
• OOV words cannot be recognized by the model, which reduces the accuracy of the POS tagging model.
• This study aims to overcome OOV by combining linguistic features such as orthography, morphology, and characters to improve word representation.
• LFF-POS improves POS tagging performance, especially the OOV F1 score, by approximately 14% over the baseline.
Citations: 0
Grapevine disease detection using (q,τ)-nabla calculus quantum deformation with deep learning features
IF 1.9
MethodsX Pub Date: 2025-09-10 DOI: 10.1016/j.mex.2025.103619
Ahmad Sami Al-Shamayleh, Rabha W. Ibrahim
Volume 15, Article 103619
Abstract: Today, one of the most important first steps in attaining sustainable agriculture and guaranteeing food security is the detection of plant diseases. Quantitative analysis of plant physiology is now feasible thanks to developments in computer vision and imaging technologies. Manual diagnosis, on the other hand, requires a lot of work and in-depth plant pathology knowledge. Numerous innovative methods for identifying and classifying plant diseases have been widely used. In this study, we propose a novel hybrid classification method that combines (q,τ)-nabla calculus quantum deformation-based features with deep learning feature representations to classify diseases in grapevine leaves. The methodology of this study relies on:
• Nabla calculus quantum deformation features are utilized to extract robust handcrafted features that capture local texture and structural variations associated with disease symptoms.
• Deep features are extracted using a pre-trained convolutional neural network, which captures high-level semantic information from leaf images.
The concatenated feature vectors are then fed into a machine learning classifier for final prediction. Test results on a dataset of grapevine leaf disease show that the proposed method outperforms the individual approaches in accuracy. The proposed method helps minimize financial losses and support effective plant disease management, thereby improving crop yield and contributing to food security.
Citations: 0
A hybrid approach for real-time hand tracking using fiducial markers and inertial sensors
IF 1.9
MethodsX Pub Date: 2025-09-08 DOI: 10.1016/j.mex.2025.103609
Ranjeet Bidwe, Shubhangi Deokar, Yash Parkhi, Tanisha Vyas, Nimita Jestin, Utkarsh Kumar, Satviki Budhia, Armaan Jeswani
Volume 15, Article 103609
Abstract: This paper presents a cost-effective hybrid hand-tracking technique that integrates fiducial marker detection, capacitive touch sensing, and inertial measurement for real-time gesture recognition in immersive environments. The system is implemented on lightweight hardware comprising a Raspberry Pi Zero 2 W and an ESP32, with OpenCV's ArUco marker detection enabling 3D hand pose estimation, capacitive sensors supporting finger-state recognition, and an Inertial Measurement Unit (IMU) providing orientation tracking. Optimizations such as exposure adjustment and region-of-interest processing ensure robust marker detection under variable illumination, while sensor data is transmitted via Bluetooth Low Energy (BLE) and WebSocket protocols for synchronization with external devices.
The methodological novelty of this work is highlighted as follows:
• High Accuracy Across Modalities: Achieved 3.4 mm localization accuracy, 85–91% orientation accuracy, and ~2.9 mm hand pose keypoint accuracy, with trajectory fidelity maintained at 80–81%.
• Robust Finger-State Recognition: The capacitive sensing module consistently delivered 96.1% accuracy in detecting finger states across multiple runs.
• Validated Communication Trade-offs: Latency testing established complementary roles of Wi-Fi (high throughput, ~467 msg/s) and BLE (low latency, ~50 ms, >98% reliability) for real-time applications.
By fusing multiple sensing modalities, the method delivers enhanced accuracy, responsiveness, and stability while minimizing computational overhead. The system provides a reproducible, modular, and scalable solution suitable for VR/AR interaction, assistive technology, education, and human–computer interaction.
Citations: 0
A combined modelling approach to predicting injury severity in rear-end collisions
IF 1.9
MethodsX Pub Date: 2025-09-08 DOI: 10.1016/j.mex.2025.103612
Shufeng Wang, Shixuan Jiang, Zhengli Wang, Lingyi Meng
Volume 15, Article 103612
Abstract: Rear-end collisions constitute the most prevalent category of urban road traffic accidents, resulting in severe traffic congestion, casualties, and substantial economic losses. To mitigate the impact of such accidents, this study proposes a severity prediction model that integrates Convolutional Neural Networks (CNN) and Extreme Gradient Boosting (XGBoost). The model uses the U.S. Department of Transportation's Fatality Analysis Reporting System (FARS) accident dataset, which undergoes preliminary preprocessing. Principal Component Analysis (PCA) is then applied to reduce the dimensionality of the influencing factors before they are input into the combined model for classification. CNN is used to extract features, while XGBoost is responsible for classification. Experimental results demonstrate that the combined model achieves a classification accuracy of 96.2%, with superior AUC and F1 scores compared to traditional models, indicating excellent predictive performance.
• This paper proposes a hybrid CNN-XGBoost algorithm that combines the feature extraction capability of CNN with the structured data processing and prediction ability of XGBoost, resulting in a significant performance improvement over traditional algorithms.
Citations: 0
From scores to insights: Predicting MT errors using reliable metrics and linguistic typology in Slavic languages
IF 1.9
MethodsX Pub Date: 2025-09-08 DOI: 10.1016/j.mex.2025.103613
Dasa Munkova, Lucia Benkova, Michal Munk, Ľubomír Benko, Petr Hajek
Volume 15, Article 103613
Abstract: Machine Translation (MT) evaluation plays a crucial role in advancing systems translating into morphologically rich, low-resource languages such as Slovak. Existing automatic evaluation methods typically offer a single quality score, lacking insight into specific error types. This paper proposes a novel, linguistically informed methodology that predicts the probability of MT error categories by integrating manual annotation with automatic evaluation metrics. The method builds on a modified MQM framework adapted for Slovak and employs a dataset of English-to-Slovak translations, combining outputs from statistical and neural MT systems with human reference translations. Manual annotations identified five linguistically motivated error categories. The reliability of 68 automatic metrics was assessed using Cronbach's alpha, correlation coefficients, the coefficient of determination (R²), and entropy. Bootstrapped logistic regression models were then developed to predict error occurrence probabilities. The proposed methodology improves the explainability and reliability of automatic MT evaluation by bridging the gap between holistic scoring and detailed error categorization. It significantly reduces the human effort required for quality assessment while maintaining a high degree of linguistic relevance, particularly for complex target languages like Slovak.
• Predicts probabilities of specific MT error categories
• Integrates linguistic expertise with statistical reliability analysis
• Reduces human effort in MT evaluation while preserving linguistic precision
Citations: 0
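Cronbach's alpha, one of the reliability measures named in the MT evaluation entry above, is defined as alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below illustrates that formula only. Treating each automatic metric as an "item" scored over the same translated segments is an assumption about how it might be applied, and the scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` is a list of equal-length score lists, one per item
    (here, hypothetically, one per automatic MT metric).
    Uses sample variance (n - 1 denominator) throughout.
    """
    k = len(items)
    sum_item_variances = sum(variance(scores) for scores in items)
    # Total score per case = sum of that case's scores across all items
    totals = [sum(case_scores) for case_scores in zip(*items)]
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical scores from three metrics over four translated segments.
metric_scores = [
    [0.20, 0.40, 0.60, 0.80],
    [0.25, 0.35, 0.65, 0.75],
    [0.10, 0.50, 0.50, 0.90],
]
alpha = cronbach_alpha(metric_scores)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Values close to 1 indicate that the items rank the cases consistently; identical items give exactly 1.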
A systematical procedure to extracting legal entities from Indonesian judicial decisions
IF 1.9
MethodsX Pub Date: 2025-09-06 DOI: 10.1016/j.mex.2025.103610
Eka Qadri Nuranti, Naili Suri Intizhami, Evi Yulianti, A. Muh. Iqbal Latief, Osama Iyad Al Ghozy
Volume 15, Article 103610
Abstract: This article presents a systematic method of extracting legal entities from Indonesian judicial decisions with a well-structured named entity recognition (NER) approach. The procedure was implemented by gathering and annotating court decisions for theft cases at three court levels: first instance (2478 files), appeal (147 files), and cassation (62 files), amounting to 2687 annotated files. The data were harvested from the official website of the Supreme Court of the Republic of Indonesia using automated web scraping, followed by manual filtering for relevance and completeness.
Manual annotation was performed with the Label Studio platform by three independent annotators. Annotation consistency was assessed using Fleiss' Kappa, yielding an average agreement score of 0.705 across all levels, indicating good inter-annotator reliability. The method uses a hierarchical structure and a BIO tagging scheme to tag more than 50 types of legal entities, including defendants, judges, legal articles, charges, and verdict decisions.
This approach is suitable for text-processing tasks such as legal information extraction, classification, and legal analysis. From a legal perspective, this process will improve legal transparency and research on Indonesian judicial data.
• Structured pipeline for gathering, filtering, and annotating Indonesian court judgments based on legal metadata and web scraping.
• Manual annotation of 2687 court documents with annotation rules and inter-annotator agreement using Fleiss' Kappa.
• Token-level translation and BIO tagging for more than 50 legal entities, enabling downstream NLP tasks such as named entity recognition.
Citations: 0
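Fleiss' Kappa, the agreement statistic reported in the legal entity extraction entry above, compares the observed agreement among a fixed number of raters with the agreement expected by chance: kappa = (P_bar - P_e) / (1 - P_e). The sketch below implements the standard formula. The count table is hypothetical (three annotators assigning each of five tokens to one of three tags) and does not reproduce the study's 0.705.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] = number of raters who assigned item i to category j.
    Every row must sum to the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement for each item, then averaged over items
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(per_item) / n_items

    # Chance agreement from the overall share of each category
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(share * share for share in shares)

    return (p_bar - p_e) / (1 - p_e)


# Hypothetical table: 5 tokens, 3 annotators, 3 tags (for example B, I, O).
ratings = [
    [3, 0, 0],  # all three annotators chose the first tag
    [2, 1, 0],  # one annotator disagreed
    [0, 3, 0],
    [0, 1, 2],
    [0, 0, 3],
]
kappa = fleiss_kappa(ratings)
print(f"Fleiss' kappa: {kappa:.3f}")
```

Full agreement on every item gives a kappa of 1, and values near 0 mean the annotators agree no more often than chance would predict.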