Rule Based Conversion of LaTeX Math Equations into Content MathML (CMML)
Authors: Sharaf Hussain, Samita Bai, S. Khoja
Journal: Journal of Information Science and Engineering
Published: 2020-09-01
DOI: https://doi.org/10.6688/JISE.202009_36(5).0006
Citations: 1
Abstract
This paper discusses the formation of math grammar rules for LaTeX math equations. These rules are used to generate an Abstract Syntax Tree (AST), which captures the structural information of mathematical expressions given in LaTeX format. The AST is then used to generate an XML structure that makes the mathematical expressions machine-readable in heterogeneous environments. A rule-based algorithm is also proposed that converts LaTeX math expressions into Content MathML (CMML), which adds semantic enrichment to web documents. The rules for writing LaTeX math equations are formulated and implemented as the LaTeX Math Grammar (LMG), which is used to generate the AST. The AST is then converted into an XML structure from which the CMML encoding is generated. The conversion algorithm is first tested on 20 equations used in an NTCIR-12 math competition, and then on the NTCIR-12 Wikipedia-MathIR and arXiv data sets. The results show that our algorithm converts a broader range of complex LaTeX equations into CMML than existing converters and that it runs faster than contemporary systems.
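The pipeline described in the abstract (grammar rules, then an AST, then an XML structure, then CMML) can be illustrated with a minimal sketch. This is not the authors' LMG or their conversion algorithm: the toy grammar below, the `Parser` class, and the helper names `tokenize`, `to_cmml`, and `latex_to_cmml` are all assumptions made for illustration, and the grammar covers only numbers, single-letter identifiers, `+`, `-`, `*`, `\cdot`, `\times`, `^`, `\frac`, and `\sqrt`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical toy lexer: commands, numbers, single letters, operators, braces.
TOKEN = re.compile(r"\s*(\\[A-Za-z]+|\d+(?:\.\d+)?|[A-Za-z]|[-+*^{}()])")

def tokenize(src):
    tokens, pos = [], 0
    src = src.strip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise ValueError(f"unexpected input at {pos}: {src[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    """Recursive-descent parser producing a tuple-based AST (illustrative only)."""

    def __init__(self, tokens):
        self.toks, self.i = tokens, 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def expr(self):  # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = "plus" if self.take() == "+" else "minus"
            node = (op, node, self.term())
        return node

    def term(self):  # term := power (('*' | '\cdot' | '\times') power)*
        node = self.power()
        while self.peek() in ("*", "\\cdot", "\\times"):
            self.take()
            node = ("times", node, self.power())
        return node

    def power(self):  # power := atom ('^' atom)?
        base = self.atom()
        if self.peek() == "^":
            self.take()
            return ("power", base, self.atom())
        return base

    def atom(self):
        tok = self.take()
        if tok in ("{", "("):
            node = self.expr()
            self.take("}" if tok == "{" else ")")
            return node
        if tok == "\\frac":  # \frac{num}{den} maps to CMML <divide/>
            return ("divide", self.atom(), self.atom())
        if tok == "\\sqrt":  # \sqrt{x} maps to CMML <root/>
            return ("root", self.atom())
        if tok[0].isdigit():
            return ("cn", tok)
        if tok.isalpha():
            return ("ci", tok)
        raise ValueError(f"unexpected token {tok!r}")

def to_cmml(node):
    """Turn an AST node into an XML element using Content MathML element names."""
    kind = node[0]
    if kind in ("cn", "ci"):
        leaf = ET.Element(kind)
        leaf.text = node[1]
        return leaf
    apply_el = ET.Element("apply")
    ET.SubElement(apply_el, kind)  # operator element, e.g. <plus/>
    for child in node[1:]:
        apply_el.append(to_cmml(child))
    return apply_el

def latex_to_cmml(src):
    parser = Parser(tokenize(src))
    ast = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"unparsed trailing token {parser.peek()!r}")
    math = ET.Element("math", {"xmlns": "http://www.w3.org/1998/Math/MathML"})
    math.append(to_cmml(ast))
    return ET.tostring(math, encoding="unicode")

if __name__ == "__main__":
    print(latex_to_cmml(r"\frac{a+1}{b} - x^2"))
```

Keeping the AST as an intermediate form, as the abstract describes, separates parsing from serialization: the same tree could be emitted in another encoding without touching the grammar. Constructs outside this toy grammar, such as implicit multiplication (`2x`) or multi-letter commands like `\alpha`, raise a `ValueError` in this sketch.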
About the journal:
The Journal of Information Science and Engineering is dedicated to the dissemination of information on computer science, computer engineering, and computer systems. The journal encourages articles on original research in the areas of computer hardware, software, man-machine interface, theory, and applications; tutorial papers in the above-mentioned areas; and state-of-the-art papers on various aspects of computer systems and applications.