{"title":"A domain-specific language for describing machine learning datasets","authors":"Joan Giner-Miguelez , Abel Gómez , Jordi Cabot","doi":"10.1016/j.cola.2023.101209","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101209","url":null,"abstract":"<div><p>Datasets are essential for training and evaluating machine learning (ML) models. However, they are also at the root of many undesirable model behaviors, such as biased predictions. To address this issue, the machine learning community is proposing a <em>data-centric cultural shift</em>, where data issues are given the attention they deserve and more standard practices for gathering and describing datasets are discussed and established.</p><p>So far, these proposals are mostly high-level guidelines described in natural language and, as such, they are difficult to formalize and apply to particular datasets. In this sense, and inspired by these proposals, we define a new domain-specific language (DSL) to precisely describe machine learning datasets in terms of their structure, provenance, and social concerns. We believe this DSL will facilitate any ML initiative to leverage and benefit from this data-centric shift in ML (e.g., selecting the most appropriate dataset for a new project or better replicating other ML results). 
The DSL is implemented as a Visual Studio Code plugin, and it has been published under an open-source license.</p></div>","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"76 ","pages":"Article 101209"},"PeriodicalIF":2.2,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49891921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"E-APK: Energy pattern detection in decompiled android applications","authors":"Nelson Gregório , João Bispo , João Paulo Fernandes , Sérgio Queiroz de Medeiros","doi":"10.1016/j.cola.2023.101220","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101220","url":null,"abstract":"<div><p>Energy efficiency is a non-functional requirement that developers must consider, particularly when building software for battery-operated devices like mobile ones: a long-lasting battery is an essential requirement for an enjoyable user experience.</p><p>In previous studies, it has been shown that many mobile applications include inefficiencies that cause the battery to be drained faster than necessary. Some of these inefficiencies result from software patterns that have been catalogued, and for which more energy-efficient alternatives are also known.</p><p>The existing catalogues, however, assume as a fundamental requirement that one has access to the source code of an application in order to be able to analyse it. This requirement makes independent energy analysis challenging, or even impossible, e.g. for a mobile user or, most significantly, an App Store trying to provide information on how efficient an application being submitted for publication is.</p><p>We study the viability of looking for known energy patterns in applications by decompiling them and analysing the resulting code. For this, we decompiled and analysed 420 open-source applications by extending an existing tool, which is now capable of transparently decompiling and analysing Android applications. With the collected data, we performed a comparative study of the presence of four energy patterns between the source code and the decompiled code.</p><p>We performed two types of analysis: (i) comparing the total number of energy pattern detections; (ii) comparing the similarity between energy pattern detections. 
When comparing the total number of detections in source code against decompiled code, we found that 79.29% of the applications reported the same number of energy pattern detections.</p><p>To test the similarity between source code and APKs, we calculated, for each application, a similarity score based on our four implemented detectors. Of all applications, 35.76% achieved a perfect similarity score of 4, and 89.40% got a score of 3 or more out of 4. Furthermore, only two applications got a score of 0.</p><p>When viewed in tandem, the results of the two analyses we performed point in a promising direction. They provide initial evidence that static analysis techniques, typically used in source code, can be a viable method to inspect APKs when access to source code is restricted, and further research in this area is worthwhile.</p></div>","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"76 ","pages":"Article 101220"},"PeriodicalIF":2.2,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49891923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Composition operators for modeling languages: A literature review","authors":"Jérôme Pfeiffer , Bernhard Rumpe , David Schmalzing , Andreas Wortmann","doi":"10.1016/j.cola.2023.101226","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101226","url":null,"abstract":"<div><p>Efficiently engineering modeling languages demands their reuse through composition. Research in language engineering has produced many different operators to reuse and compose languages and language parts. Unfortunately, these operate on different dimensions of languages, produce diverse results, and are distributed across various technological spaces and publications, which hampers understanding the state of language composition for researchers and practitioners. To mitigate this, we report the results of a literature review on modeling language composition operators. In this review, we identify operators, their properties, and supported language dimensions, and relate them to categories of language composition. Through this, our survey draws a new, detailed map of modeling language composition operators that can guide researchers in software language engineering in identifying uncharted territory and practitioners in employing the most suitable composition operators.</p></div>","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"76 ","pages":"Article 101226"},"PeriodicalIF":2.2,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49891924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimum modulus visualization of algebraic fractals","authors":"Severino F. Galán","doi":"10.1016/j.cola.2023.101222","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101222","url":null,"abstract":"<div><p>Fractals are a family of shapes formed by irregular and fragmented patterns. They can be classified into two main groups: geometric and algebraic. Whereas the former are characterized by a fixed geometric replacement rule, the latter are defined by a recurrence function in the complex plane. The classical method for visualizing algebraic fractals considers the sequence of complex numbers originated from each point in the complex plane. Thus, each original point is colored depending on whether its generated sequence escapes to infinity. The present work introduces a novel visualization method for algebraic fractals. This method colors each original point by taking into account the complex number with minimum modulus within its generated sequence. The advantages of the novel method are twofold: on the one hand, it preserves the fractal view that the classical method offers of the escape set boundary and, on the other hand, it additionally provides interesting visual details of the prisoner set (the complement of the escape set). 
The novel method is comparatively evaluated with other classical and non-classical visualization methods of fractals, giving rise to aesthetic views of prisoner sets.</p></div>","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"76 ","pages":"Article 101222"},"PeriodicalIF":2.2,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49891920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
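The minimum-modulus idea described in the abstract above can be illustrated with a minimal sketch. It assumes the quadratic recurrence z -> z*z + c started at z = 0 (the Mandelbrot setting), which the abstract does not specify; the function names `min_modulus_orbit` and `render`, the bailout radius of 2, the iteration limit, and the character ramp are all hypothetical choices made for illustration, not the paper's implementation. Escaping points are left blank here, whereas the published method also preserves the classical view of the escape set boundary.

```python
def min_modulus_orbit(c: complex, max_iter: int = 100, bailout: float = 2.0):
    """Iterate z -> z*z + c from z = 0 and track the smallest |z| seen.

    Returns (escaped, min_modulus). The starting value z = 0 is not counted,
    otherwise the minimum would always be 0. Assumed recurrence, not the paper's.
    """
    z = 0j
    smallest = float("inf")
    for _ in range(max_iter):
        z = z * z + c
        modulus = abs(z)
        smallest = min(smallest, modulus)
        if modulus > bailout:
            # The sequence is treated as escaping to infinity.
            return True, smallest
    # No escape within max_iter: treat the point as part of the prisoner set.
    return False, smallest


def render(width: int = 60, height: int = 24, max_iter: int = 50) -> str:
    """Draw a text picture: prisoner points are shaded by their minimum modulus."""
    shades = " .:-=+*#%@"  # hypothetical ramp: darker character = larger minimum
    rows = []
    for j in range(height):
        im = 1.2 - 2.4 * j / (height - 1)
        row = []
        for i in range(width):
            re = -2.0 + 2.6 * i / (width - 1)
            escaped, m = min_modulus_orbit(complex(re, im), max_iter)
            if escaped:
                row.append(" ")
            else:
                # Non-escaping orbits stay within the bailout radius, so m <= 2.
                idx = min(int(m / 2.0 * len(shades)), len(shades) - 1)
                row.append(shades[idx])
        rows.append("".join(row))
    return "\n".join(rows)


if __name__ == "__main__":
    print(render())
```

Shading by the minimum modulus gives non-escaping points distinct values, which is the kind of interior detail the abstract attributes to the method; a plain escape test would color them all the same.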
{"title":"Introduction to the special issue on foundations and practice of visual modeling (FPVM)","authors":"Amleto Di Salle , Ludovico Iovino , Alfonso Pierantonio , Juha-Pekka Tolvanen","doi":"10.1016/j.cola.2023.101227","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101227","url":null,"abstract":"","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"76 ","pages":"Article 101227"},"PeriodicalIF":2.2,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49891922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dandelion: A scalable, cloud-based graphical language workbench for industrial low-code development","authors":"Francisco Martínez-Lasaca , Pablo Díez , Esther Guerra , Juan de Lara","doi":"10.1016/j.cola.2023.101217","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101217","url":null,"abstract":"<div><p>There is an increasing demand nowadays for low-code development platforms (LCDPs). As they rely heavily on graphical languages rather than writing code, these platforms enable citizen developers to participate in software development. However, creating new LCDPs is very costly, since it requires building support for graphical modelling and its integration with services like model validation, recommendation systems, or code generation. While Model-driven Engineering (MDE) has developed technologies to create these components, most of them are not cloud-based, as required by LCDPs. In particular, a cloud-based graphical workbench capable of providing the scalability required by industrial applications and adequately supporting technological heterogeneity is currently missing.</p><p>To fill this gap we introduce <em>Dandelion</em>, a cloud-based graphical language workbench for LCDPs built following an MDE approach. The tool handles model heterogeneity by using a harmonising meta-model to uniformly represent models from diverse technologies, and supports a customisable level of conformance between models and meta-models. Scalability is addressed by persisting models in a distributed, highly flexible database whose infrastructure is designed to conform to the harmonising meta-model, thus favouring model retrieval. 
Additionally, a customisable scalability component is introduced for lazy model loading.</p><p>This paper describes the concepts and principles behind the tool design and reports on an evaluation on large synthetic process mining models, and on domain-specific languages and large industrial models used within the UGROUND company, showing promising results.</p></div>","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"76 ","pages":"Article 101217"},"PeriodicalIF":2.2,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49891925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new versioning approach for collaboration in blended modeling","authors":"Joeri Exelmans , Jakob Pietron , Alexander Raschke , Hans Vangheluwe , Matthias Tichy","doi":"10.1016/j.cola.2023.101221","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101221","url":null,"abstract":"<div><p>The complexity of modern software-intensive systems and the need for flexibility in their development process forces developers to collaborate using the most appropriate language(s) for each given task, view and component. Blended modeling is the ability to edit a model through multiple concrete syntaxes simultaneously.</p><p>To support collaborative blended modeling, we present a variation of operation-based versioning that allows bi-directional propagation of changes between concrete and abstract syntaxes. This allows us to support layout continuity between different versions, and to handle information that is not (yet) available (e.g., layout information) when rendering changes from abstract to concrete syntax. Finally, our approach does not enforce immediate conflict resolution. Rather, different merge options and their consequences can be presented to the users, who may choose to only perform partial conflict resolution, deferring final resolution till later.</p><p>In this article, we present the general approach and describe salient parts of an implementation.</p></div>","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"76 ","pages":"Article 101221"},"PeriodicalIF":2.2,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49891918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards smart contract distributed directory based on the uniform description language","authors":"Wafa Ben Slama Souei , Chiraz El Hog , Raoudha Ben Djemaa , Layth Sliman , Ikram Amous Ben Amor","doi":"10.1016/j.cola.2023.101225","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101225","url":null,"abstract":"<div><p>A Smart Contract (SC) is a piece of code executed on the blockchain to automatically trigger transactions upon the occurrence of predefined events. Due to the intrinsic features regarding traceability and data immutability, many companies have started using blockchain Smart Contracts to perform collaborative processes. Despite their promising features, there is a lack of Smart Contract management platforms that enable blockchain participants to describe and publish their smart contracts or “search and match” already deployed ones. In this paper, a new Distributed Smart Directory (DSD) where providers can publish their SC descriptions is proposed. The SC descriptions include metadata covering functional and non-functional properties of the SC. Hence, users can find SCs according to their non-functional preferences, needs, and constraints. The proposed DSD is an extension of the ebXML directory. It was fully implemented on-chain. The SC descriptions are generated based on the Uniform Description Language for SC (UDL-SC). The proposed solution is implemented on the Ethereum blockchain. 
It was then tested and evaluated.</p></div>","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"77 ","pages":"Article 101225"},"PeriodicalIF":2.2,"publicationDate":"2023-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48820061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine learning with word embedding for detecting web-services anti-patterns","authors":"Lov Kumar , Sahithi Tummalapalli , Sonika Chandrakant Rathi , Lalita Bhanu Murthy , Aneesh Krishna , Sanjay Misra","doi":"10.1016/j.cola.2023.101207","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101207","url":null,"abstract":"<div><p>A software design anti-pattern is a common response to a recurring problem that is ineffective and has a high risk of failure. Early prediction of these anti-patterns helps reduce the effort, resources, and cost of the design process. In earlier research, static code or Web Services Description Language (WSDL) metrics were used to develop anti-pattern prediction models. These source code metrics are calculated at either the file level or the system level. So, the values of these metrics are frequently dependent on assumptions that are not defined or standardized and might vary depending on the tools available. This study aims to develop a machine learning-based anti-pattern prediction model using natural language processing techniques for representing the WSDL file as an input. In this research, four word embedding methods were used to process the WSDL file. The processed outputs are used as input to the models trained using thirty-three classifier techniques. This study also uses eight feature selection techniques to remove ineffective features and five data sampling techniques to handle the class-imbalanced nature of the datasets. The results indicate that the models developed using text metrics perform better than those using static code or WSDL metrics. 
Additionally, the results suggest that selecting features using feature selection and balancing data using sampling techniques helps improve the models’ performance.</p></div>","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"75 ","pages":"Article 101207"},"PeriodicalIF":2.2,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50187251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A methodology for refactoring ORM-based monolithic web applications into microservices","authors":"Francisco Freitas , André Ferreira , Jácome Cunha","doi":"10.1016/j.cola.2023.101205","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101205","url":null,"abstract":"<div><p>In the last few years we have seen a drastic change in the way software is developed. Large-scale software projects are being assembled by a flexible composition of many (small) components possibly written in different programming languages and deployed anywhere in the cloud: the so-called microservices-based applications.</p><p>The dramatic growth in popularity of microservices-based applications has pushed several companies to apply major refactorings to their software systems. However, this is a challenging task that may take several months or even years.</p><p>We propose a methodology to automatically evolve monolithic web applications that use object-relational mapping into microservices-based ones. Our methodology receives the source code and a microservices proposal and refactors the original code to create each microservice. Our methodology creates an API for each method call to classes that are in other services. The database entities are also refactored to be included in the corresponding service. The evaluation performed on 120 applications shows that our tool can successfully refactor about 72% of them. 
The execution of the unit tests in both versions of the applications yields exactly the same results.</p></div>","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"75 ","pages":"Article 101205"},"PeriodicalIF":2.2,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50187257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}