{"title":"Enhancing neural network classification using fractional-order activation functions","authors":"Meshach Kumar , Utkal Mehta , Giansalvo Cirrincione","doi":"10.1016/j.aiopen.2023.12.003","DOIUrl":"https://doi.org/10.1016/j.aiopen.2023.12.003","url":null,"abstract":"<div><p>In this paper, a series of novel activation functions is presented, derived using the improved Riemann–Liouville conformable fractional derivative (<span><math><msup><mrow></mrow><mrow><mi>R</mi><mi>L</mi></mrow></msup></math></span>CFD). This study investigates the use of fractional activation functions in Multilayer Perceptron (MLP) models and their impact on the performance of classification tasks, verified using the IRIS, MNIST and FMNIST datasets. Fractional activation functions introduce a non-integer power exponent, allowing complex patterns and representations to be captured more effectively. The experiment compares MLP models employing fractional activation functions, such as fractional sigmoid, hyperbolic tangent and rectified linear units, against traditional models using standard activation functions, their improved versions and existing fractional functions. The numerical studies confirm the theoretical observations made in the paper. The findings highlight the potential of the new functions as a valuable tool for deep learning classification. 
The study suggests incorporating fractional activation functions in MLP architectures can lead to superior accuracy and robustness.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 ","pages":"Pages 10-22"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S266665102300030X/pdfft?md5=2be839945dd6c63499655950e9809539&pid=1-s2.0-S266665102300030X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139090006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI Open | Pub Date: 2024-01-01 | DOI: 10.1016/j.aiopen.2024.01.005
Long Ding , Chunping Ouyang , Yongbin Liu , Zhihua Tao , Yaping Wan , Zheng Gao
{"title":"Few-shot Named Entity Recognition via encoder and class intervention","authors":"Long Ding , Chunping Ouyang , Yongbin Liu , Zhihua Tao , Yaping Wan , Zheng Gao","doi":"10.1016/j.aiopen.2024.01.005","DOIUrl":"10.1016/j.aiopen.2024.01.005","url":null,"abstract":"<div><p>In the real world, the large and complex nature of text increases the difficulty of tagging and results in a limited amount of tagged text. Few-shot Named Entity Recognition (NER) uses only a small amount of annotated data to identify and classify entities, which avoids this problem. Few-shot learning methods usually use prior knowledge to achieve good results. However, prior knowledge may become a confounding factor affecting the relation between sample features and real labels. This confounding introduces bias and makes it difficult to capture classes accurately. To solve this problem, a new model, Few-shot Named Entity Recognition via Encoder and Class Intervention, is proposed based on causality. We show that the model can be steered to perform interventions on the encoder and the class, reducing the interference of confounding factors. Specifically, cross-sample attention perturbation is used in the encoder layer, while a practical causal relation between feature and classification label is developed in the class layer. This approach is an attempt to apply causal methodology to the few-shot NER task, and it improves the discrimination ability of the NER classifier. 
Experimental results demonstrate that our model outperforms baseline models in both 5-way and 10-way settings on two NER datasets.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 ","pages":"Pages 39-45"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651024000068/pdfft?md5=737ba44f6bb38a965193bee8501a6eb7&pid=1-s2.0-S2666651024000068-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139884960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI Open | Pub Date: 2024-01-01 | DOI: 10.1016/j.aiopen.2024.01.004
Yuan Yao , Ao Zhang , Zhengyan Zhang , Zhiyuan Liu , Tat-Seng Chua , Maosong Sun
{"title":"CPT: Colorful Prompt Tuning for pre-trained vision-language models","authors":"Yuan Yao , Ao Zhang , Zhengyan Zhang , Zhiyuan Liu , Tat-Seng Chua , Maosong Sun","doi":"10.1016/j.aiopen.2024.01.004","DOIUrl":"10.1016/j.aiopen.2024.01.004","url":null,"abstract":"<div><p>Vision-Language Pre-training (VLP) models have shown promising capabilities in grounding natural language in image data, facilitating a broad range of cross-modal tasks. However, we note that there exists a significant gap between the objective forms of model pre-training and fine-tuning, resulting in a need for large amounts of labeled data to stimulate the visual grounding capability of VLP models for downstream tasks. To address the challenge, we present <strong>C</strong>olor-based <strong>P</strong>rompt <strong>T</strong>uning (CPT), a novel paradigm for tuning VLP models, which reformulates visual grounding into a fill-in-the-blank problem with color-based co-referential markers in image and text, maximally mitigating the gap. In this way, CPT enables strong few-shot and even zero-shot visual grounding capabilities of VLP models. Comprehensive experimental results show that CPT achieves state-of-the-art performance on zero/few-shot visual grounding (e.g., 75.1 zero-shot accuracy in RefCOCO evaluation), outperforming fine-tuned and other prompt-tuned models by a large margin. Moreover, CPT can also be easily extended to achieve promising zero/few-shot performance on other vision-language tasks, such as visual relation detection, visual commonsense reasoning and visual question answering. 
We make the data and codes publicly available at <span>https://github.com/thunlp/CPT</span>.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 ","pages":"Pages 30-38"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651024000056/pdfft?md5=a0b3ea3b64a989f20cbd8db1f84428c6&pid=1-s2.0-S2666651024000056-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139686627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI Open | Pub Date: 2024-01-01 | DOI: 10.1016/j.aiopen.2024.01.003
Martin G. Skjæveland, Krisztian Balog, Nolwenn Bernard, Weronika Łajewska, Trond Linjordet
{"title":"An ecosystem for personal knowledge graphs: A survey and research roadmap","authors":"Martin G. Skjæveland, Krisztian Balog, Nolwenn Bernard, Weronika Łajewska, Trond Linjordet","doi":"10.1016/j.aiopen.2024.01.003","DOIUrl":"https://doi.org/10.1016/j.aiopen.2024.01.003","url":null,"abstract":"<div><p>This paper presents an ecosystem for personal knowledge graphs (PKGs), commonly defined as resources of structured information about entities related to an individual, their attributes, and the relations between them. PKGs are a key enabler of secure and sophisticated personal data management and personalized services. However, there are challenges that need to be addressed before PKGs can achieve widespread adoption. One of the fundamental challenges is the very definition of what constitutes a PKG, as there are multiple interpretations of the term. We propose our own definition of a PKG, emphasizing the aspects of (1) data ownership by a single individual and (2) the delivery of personalized services as the primary purpose. We further argue that a holistic view of PKGs is needed to unlock their full potential, and propose a unified framework for PKGs, where the PKG is a part of a larger ecosystem with clear interfaces towards data services and data sources. A comprehensive survey and synthesis of existing work is conducted, with a mapping of the surveyed work into the proposed unified ecosystem. 
Finally, we identify open challenges and research opportunities for the ecosystem as a whole, as well as for the specific aspects of PKGs, which include population, representation and management, and utilization.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 ","pages":"Pages 55-69"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651024000044/pdfft?md5=a12ec1f170570bcf4e71b8ae5c11e512&pid=1-s2.0-S2666651024000044-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139986315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generating graph perturbations to enhance the generalization of GNNs","authors":"Sofiane Ennadir , Giannis Nikolentzos , Michalis Vazirgiannis , Henrik Boström","doi":"10.1016/j.aiopen.2024.10.001","DOIUrl":"10.1016/j.aiopen.2024.10.001","url":null,"abstract":"<div><div>Graph neural networks (GNNs) have become the standard approach for performing machine learning on graphs. Such models need large amounts of training data; however, in several graph classification and regression tasks, only limited training data is available. Unfortunately, due to the complex nature of graphs, common augmentation strategies employed in other settings, such as computer vision, do not apply to graphs. This work aims to improve the generalization ability of GNNs by increasing the size of the training set of a given problem. The new samples are generated using an iterative contrastive learning procedure that augments the dataset during training, in a task-relevant manner, by manipulating the graph topology. The proposed approach is general, assumes no knowledge about the underlying architecture, and can thus be applied to any GNN. We provide a theoretical analysis regarding the equivalence of the proposed approach to a regularization technique. We demonstrate instances of our framework on popular GNNs, and evaluate them on several real-world benchmark graph classification datasets. 
The experimental results show that the proposed approach, in several cases, enhances the generalization of the underlying prediction models, reaching state-of-the-art performance on some datasets.</div></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 ","pages":"Pages 216-223"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142704286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI Open | Pub Date: 2024-01-01 | DOI: 10.1016/j.aiopen.2023.08.012
{"title":"GPT understands, too","authors":"","doi":"10.1016/j.aiopen.2023.08.012","DOIUrl":"10.1016/j.aiopen.2023.08.012","url":null,"abstract":"<div><div>Prompting a pretrained language model with natural language patterns has been proved effective for natural language understanding (NLU). However, our preliminary study reveals that manual discrete prompts often lead to unstable performance—<em>e.g.</em>, changing a single word in the prompt might result in substantial performance drop. We propose a novel method P-Tuning that employs trainable continuous prompt embeddings in concatenation with discrete prompts. Empirically, P-Tuning not only stabilizes training by minimizing the gap between various discrete prompts, but also improves performance by a sizeable margin on a wide range of NLU tasks including LAMA and SuperGLUE. P-Tuning is generally effective for both frozen and tuned language models, under both the fully-supervised and few-shot settings.</div></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 ","pages":"Pages 208-215"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84420866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI Open | Pub Date: 2024-01-01 | DOI: 10.1016/j.aiopen.2024.08.003
Zhonghui Shao , Jing Zhang , Haoyang Li , Xinmei Huang , Chao Zhou , Yuanchun Wang , Jibing Gong , Cuiping Li , Hong Chen
{"title":"Authorship style transfer with inverse transfer data augmentation","authors":"Zhonghui Shao , Jing Zhang , Haoyang Li , Xinmei Huang , Chao Zhou , Yuanchun Wang , Jibing Gong , Cuiping Li , Hong Chen","doi":"10.1016/j.aiopen.2024.08.003","DOIUrl":"10.1016/j.aiopen.2024.08.003","url":null,"abstract":"<div><p>Authorship style transfer aims to modify the style of neutral text to match the unique speaking or writing style of a particular individual. While Large Language Models (LLMs) present promising solutions, their effectiveness is limited by the small number of in-context learning demonstrations, particularly for authorship styles not frequently seen during pre-training. In response, this paper proposes an inverse transfer data augmentation (<span>ITDA</span>) method, leveraging LLMs to create (neutral text, stylized text) pairs. This method involves removing the existing styles from stylized texts, a process made more feasible due to the prevalence of neutral texts in pre-training. We use this augmented dataset to train a compact model that is efficient for deployment and adept at replicating the targeted style. Our experimental results, conducted across four datasets with distinct authorship styles, establish the effectiveness of <span>ITDA</span> over traditional style transfer methods and forward transfer using GPT-3.5. 
For further research and application, our dataset and code are openly accessible at <span><span>https://github.com/Vicky-Shao/ITDA</span></span>.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 ","pages":"Pages 94-103"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651024000135/pdfft?md5=3a5bc730b200d5992d33b797c1afbf4f&pid=1-s2.0-S2666651024000135-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142075773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI Open | Pub Date: 2024-01-01 | DOI: 10.1016/j.aiopen.2024.09.002
Jinqi Lai , Wensheng Gan , Jiayang Wu , Zhenlian Qi , Philip S. Yu
{"title":"Large language models in law: A survey","authors":"Jinqi Lai , Wensheng Gan , Jiayang Wu , Zhenlian Qi , Philip S. Yu","doi":"10.1016/j.aiopen.2024.09.002","DOIUrl":"10.1016/j.aiopen.2024.09.002","url":null,"abstract":"<div><div>The advent of artificial intelligence (AI) has significantly impacted the traditional judicial industry. Moreover, recently, with the development of AI-generated content (AIGC), AI and law have found applications in various domains, including image recognition, automatic text generation, and interactive chat. With the rapid emergence and growing popularity of large models, it is evident that AI will drive transformation in the traditional judicial industry. However, the application of legal large language models (LLMs) is still in its nascent stage. Several challenges need to be addressed. In this paper, we aim to provide a comprehensive survey of legal LLMs. We not only conduct an extensive survey of LLMs but also expose their applications in the judicial system. We first provide an overview of AI technologies in the legal field and showcase the recent research in LLMs. Then, we discuss the practical implementations presented by legal LLMs, such as providing legal advice to users and assisting judges during trials. In addition, we explore the limitations of legal LLMs, including data, algorithms, and judicial practice. 
Finally, we summarize practical recommendations and propose future development directions to address these challenges.</div></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 ","pages":"Pages 181-196"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142539171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI Open | Pub Date: 2024-01-01 | DOI: 10.1016/j.aiopen.2024.08.005
Qisai Liu , Xian Yeow Lee , Soumik Sarkar
{"title":"A study of natural robustness of deep reinforcement learning algorithms towards adversarial perturbations","authors":"Qisai Liu , Xian Yeow Lee , Soumik Sarkar","doi":"10.1016/j.aiopen.2024.08.005","DOIUrl":"10.1016/j.aiopen.2024.08.005","url":null,"abstract":"<div><p>Deep reinforcement learning (DRL) has been shown to have numerous potential applications in the real world. However, DRL algorithms are still extremely sensitive to noise and adversarial perturbations, hence inhibiting the deployment of RL in many real-life applications. Analyzing the robustness of DRL algorithms to adversarial attacks is an important prerequisite to enabling the widespread adoption of DRL algorithms. Common perturbations on DRL frameworks during test time include perturbations to the observation and action channels. Compared with observation channel attacks, action channel attacks are less studied; hence, few studies in the DRL literature compare the effectiveness of these attacks. 
In this work, we examined the effectiveness of these two paradigms of attacks on common DRL algorithms and studied the natural robustness of DRL algorithms towards various adversarial attacks in hopes of gaining insights into the individual response of each type of algorithm under different attack conditions.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 ","pages":"Pages 126-141"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651024000159/pdfft?md5=a50110d80c809055a00e87466dc649b1&pid=1-s2.0-S2666651024000159-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142163035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI Open | Pub Date: 2024-01-01 | DOI: 10.1016/j.aiopen.2024.09.001
Kui Qian , Beth Friedman , Jun Takatoh , Alexander Groisman , Fan Wang , David Kleinfeld , Yoav Freund
{"title":"CellBoost: A pipeline for machine assisted annotation in neuroanatomy","authors":"Kui Qian , Beth Friedman , Jun Takatoh , Alexander Groisman , Fan Wang , David Kleinfeld , Yoav Freund","doi":"10.1016/j.aiopen.2024.09.001","DOIUrl":"10.1016/j.aiopen.2024.09.001","url":null,"abstract":"<div><p>One of the important yet labor intensive tasks in neuroanatomy is the identification of select populations of cells. Current high-throughput techniques enable marking cells with histochemical fluorescent molecules as well as through the genetic expression of fluorescent proteins. Modern scanning microscopes allow high resolution multi-channel imaging of the mechanically or optically sectioned brain with thousands of marked cells per square millimeter. Manual identification of all marked cells is prohibitively time consuming. At the same time, simple segmentation algorithms to identify marked cells suffer from high error rates and sensitivity to variation in fluorescent intensity and spatial distribution.</p><p>We present a methodology that combines human judgement and machine learning that serves to significantly reduce the labor of the anatomist while improving the consistency of the annotation.</p><p>As a demonstration, we analyzed murine brains with marked premotor neurons in the brainstem. We compared the error rate of our method to the disagreement rate among human anatomists. This comparison shows that our method can reduce the time to annotate by as much as ten-fold without significantly increasing the rate of errors. 
We show that our method achieves significant reduction in labor while achieving an accuracy that is similar to the level of agreement between different anatomists.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"5 ","pages":"Pages 142-154"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651024000160/pdfft?md5=d645a8a10e8ed886c8fad283100f34b8&pid=1-s2.0-S2666651024000160-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142238093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}