Large language model applications for evaluation: Opportunities and ethical implications
Cari Beth Head, Paul Jasper, Matthew McConnachie, Linda Raftree, Grace Higdon
New Directions for Evaluation, June 2023. DOI: 10.1002/ev.20556

Abstract: Large language models (LLMs) are a type of generative artificial intelligence (AI) designed to produce text-based content. LLMs use deep learning techniques and massive data sets to understand, summarize, generate, and predict new text. LLMs caught the public eye when ChatGPT (the first consumer-facing LLM) was released in late 2022. LLM technologies are driven by recent advances in deep-learning AI techniques, in which language models are trained on extremely large text data from the internet and then reused for downstream tasks with limited fine-tuning required. They offer exciting opportunities for evaluators to automate and accelerate time-consuming tasks involving text analytics and text generation. We estimate that over two-thirds of evaluation tasks will be affected by LLMs in the next 5 years. Use-case examples include summarizing text data, extracting key information from text, analyzing and classifying text content, writing text, and translation. Despite these advances, the technologies pose significant challenges and risks. Because LLM technologies are generally trained on text from the internet, they tend to perpetuate biases (racism, sexism, ethnocentrism, and more) and the exclusion of non-majority languages. Current tools like ChatGPT have not been developed specifically for monitoring, evaluation, research, and learning (MERL) purposes, possibly limiting their accuracy and usefulness for evaluation. In addition, technical limitations and challenges with bias can lead to real-world harm. To overcome these technical challenges and ethical risks, the evaluation community will need to work collaboratively with the data science community to co-develop tools and processes and to ensure the application of quality and ethical standards.
Using AI to disrupt business as usual in small evaluation firms
Nina R. Sabarre, Blake Beckmann, Sahiti Bhaskara, Kathleen Doll
New Directions for Evaluation, June 2023. DOI: 10.1002/ev.20562

Abstract: While many knowledge workers may fear that the rise of artificial intelligence (AI) will threaten their jobs, this article argues that small evaluation businesses should embrace AI tools to increase their value in the marketplace and remain relevant. In this article, consultants from a research, evaluation, and strategy firm, Intention 2 Impact, Inc., make a case for using AI tools to disrupt business as usual in evaluation from theoretical and practical perspectives. Theoretically, AI may be another example of a technology that was initially feared but is now ubiquitous in society. Using concrete examples, the authors describe how businesses and evaluators have evolved to keep up with changes in supply and demand. Lastly, the authors posit that embracing AI will save time for those working in small businesses, which can ultimately increase added value and profitability.
{"title":"Finding a safe zone in the highlands: Exploring evaluator competencies in the world of AI","authors":"Sarah Mason","doi":"10.1002/ev.20561","DOIUrl":"https://doi.org/10.1002/ev.20561","url":null,"abstract":"Abstract Since the public launch of ChatGPT in November 2022, disciplines across the globe have grappled with questions about how emerging artificial intelligence will impact their fields. In this article I explore a set of foundational concepts in artificial intelligence (AI), then apply them to the field of evaluation broadly, and the American Evaluation Association's evaluator competencies more specifically. Given recent developments in narrow AI, I then explore two potential frameworks for considering which evaluation competencies are most likely to be impacted—and potentially replaced—by emerging AI tools. Building on Moravec's Landscape of Human Competencies and Lee's Risk of Replacement Matrix I create an exploratory Landscape of Evaluator Competencies and an Evaluation‐Specific Risk of Replacement Matrix to help conceptualize which evaluator competencies may be more likely to contribute to long‐term sustainability for the field. Overall, I argue that the interpersonal, and contextually‐responsive aspects of evaluation work—in contrast to the more technical, program management, or methodological aspects of the field—may be the competencies least likely to be impacted or replaced by AI. As such, these may be the competencies we continue to emphasize, both in the day‐to‐day aspects of our operations, and in the training of new and emerging evaluators. This article is intended to be a starting point for discussions that continue throughout the remainder of this issue.","PeriodicalId":35250,"journal":{"name":"New Directions for Evaluation","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135219668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation criteria for artificial intelligence","authors":"Bianca Montrosse‐Moorhead","doi":"10.1002/ev.20566","DOIUrl":"https://doi.org/10.1002/ev.20566","url":null,"abstract":"Abstract Criteria identify and define the aspects on which what we evaluate is judged and play a central role in evaluation practice. While work on the use of AI in evaluation is burgeoning, at the time of writing, a set of criteria to consider in evaluating the use of AI in evaluation has not been proposed. As a first step in this direction, Teasdale's Criteria Domains Framework was used as the lens through which to critically read articles included in this special issue. This resulted in the identification of eight criteria domains for evaluating the use of AI in evaluation. Three of these criteria domains relate to the conceptualization and implementation of AI in evaluation practice. Five criteria domains are focused on outcomes, specifically those stemming from the use of AI in evaluation. More work is needed to further identify and deliberate possible criteria domains for AI use in evaluation.","PeriodicalId":35250,"journal":{"name":"New Directions for Evaluation","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135195070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Meeting the challenges of educating internal evaluators","authors":"C. Lam, Keiko Kuji‐Shikatani, A. Love","doi":"10.1002/ev.20539","DOIUrl":"https://doi.org/10.1002/ev.20539","url":null,"abstract":"The COVID‐19 pandemic and its related health, social, economic and geopolitical shocks have greatly increased the demand for internal evaluation as a way of helping organizations, especially those in the public sector, adapt to ongoing challenges and new realities. To help meet the demand, this chapter discusses the recent trend to educate managers, front‐line supervisors and other organization professionals to be nonspecialist internal evaluators—individuals who are not evaluation specialists. Three experienced internal evaluators and educators share real‐world examples of their successful strategies for educating nonspecialist internal evaluators. They conclude with a discussion of lessons learned and suggestions for the road ahead.","PeriodicalId":35250,"journal":{"name":"New Directions for Evaluation","volume":"2023 1","pages":"103 - 95"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47767883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"¡Milwaukee Evaluation! Inc.: Design principles for recentering social justice in evaluation training and disrupting whiteness and neoliberalism","authors":"N. Robinson, Emily R. Connors, T. Cobb","doi":"10.1002/ev.20545","DOIUrl":"https://doi.org/10.1002/ev.20545","url":null,"abstract":"Across the nation, the American Evaluation Association (AEA) recognizes over thirty volunteer‐led organizations called local affiliates. These affiliates provide professional development, networking, and field building opportunities that influence the local evaluation marketplace and ecosystem in ways that have not been systematically studied or understood within the larger discourse of continuing education for evaluators. In this chapter, we present a single case study on the long‐term efforts of ¡Milwaukee Evaluation! Inc., the AEA Local Affiliate in Wisconsin, to recenter social justice in evaluator training and education. The case study is presented in the form of two design principles that have shaped the affiliate's work over the past 10 years and helped move the local evaluation marketplace and infrastructure toward deeper expressions of social justice. The affiliate uses a two‐generation approach, creates liminal spaces as the site for critical consciousness raising and emancipatory capacity building, validates (working class) evaluators of color, and covers topics such as reparations and neoliberalism in its educational offerings. The rationale for this approach is discussed.","PeriodicalId":35250,"journal":{"name":"New Directions for Evaluation","volume":"2023 1","pages":"75 - 84"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48814979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Guest Editors’ notes","authors":"J. LaVelle, L. Neubauer, A. Boyce, T. Archibald","doi":"10.1002/ev.20543","DOIUrl":"https://doi.org/10.1002/ev.20543","url":null,"abstract":"This volume has been in development since November 2019, when between AEA2019 sessions Guest Editor John LaVelle posed writing a New Directions for Evaluation on evaluator education to the NDE Editors, and received a supportive nod to move forward with developing a proposal. Following that hallway discussion, he reached out to some of the many people in evaluation that he admired; colleagues that could complement his strengths, offset his limitations, provide checks and balances to his vision, and make the project better through their contributions and critiques. Leah C. Neubauer, Ayesha S. Boyce, and Tom Archibald agreed to collaborate on the project. This group, the self-titled “Dream Team” (Leah reminded us of that name’s origins, the 1992 USA men’s Olympic basketball team) united around the ethos of creating an intentional space for contributors that had often not been included in discussions about evaluator education, and asking them to imagine a better future for evaluator education than we had experienced ourselves. Like any major project, there have been adventures and challenges along the way. Conceptually, we were critiqued for being a volume led solely by faculty in university settings, not being radical enough in our vision, and the accelerated timelines for submitting proposals to us for review (see Neubauer et al., 2023 ). Procedurally, when we received over forty proposals for consideration, we proposed editing two NDE volumes to incorporate as many voices as possible; when told that was not an option, we had to find ways to pare down the number of chapters while still maximizing voices in the conversation. Individually and collectively, this meant us giving up including individual chapters that addressed topics we are passionate about and find value in discussing, including chapters that we wanted to write ourselves. Humanistically, we had to find ways to recognize and support the individual and group needs of over 50 individual contributors from across the world, inclusive of scheduling and timeline challenges, writing styles, individual visions for chapters, paradigmatic and disciplinary idiosyncrasies, cultural and pan-national positionalities, and a range of experiences publishing in peer-reviewed and peer-edited outlets. All of this occurred during the COVID-19 pandemic. Upon reflection, we think these sorts of challenges are probably the rule rather than the exception. As we look at our processes, products, and connections our contributors have created, we have no regrets about the decisions we made.","PeriodicalId":35250,"journal":{"name":"New Directions for Evaluation","volume":"2023 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43008732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Teaching specification of evaluative criteria: A guide for evaluation education
Rebecca M. Teasdale, R. Pitts, Emily F. Gates, Clara Shim
New Directions for Evaluation, March 2023, pp. 31-37. DOI: 10.1002/ev.20546

Abstract: Evaluative criteria describe the attributes that define a high-quality intervention and represent values about which intervention characteristics or results are desirable. There are many types of criteria, including those focused on intervention outcomes or impact, design and implementation, and relevance. Criteria may remain implicit and assumed in evaluation practice, yet they nonetheless direct evaluative inquiry and provide the value basis for evaluative conclusions. It is therefore important for novice evaluators to understand why criteria matter and how to thoughtfully consider which and whose criteria to use. We introduce an approach for teaching criteria specification to guide novice evaluators in considering, selecting, and articulating criteria. We introduce key concepts, outline an instructional approach, and provide a downloadable teaching activity for graduate students, undergraduates, and working professionals.
Explicitly integrating interpersonal skills in the evaluation curriculum
Tiffany L. S. Tovey, L. Smith, Dana Jayne Linnell, D. Wisner, Cade Coles
New Directions for Evaluation, March 2023, pp. 23-29. DOI: 10.1002/ev.20541

Abstract: Interpersonal skills are an essential element of evaluators' work, particularly when evaluators aim to affect the understanding and use of evaluation to make changes in the programs and policies we are evaluating. In this chapter, we provide a brief background on interpersonal skills in the evaluation literature, situating these competencies as essential in making evaluators facilitators of change, and highlight evaluation education's lack of explicit training in interpersonal skills. We then share our experiences with a standalone course for evaluators on interpersonal skill development. We conclude with insights into how evaluation educators can integrate activities focused on developing interpersonal skills into their own evaluation courses.
Learning by linking the Canadian Evaluation Society's student case competition within a graduate evaluation course
Paisley Worthington, Rebecca Stroud Stasel, Katrina Carbone, Jennifer H. Hughes, Michelle Searle
New Directions for Evaluation, March 2023, pp. 105-113. DOI: 10.1002/ev.20536

Abstract: There are many ways to intertwine theoretical and applied learning to nurture the competencies required to conduct evaluation. Experiential learning opportunities remain a priority for many evaluation educators who are helping learners apply foundational skills and knowledge to practice. Evaluators develop their professional expertise in diverse venues, including through experience, through professional learning, or, as we highlight in this chapter, in graduate school. Incorporating experiential learning from a professional association into a formal graduate course requires a willingness to blend university course expectations and activities with collaborative learning experiences. Using reflective dialogue and poetry enacted through dialogic analysis and reflection, we examine enduring perceptions and learning activated from student participation in the Canadian Evaluation Society's national evaluation case competition as part of evaluation education situated within a formal university graduate course. Weaving five voices representing learners, case study coach, and course instructor, we discuss how the evaluation competition was used to deepen understanding and develop evaluator competencies.