{"title":"Taking count: A computational analysis of data resources on academic LibGuides","authors":"C. Hennesy, Alicia Kubas, J. McBurney","doi":"10.29173/iq1040","DOIUrl":"https://doi.org/10.29173/iq1040","url":null,"abstract":"The LibGuides platform is a ubiquitous tool in academic libraries and is commonly used by librarians to compile and share lists of recommended social science numerical data resources with users. This study leverages the machine-accessible nature of the LibGuides platform to collect links to data and statistical resources from over 10,000 LibGuide pages at 123 R1 research institutions. After substantial data cleaning and normalization, an analysis of the most common resources on those guides provides a unique window into the data repositories, libraries, archives, statistical data platforms, and other machine-readable data sources that are most popular on academic library guides. Results show that freely available resources from U.S. government agencies are among the most common to be included on data and statistical resources guides across institutions. Resources requiring paid licenses or memberships for full access, such as Statistical Insight (ProQuest), Social Explorer, and ICPSR are linked to most frequently overall, regardless of the percentage of institutions that include them. 
Findings also suggest that libraries are more likely to share traditional licensed statistical resources (e.g., Cambridge’s Historical Statistics of the United States) and collections of simple charts and graphs (e.g., Statista) than more robust and complex microdata resources (e.g., IPUMS).","PeriodicalId":84870,"journal":{"name":"IASSIST quarterly","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42346586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data management instruments to protect the personal information of children and adolescents in sub-Saharan Africa","authors":"Lucas Hertzog, Jenny Chen-Charles, Camille Wittesaele, K. de Graaf, Ray Titus, Jane-Frances Kelly, N. Langwenya, L. Baerecke, B. Banougnin, W. Saal, John Southall, L. Cluver, E. Toska","doi":"10.29173/iq1044","DOIUrl":"https://doi.org/10.29173/iq1044","url":null,"abstract":"Recent data protection regulatory frameworks, such as the Protection of Personal Information Act (POPI Act) in South Africa and the General Data Protection Regulation (GDPR) in the European Union, impose governance requirements for research involving high-risk and vulnerable groups such as children and adolescents. Our paper's objective is to unpack what constitutes adequate safeguards to protect the personal information of vulnerable populations such as children and adolescents. We suggest strategies to adhere meaningfully to the principal aims of data protection regulations. Navigating this within established research projects raises questions about how to interpret regulatory frameworks to build on existing mechanisms already used by researchers. Therefore, we will explore a series of best practices in safeguarding the personal information of children, adolescents and young people (0-24 years old), who represent more than half of sub-Saharan Africa's population. We discuss the actions taken by the research group to ensure regulations such as GDPR and POPIA effectively build on existing data protection mechanisms for research projects at all stages, focusing on promoting regulatory alignment throughout the data lifecycle. Our goal is to stimulate a broader conversation on improving the protection of sensitive personal information of children, adolescents and young people in sub-Saharan Africa. We join this discussion as a research group generating evidence influencing social and health policy and programming for young people in sub-Saharan Africa. 
Our contribution draws on our work adhering to multiple transnational governance frameworks imposed by national legislation, such as data protection regulations, funders, and academic institutions.","PeriodicalId":84870,"journal":{"name":"IASSIST quarterly","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44274109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Getting in touch with metadata: a DDI subset for FAIR metadata production in clinical psychology","authors":"João Aguiar Castro, Joana Rodrigues, Paula Mena Matos, Célia M D Sales, Cristina Ribeiro","doi":"10.29173/iq1008","DOIUrl":"https://doi.org/10.29173/iq1008","url":null,"abstract":"To address metadata with researchers it is important to use models that include familiar domain concepts. In the Social Sciences, the DDI is a well-accepted source of such domain concepts. To create FAIR data and metadata, we need to establish a compact set of DDI elements that fit the requirements in projects and are likely to be adopted by researchers inexperienced with metadata creation. Over time, we have engaged in interviews and data description sessions with research groups in the Social Sciences, identifying a manageable DDI subset. A recent Clinical Psychology project, TOGETHER, dealing with risk assessment for hereditary cancer, considered the inclusion of a DDI subset for the production of metadata that are timely and interoperable with data publication initiatives in the same domain. Taking a DDI subset identified by the data curators, we make a preliminary assessment of its use as a realistic effort on the part of researchers, taking into consideration the metadata created in two data description sessions, the effort involved, and overall metadata quality. 
A follow-up questionnaire was used to assess the perspectives of researchers regarding data description.","PeriodicalId":84870,"journal":{"name":"IASSIST quarterly","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44750458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editor's notes: FAIR BOT. As metadata is data is metadata is data ...","authors":"K. Rasmussen","doi":"10.29173/iq1086","DOIUrl":"https://doi.org/10.29173/iq1086","url":null,"abstract":"Welcome to the first issue of IASSIST Quarterly for the year 2023 - IQ vol. 47(1). The last article in this issue has in the title the FAIR acronym that stands for Findable, Accessible, Interoperable, and Reusable. These are the concepts most often focused on by our articles in the IQ and FAIR has an extra emphasis in this issue. The first article introduces and demonstrates a shared vocabulary for data points where the need arose after confusions about data and metadata. Basically, I find that the most valuable virtue of well-structured data – I deliberately use a fuzzy term to save you from long excursions here in the editor's notes – is that other well-structured data can benefit from use of the same software. Similarly, well-structured metadata can benefit from the same software. I also see this as the driver for the second article, on time series data and description. Sometimes, the software mentioned is the same software in both instances as metadata is treated as data or vice versa. This allows for new levels of data-driven machine actions. These days universities are busy investigating and discussing the latest chatbots. I find many of the approaches restrictive and prefer to support the inclusive ones. Likewise, I also expect and look forward to bots having great relevance for the future implementation of FAIR principles. The first article is on data and metadata by George Alter, Flavio Rizzolo, and Kathi Schleidt and has the title ‘View points on data points: A shared vocabulary for cross-domain conversations on data and metadata’. The authors have observed that sharing data across scientific domains is often impeded by differences in the language used to describe data and metadata. To avoid confusion, the authors develop a terminology. 
Part of the confusion concerns disagreement about the boundaries between data and metadata; and that what is metadata in one domain can be data in another. The shift between data and metadata is what they name as ‘semantic transposition’. I find that such shifts are a virtue and a strength and as the authors say, there is no fixed boundary between data and metadata, and both can be acted upon by people and machines. The article draws on and refers to many other standards and developments, most cited are the data model of Observations and Measurements (ISO 19156) and tools of the Data Documentation Initiative’s Cross Domain Integration (DDI-CDI). The article is thorough and explanatory with many examples and diagrams for learning, including examples of transformations between the formats: wide, long, and multidimensional. The long format of entity-attribute-value has the value domain restricted by the attribute, and in examples time and source are added, which demonstrates how further metadata enter the format. When transposing to the wide format, this is a more familiar data matrix where the same value domain applies to the complete column. The multidimensional format with facets is for most reade","PeriodicalId":84870,"journal":{"name":"IASSIST quarterly","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"69787768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modernizing data management at the US Bureau of Labor Statistics","authors":"Daniel W. Gillman, Clayton Waring","doi":"10.29173/iq1038","DOIUrl":"https://doi.org/10.29173/iq1038","url":null,"abstract":"The US Bureau of Labor Statistics (BLS) is undertaking initiatives to improve its data and metadata systems. Planning for the replacement of the public facing LABSTAT data query system and efforts within the Office of Productivity and Technology to combine multiple production systems within a single cross-divisional database platform are examples. BLS views time-series data as a combination of three elemental components found in every time-series. A measure element; a person, places, and things element; and a time element are the components. The authors turned this basic approach into a formal conceptual model represented in UML (Unified Modeling Language). The UML model describes a flexible multi-dimensional data structure, of which time-series are a kind, and supports any kind of query into the data. The Office of Productivity and Technology has adopted the model, and it is guiding their approach moving forward. The model was also adopted by the Financial Industry Business Ontology project under the Object Management Group and by the Data Documentation Initiative Cross-Domain Integration (DDI-CDI) development project. There are other similarities between the OPT effort and DDI-CDI as well. In this way, the OPT project demonstrates the feasibility and usefulness of many of the ideas in DDI-CDI. In this paper we describe the time-series formulation and the UML conceptual model. Then, the design of the OPT system and some of its features are described, relating those that are like DDI-CDI where appropriate. 
In doing so, we provide a thorough understanding of the structure of time-series.","PeriodicalId":84870,"journal":{"name":"IASSIST quarterly","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46238134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"View points on data points: A shared vocabulary for cross-domain conversations on data and metadata","authors":"George Alter, Flavio Rizzolo, K. Schleidt","doi":"10.29173/iq1051","DOIUrl":"https://doi.org/10.29173/iq1051","url":null,"abstract":"Sharing data across scientific domains is often impeded by differences in the language used to describe data and metadata. We argue that disagreements over the boundary between data and metadata are a common source of confusion. Information appearing as data in one domain may be considered metadata in another domain, a process that we call “semantic transposition.” To promote greater understanding, we develop new terminology for describing how data and metadata are structured, and we show how it can be applied to a variety of widely used data formats. Our approach builds upon previous work, such as the Observations and Measurements (ISO 19156) data model. We rely on tools from the Data Documentation Initiative’s Cross Domain Integration (DDI-CDI) to illustrate how the same data can be represented in different ways, and how information considered data in one format can become metadata in another format.","PeriodicalId":84870,"journal":{"name":"IASSIST quarterly","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48708553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A model for data ethics instruction for non-experts","authors":"L. Phan, Ibraheem Ali, S. Labou, E. Foster","doi":"10.29173/iq1028","DOIUrl":"https://doi.org/10.29173/iq1028","url":null,"abstract":"The dramatic increase in use of technological and algorithmic-based solutions for research, economic, and policy decisions has led to a number of high-profile ethical and privacy violations in the last decade. Current disparities in academic curriculum for data and computational science result in significant gaps regarding ethics training in the next generation of data-intensive researchers. Libraries are often called to fill the curricular gaps in data science training for non-data science disciplines, including within the University of California (UC) system. We found that in addition to incomplete computational training, ethics training is almost completely absent in the standard course curricula. In this report, we highlight the experiences of library data services providers in attempting to meet the need for additional training, by designing and running two workshops: Ethical Considerations in Data (2021) and its sequel Data Ethics & Justice (2022). We discuss our interdisciplinary workshop approach and our efforts to highlight resources that can be used by non-experts to engage productively with these topics. 
Finally, we report a set of recommendations for librarians and data science instructors to more easily incorporate data ethics concepts into curricular instruction.","PeriodicalId":84870,"journal":{"name":"IASSIST quarterly","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46414958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emancipating data science for Black and Indigenous students via liberatory datasets and curricula","authors":"T. Monroe-White","doi":"10.29173/iq1007","DOIUrl":"https://doi.org/10.29173/iq1007","url":null,"abstract":"Despite findings highlighting the severe underrepresentation of women and minoritized groups in data science, most scholarly research has focused on new methodologies, tools, and algorithms as opposed to who data scientists are or how they learn their craft. This paper proposes that increased representation in data science can be achieved via advancing the curation of datasets and pedagogies that empower Black, Indigenous, and other minoritized people of color to enter the field. This work contributes to our understanding of the obstacles facing minoritized students in the classroom and solutions to mitigate their marginalization.","PeriodicalId":84870,"journal":{"name":"IASSIST quarterly","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43434409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The work continues","authors":"Michele Hayslett","doi":"10.29173/iq1076","DOIUrl":"https://doi.org/10.29173/iq1076","url":null,"abstract":"Welcome to the final issue of the IASSIST Quarterly for the year 2022 – IQ volume 46(4), our eagerly-awaited special issue on Systemic Racism in Data Practices. This issue represents more than you might think: the culmination of more than two years of the intellectual hard work of writing, of course, but that in itself is not unusual for any journal issue. However. The global pandemic exploded just after the conception of this special issue and hit all of us hard, wreaking not only physical destruction of lives but also unleashing social upheaval, job insecurity, housing insecurity, and major mental health challenges. Social injustice erupted during the pandemic, shocking and enraging many of us with its violence and disregard for human dignity. I was privileged to witness the genesis of this issue, and I helped recruit our guest editors, Trevor Watkins and Jonathan Cain. I salute their perseverance, patience and courage, and that of the article authors, in bringing this content to fruition. Many involved in this issue faced multiple personal challenges, from the loss of family members to repeated moves, job changes, and more in the process of trying to get this work done. Some were unable to surmount the many obstacles and were forced to withdraw their proposals. So I do not think it is hyperbole to say this is the hardest issue we have ever produced. Trevor and Jonathan, thank you again for spearheading this important work. Some good things have come from the societal call for racial justice for IASSIST, including this issue of the IQ. IASSIST has initiated several new ventures to advocate for diversity and equity, both within our organization and among researchers generally: We restructured our membership fees to allow half price for people joining from lower income countries. 
IASSIST also sponsored diversity scholarships for members to attend the American Library Association conference and the ICPSR Summer Program in Quantitative Methods in 2022. A new Anti-racism Resources Interest Group which focuses on compiling anti-racism resources has been working for more than two years and recently collaborated with the Professional Development Committee to present a webinar on varying national approaches to collecting (or not collecting) data about race and ethnicity (see this page for the webinar recording as well as the essays members have written). The group welcomes contributions of essays for additional countries and suggestions of other webinar topics. Looking ahead, the 2023 conference theme is Diversity in Research: Social Justice from Data, sure to result in some fascinating presentations (and future IQ papers!). And here at the IQ, we’re already contemplating a second special issue in this area around the role of social justice in data services. We invite volunteers who would like to serve as guest editors to contact us. And so the work continues. The IQ editorial team is happy to welcome a new volunteer, Phillip Ndhlovu, ","PeriodicalId":84870,"journal":{"name":"IASSIST quarterly","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47768769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deficit, asset, or whole person? Institutional data practices that impact belongingness","authors":"Nastasha E. Johnson, M. Nelson, Katherine N. Yngve","doi":"10.29173/iq1031","DOIUrl":"https://doi.org/10.29173/iq1031","url":null,"abstract":"Given the capitalist model of higher education that has developed since the 1980s, the data collected by institutions of higher education on students is based on micro-targeting to understand and retain students as consumers, and to retain that customer base (i.e. to prevent attrition/dropouts). Institutional data has long been collected but the authors will question how, why, and for whom the data is collected in the current higher education model. The authors will then turn to the current higher education focus on equity, diversity, inclusion, and particularly on the concept of belongingness in higher education. The authors question the collective and local purposes of institutional data collection and the fallout of the current practices and will argue that using existing institutional data to facilitate student belongingness is impossible with current practices. We will propose a new framework of asset-minded institutional data practices that centers the student as a whole person and recenters data collection away from the concept of students as commodities. 
We propose a new framework based on data feminism that intends to elevate qualitative data and all persons/experiences along the bell-shaped curve, not just the middle two quadrants.","PeriodicalId":84870,"journal":{"name":"IASSIST quarterly","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43188195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}