Mining EZProxy Data: User Demographics and Electronic Resources
Ellie Kohler, Connie Stovall
Proceedings of the 2018 Library Assessment Conference: Building Effective, Sustainable, Practical Assessment: December 5–7, 2018, Houston, TX
Published: 2019-10-01
DOI: https://doi.org/10.29242/lac.2018.68
Abstract
After a mandate to use data to demonstrate impact on student success, Virginia Tech University Libraries began exploring previously untapped data sources. Because the collections budget makes up 48% of the total library budget, roughly 90% of which goes to electronic resources, it was deemed necessary to draw more direct connections between electronic resource usage and student success. Before the charge, usual practice involved analyzing usage from COUNTER reports and cost data, such as frequency of use and cost per use, primarily for serials budgeting and negotiations. With these data collection and analysis practices, the university libraries could draw only basic inferences about their electronic resource users. To support more robust inferences, the university libraries turned to EZproxy logs and university-collected student data and began a multiphase research project based on connecting the two data streams. The long-range purpose of the project is to better understand student user demographics by connecting electronic resource usage information with university-held student demographic information. Ultimately, plans include measuring the university libraries’ impact on Virginia Tech’s overall success; the project constitutes the start of a broader systematic study of the impact of the university libraries’ spending on electronic resources. Development of this study includes research into encryption and anonymization techniques, as well as current best practices for securing personal information. The discussion will cover challenges, including on- and off-campus access and resistance to using personally identifiable data, as well as the tools used in the study: EZproxy, Graylog, Python, and Tableau.
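The connection between the two data streams described above can be sketched in Python, one of the tools the abstract names. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes log lines in the NCSA common log format with the username in the third field, it swaps in a keyed hash (HMAC-SHA256) as one possible pseudonymization technique, and every name in it (`SECRET_KEY`, `pseudonymize`, `parse_log_line`, `usage_by_attribute`, the sample records) is hypothetical. Keyed hashing is pseudonymization rather than full anonymization, since anyone holding the key can recompute the tokens.

```python
import hashlib
import hmac
import re
from collections import Counter

# Hypothetical secret; in practice it would be kept outside the analysis dataset.
SECRET_KEY = b"replace-with-a-real-secret"

# Assumption: lines look like `host ident user [time] "request" status bytes`.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)$'
)


def pseudonymize(user_id, key=SECRET_KEY):
    """Return a keyed SHA-256 digest so the raw ID never enters the analysis set."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def parse_log_line(line):
    """Parse one log line; return None for malformed lines or anonymous ('-') users."""
    match = LOG_PATTERN.match(line.strip())
    if match is None or match.group("user") == "-":
        return None
    return {
        "user_token": pseudonymize(match.group("user")),
        "time": match.group("time"),
        "request": match.group("request"),
    }


def usage_by_attribute(log_lines, demographics, attribute):
    """Count requests per demographic attribute, joining on the pseudonymous token."""
    # Re-key the demographic records by token so the join never uses raw IDs.
    by_token = {pseudonymize(uid): record for uid, record in demographics.items()}
    counts = Counter()
    for line in log_lines:
        event = parse_log_line(line)
        if event is None:
            continue
        record = by_token.get(event["user_token"])
        counts[record[attribute] if record else "unmatched"] += 1
    return counts


# Invented sample data, for illustration only.
sample_log = [
    '10.0.0.1 - student1 [05/Dec/2018:10:00:00 -0600] "GET https://example.org/a/1 HTTP/1.1" 200 1234',
    '10.0.0.2 - student2 [05/Dec/2018:10:05:00 -0600] "GET https://example.org/a/2 HTTP/1.1" 200 2345',
    '10.0.0.1 - student1 [05/Dec/2018:10:07:00 -0600] "GET https://example.org/a/3 HTTP/1.1" 200 3456',
    '10.0.0.3 - - [05/Dec/2018:10:09:00 -0600] "GET https://example.org/ HTTP/1.1" 200 100',
    '10.0.0.4 - visitor9 [05/Dec/2018:10:11:00 -0600] "GET https://example.org/a/4 HTTP/1.1" 200 50',
]
sample_demographics = {
    "student1": {"academic_level": "undergraduate"},
    "student2": {"academic_level": "graduate"},
}

counts = usage_by_attribute(sample_log, sample_demographics, "academic_level")
for label, total in sorted(counts.items()):
    print(label, total)
```

Joining on the token instead of the raw username means the merged table can be passed to a visualization tool without carrying the identifier itself. Users who appear in the logs but not in the demographic file are counted as "unmatched" rather than dropped, which keeps the totals auditable.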
Background and Purpose
Total collections spending typically makes up 37% of an academic library’s total expenditures.[1] Virginia Tech’s collections spending exceeds that average at 48%, and electronic resources consume 90% of that collections budget, or about 43% of the total library budget (0.48 × 0.90 ≈ 0.43). Given the sheer proportion of funding devoted to electronic resources, it is not surprising that administrators need more data to demonstrate the effectiveness of these investments. Libraries, like all other university units, need to map their outcomes to the university’s and demonstrate value and impact, and doing so with data is imperative. “More than 2,500 institutions worldwide are currently using Ezproxy,” and, for many universities, EZproxy usage data creates opportunities to demonstrate value and impact.[2]
Literature Review
Libraries use a variety of methods to demonstrate the impact of their products and services. One university library analyzed the following service points, collecting user identification at each: all types of reference questions, circulation transactions, instruction sessions, delivery requests, interlibrary loan requests, and EZproxy logins for off-campus users. While that university had already performed a cost-benefit analysis for most service points, it had not used EZproxy data; it decided to do so after mounting pressure to demonstrate impact on student success. After collecting user data, the researchers obtained demographic data for each user from campus institutional research. Like many who investigate impact, they obtained the following: academic standing, academic level, academic program, age, sex, ethnicity, enrollment status, and GPA. Notably, the service showing the most use was EZproxy login.[3]
McCarthy studied data on 4,803 distance learners enrolled in at least one online class, drawing on off-campus EZproxy logins and on college registrar records from the Banner system. The researcher utilized