Data and text mining: facilitating our researchers’ needs


















Data and text mining: facilitating our researchers’ needs in the 21st century
A case study
Hui Hua Chua
ALCTS CRS College & Research Libraries IG
ALA Annual Conference · Chicago, IL · June 25, 2017
• What is Text Assembler?
• Project context
• Implementation process
• Lessons learned
WHAT IS TEXT ASSEMBLER?
Diagram components: TA user interface, Text Assembler (TA), WSK API, LexisNexis news corpus
• User interface: authenticated users query and view sample search results, queue queries for processing, and download results
• TA: passes the query to the API; retrieves and stores results for delivery; processes results to create a plain text file; manages multiple queries and query processing based on WSK parameters
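The division of labor described on this slide can be sketched roughly as follows. This is a minimal illustration, not the Text Assembler code: the slides do not describe the WSK API calls, so `fake_fetch` is a hypothetical stand-in for the API, and `TextAssemblerSketch`, `QueuedQuery` and `to_plain_text` are names invented for the example.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Optional, Tuple

Document = Dict[str, str]


@dataclass
class QueuedQuery:
    """One search queued by an authenticated user (hypothetical structure)."""
    user: str
    terms: str
    results: List[Document] = field(default_factory=list)


def to_plain_text(results: List[Document]) -> str:
    """Flatten retrieved documents into a single plain text string."""
    return "\n\n".join(f"{doc['headline']}\n{doc['body']}" for doc in results)


class TextAssemblerSketch:
    """Queues queries, fetches results, and stores plain text for download."""

    def __init__(self, fetch: Callable[[str], List[Document]]) -> None:
        self.fetch = fetch  # stand-in for the call to the news API
        self.queue: Deque[QueuedQuery] = deque()
        self.completed: Dict[Tuple[str, str], str] = {}

    def enqueue(self, user: str, terms: str) -> None:
        self.queue.append(QueuedQuery(user, terms))

    def process_next(self) -> Optional[str]:
        """Process the oldest queued query; return its plain text, or None."""
        if not self.queue:
            return None
        query = self.queue.popleft()
        query.results = self.fetch(query.terms)
        text = to_plain_text(query.results)
        self.completed[(query.user, query.terms)] = text
        return text


def fake_fetch(terms: str) -> List[Document]:
    """Assumption: returns canned documents instead of calling a real API."""
    return [
        {"headline": f"Story about {terms}", "body": "First body."},
        {"headline": f"More on {terms}", "body": "Second body."},
    ]


ta = TextAssemblerSketch(fake_fetch)
ta.enqueue("alice", "drought")
print(ta.process_next())
```

Passing the fetch function in as a parameter keeps the queueing and text-processing logic separate from the API, so a real client could replace `fake_fetch` without changes elsewhere.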
MICHIGAN STATE UNIVERSITY
• Fall 2016: 50,344 students enrolled
• Undergraduate (39,090 or 77.6%) vs graduate (11,254 or 22.4%)
• Top five colleges by enrollment: Business (7,686), Social Science (6,562), Natural Science (6,192), Engineering (6,075), Agriculture & Natural Resources (4,588)
TDM LANDSCAPE IN 2014
• In flux and developing
• Libraries and publishers receiving researcher requests, but often no established business, technical or access models
• Ad-hoc access to data requested on case-by-case basis from publisher
• License or purchase corpus and host in-house
MSU LIBRARIES’ ROLE IN TDM
• Facilitate data access (license negotiation, purchase, data hosting, work with APIs)
• Consult or provide assistance with tools, methods or data
Almost always driven by specific researcher request
LEXISNEXIS WEB SERVICES KIT (WSK)
• Product: API plus access to content; enables larger downloads of data than LexisNexis Academic
• Content: current news with updates
• Business model: subscription
• No specific user request or demand
“Let a hundred flowers bloom”
IMPLEMENTATION TEAM
• Digital Scholarship Librarian: Thomas Padilla
• Digital Library Programmer: Devin Higgins
• Programmer: Megan Schanz
• Liaison to School of Journalism: Hui Hua Chua
IMPLEMENTATION TIMELINE & PROCESS
• 7/2014: WSK acquired
• 8/2014 to 12/2014: Fact-finding & testing
  • In-house LN sales and technical staff
  • Other universities that had licensed WSK
  • Potential MSU users and use cases
• Decision made to develop unmediated web-based user interface. Why? Technical considerations and large potential user base
IMPLEMENTATION TIMELINE & PROCESS
• 11/2014 to 1/2015: Direct API use to answer 2 specific research questions
• 12/2014: Project request submitted to MSUL Systems department
• 2/2015: Programmer assigned. System developed (1.5 months)
• 8/2015: Usability testing of UI
• 9/2015: Public launch of Text Assembler
• 12/2016: Permission received from MSU Technologies to share code with acknowledgement. Source code: https://gitlab.msu.edu/msu-libraries/text-assembler
LESSONS LEARNT: BEFORE IMPLEMENTATION
• Better coordination of acquisition and implementation processes. Fact-finding could have been completed before purchase, shortening the time to public launch.
LESSONS LEARNT: POST IMPLEMENTATION
• How to balance system constraints with user behavior or preferences
  • System update (1 week development time)
  • Work with users to formulate more targeted queries
  • Hourly query rotation
• More guidance needed with query formulation
  • More explicit search instructions, specifically for selecting appropriate data sources and field searching
  • Work with users individually
• Takes a while for new system to be adopted and used
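The slides do not say how the hourly query rotation works. One plausible reading, offered only as an assumption, is a round-robin schedule: each queued query gets one turn (for example, one hour of API capacity) and then moves to the back of the queue if it is unfinished, so a very large search cannot block smaller ones. The function name `rotate_queries` and the per-turn budget are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple


def rotate_queries(pending: Dict[str, int], budget_per_turn: int) -> List[Tuple[str, int]]:
    """Return the order of turns as (query name, results downloaded that turn).

    pending maps a query name to the number of results it still needs.
    budget_per_turn is the assumed cap on results fetched in one turn.
    """
    queue = deque(pending.items())
    schedule: List[Tuple[str, int]] = []
    while queue:
        name, remaining = queue.popleft()
        taken = min(remaining, budget_per_turn)
        schedule.append((name, taken))
        if remaining > taken:
            # Unfinished query rejoins the back of the queue for a later turn.
            queue.append((name, remaining - taken))
    return schedule


for turn, (name, taken) in enumerate(rotate_queries({"big": 250, "small": 80}, 100), start=1):
    print(f"turn {turn}: {name} downloads {taken} results")
```

In this example the small query finishes on the second turn instead of waiting for all 250 results of the large one.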
USAGE
• Queued searches since system went live: 121
• Number of different users in the last 3 months: 5
• Average number of results per search: 24,855
• Largest search: 299,146
USAGE CONTINUED
Based on review of specific queries and user contacts
• Primarily used as research tool; not the UG teaching tool we had hoped for
• Heaviest use by/in the Social Sciences
• Library relationship with TA users is arm’s-length
FURTHER QUESTIONS
How is success defined? How do libraries define, assess and demonstrate value in TDM services?
Potential assessment of
• contribution to research outputs
• use in teaching
THANK YOU!
Hui Hua Chua
Collections & User Support Librarian
Michigan State University Libraries
chua@msu.edu
TA source code: https://gitlab.msu.edu/msu-libraries/text-assembler