Liz Sheffield

Solution Architect

Philadelphia, PA

elizabeth.a.sheffield@gmail.com

es3279@drexel.edu

git repo


Research Focus

Natural Language Processing

Sentiment Analysis

Language Generation

Data Propagation


Languages

English

Perl

Python



Education

Ph.D. Information Science / Drexel University
2020 - Current

Current PhD student advised by Dr. Jake Ryland Williams


M.S. Information Systems Engineering / Johns Hopkins Whiting School of Engineering
2015 - 2017

Graduated with Honors 


B.A. Computer Science with a second major in Linguistics / University of North Carolina - Chapel Hill
2009 - 2013

Non-Matriculating Student / Guilford College
2007 - 2009

About

I am a solution architect pursuing a part-time PhD in Information Science at Drexel University. I first became interested in the combination of computer science and linguistics as an undergraduate, and I am now particularly interested in how NLP can simplify data analysis and comprehension. I have a secondary interest in how machines can better comprehend the idiosyncrasies of human language and in how we can leverage data to address social issues.

On a personal note, I enjoy hiking, running, and playing open-world video games.


Work Experience

Solution Architect / Comcast
September 2021 - Current

Entertainment Metadata Architect supporting apps on Sky, NBC, and Comcast platforms, with a focus on Sports, Music, and VOD metadata. My architecture work supports inbound and outbound metadata flows and revolves around a centralized master data management (MDM) platform for entertainment metadata. This metadata powers complex content discovery and personalization use cases.


Solution Architect / Cigna
July 2013 - September 2021

Career progression: Java Developer, Technical Lead, Delivery Lead, Senior Systems Analyst, and then Solution Architect in the Provider Data Domain. Designed large-scale solutions involving multiple custom and vendor applications that give users the tools to analyze and manage provider data. Working in a quasi-researcher function, I evaluated vendors, produced options, and worked with the scrum teams to deliver the final solutions.


604.744 Information Retrieval Grader / Johns Hopkins University
August 2018 - December 2019

Supported a course focused on efficient storage, organization, and retrieval of information


Projects / Research Questions

Episodes of Social Media Use and Evidence of Regime Changes

Using a repository of Twitter accounts that are 8+ years old, this research develops a taxonomy for episodes of social media use, algorithmically segments timelines into episodes of use, and evaluates that segmentation. Finally, features will be identified to train a model that predicts whether the author of the posts within one episode of social media use is the same as the author of a previous episode, which makes this a low-context author attribution task.


Automated Sarcasm Detection within Chat Bot Conversations

Customer service interactions are increasingly being steered towards chat bots, but chat bots are not skilled at detecting sarcasm or irony in responses. This project looks at current methods of sarcasm detection on stand-alone tweets and customer reviews, then applies those methodologies to conversations (expanding the datasets to include reply-tos and customer service logs).


Bad Data Propagation through Data Networks

Theory: Valid data updates within an integrated domain should behave differently from bad data updates, possibly entering the network through multiple nodes. For example, fake events or facts should propagate through news sites in a different manner than a real event.


Poet Author Attribution

When identifying the poet for a given stanza of text, are morphological/phonological/style statistics more relevant than word choice?

  • Q1: Can binary or tf-idf classifiers be effective on a corpus of poetry stanzas?

  • Q2: Is stanza length or line length more useful?

  • Q3: Is rhyme scheme useful in classification?

Hiking Ontology

Hiking trail selection can be a convoluted process for hikers unfamiliar with the available trails. Individuals may look for trails of a specific length, location, or other feature, but while this information is available in the trail descriptions on local websites, it is neither linked nor searchable. The Hiking Trail Ontology seeks to address this gap.


NFLTwitterBack

A database of NFL running back game performance metrics and social media activity metrics, built using the SportsDataIO API and Twitter API. It provides a preliminary structure to support questions about the impact of social media use and trends on player performance.