February 1, 2021

Tweet Sentiment Analysis of English Premier League Clubs

Domains

Data

Analytics

Research

Academic

February 1, 2021 - February 1, 2021

Vimal Rajesh

@vimrajesh

Tech Stack

Python

Pandas

Matplotlib

Deep Learning

Git

Project Summary

Abstract

Football Twitter is noisy, emotional, and context-heavy, which makes it a useful benchmark for comparing sentiment-analysis approaches beyond clean textbook datasets.

This project analyzed a two-month Kaggle dataset covering 14 English Premier League clubs and compared a fast lexicon-based baseline with a pre-trained transformer model.

VADER captured short-form polarity efficiently, while BERT offered stronger handling of semantics, ambiguity, and sarcasm in conversational sports data.

What I Built

Compared lexicon-based and transformer-based sentiment analysis, using the 14-club EPL dataset as a practical benchmark.
Used VADER as a fast, strong baseline and a pre-trained BERT model to capture more semantic nuance and sarcasm.

Impact

Framed the work as a practical NLP comparison relevant to social listening and fan-reaction analysis.
Added an AI project grounded in real conversational data rather than synthetic examples.

Page Info

EPL Tweet Dataset

Analyzed a Kaggle dataset covering Twitter discussions around 14 English Premier League clubs across a continuous two-month window.

VADER Baseline

Used VADER to score polarity in short-form social-media text. Its lexicon and rules account for intensity, slang, and punctuation, which suits fast-moving fan reactions.
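
A minimal sketch of how a VADER baseline like this could be set up, assuming the `vaderSentiment` package is installed. The function names `label_from_compound` and `score_tweets` are hypothetical and not taken from the project's code. The 0.05 cutoffs are the thresholds suggested in the VADER documentation; the project may have used different ones.

```python
def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score in [-1, 1] to a coarse sentiment label.

    The default threshold of 0.05 follows the VADER documentation;
    it is an assumption here, not a value confirmed by the project.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_tweets(tweets):
    """Return (text, compound, label) for each tweet using VADER.

    Requires `pip install vaderSentiment`. The import is done lazily so
    that `label_from_compound` stays usable without the package.
    """
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    results = []
    for text in tweets:
        # polarity_scores returns a dict with 'neg', 'neu', 'pos', 'compound'.
        compound = analyzer.polarity_scores(text)["compound"]
        results.append((text, compound, label_from_compound(compound)))
    return results
```

Calling `score_tweets(["What a goal!!!", "Worst defending I have seen all season"])` would return one tuple per tweet. Because VADER is rule-based, this runs quickly on large tweet volumes without a GPU.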

BERT Comparison

Compared the lexicon-based baseline with a pre-trained BERT model to better capture context, semantic nuance, and sarcasm in football conversations.
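
One possible way to run such a comparison is sketched below, assuming the Hugging Face `transformers` package. The checkpoint name is an assumption (a DistilBERT model fine-tuned on SST-2), since the page does not say which pre-trained model was used, and the helper names are hypothetical. The two pure-Python helpers measure how often the two approaches agree and collect the tweets where they differ, which is where sarcasm and ambiguity are most likely to show up.

```python
def transformer_labels(
    tweets,
    model_name="distilbert-base-uncased-finetuned-sst-2-english",  # assumed checkpoint
):
    """Label tweets with a pre-trained transformer sentiment classifier.

    Requires `pip install transformers` plus a backend such as PyTorch.
    This SST-2 checkpoint only predicts POSITIVE or NEGATIVE, so there is
    no neutral class, unlike the VADER baseline.
    """
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis", model=model_name)
    # Each result is a dict with 'label' and 'score' keys.
    return [result["label"].lower() for result in classifier(list(tweets))]


def agreement_rate(labels_a, labels_b):
    """Fraction of positions where the two label sequences match."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    if not labels_a:
        return 0.0
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def disagreements(tweets, labels_a, labels_b):
    """Return (tweet, label_a, label_b) for every tweet the models label differently."""
    return [
        (tweet, a, b)
        for tweet, a, b in zip(tweets, labels_a, labels_b)
        if a != b
    ]
```

Reading through the output of `disagreements` by hand is a simple way to check whether the transformer model really handles context and sarcasm better than the lexicon baseline on a given sample.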
