College
College of Engineering and Polymer Science
Date of Last Revision
2025-12-11 06:49:18
Major
Computer Science
Honors Course
CPSC 498
Number of Credits
3
Degree Name
Bachelor of Science in Computer Science
Date of Expected Graduation
Fall 2025
Abstract
This project applies principal component analysis (PCA) to sentiment analysis in order to identify complex emotional responses in plain text. Existing sentiment analysis tools often rely on large language models or struggle to achieve high accuracy when processing large collections of short inputs, such as social media comments. By contrast, this project uses PCA as a lightweight, mathematically grounded alternative that scales efficiently while still capturing meaningful emotional structure in text data.
PCA is effective for text analysis, particularly when supported by a sufficiently large dataset and a robust preprocessing pipeline. To create consistent, information-rich input vectors, this project implements a preprocessing pipeline consisting of a custom CSV reader, text parser, tokenizer, lemmatizer, part-of-speech tagger, and negation handler. Each component contributes to converting raw sentences into emotion-vector representations.
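As a rough illustration of how such a pipeline might turn a sentence into an emotion vector, the sketch below covers only tokenization and negation handling. It is not the project's implementation: the emotion labels, the toy `LEXICON`, the `NEGATORS` set, and the sign-flip negation rule are all assumptions made for the example, and the lemmatizer and part-of-speech tagger are omitted.

```python
import re

# Hypothetical emotion dimensions; the project's actual labels are not given.
EMOTIONS = ("joy", "sadness", "anger", "fear")

# Hypothetical toy lexicon mapping a word to one score per emotion.
LEXICON = {
    "happy": (1.0, 0.0, 0.0, 0.0),
    "love": (1.0, 0.0, 0.0, 0.0),
    "sad": (0.0, 1.0, 0.0, 0.0),
    "angry": (0.0, 0.0, 1.0, 0.0),
    "afraid": (0.0, 0.0, 0.0, 1.0),
}

# Hypothetical negation cues.
NEGATORS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def emotion_vector(text):
    """Sum lexicon scores over the tokens, flipping the sign of the
    first lexicon word that follows a negation cue."""
    total = [0.0] * len(EMOTIONS)
    negate = False
    for token in tokenize(text):
        if token in NEGATORS:
            negate = True  # stays active until the next lexicon word
            continue
        scores = LEXICON.get(token)
        if scores is not None:
            sign = -1.0 if negate else 1.0
            total = [t + sign * s for t, s in zip(total, scores)]
            negate = False
    return total
```

Under these assumptions, `emotion_vector("I am not happy")` yields a negative joy score, while words missing from the lexicon contribute nothing.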
Once transformed, these vectors are combined into an emotion matrix, which PCA can then evaluate to identify the dominant emotional dimensions in the text. This final PCA stage extracts the primary emotional loadings, allowing the tool to determine the emotional meaning of input text with high accuracy while avoiding many of the limitations common in current sentiment analysis systems.
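The PCA stage described above can be sketched in a few lines. The example below is a minimal illustration, not the project's code: it assumes the emotion matrix is a list of rows with one column per emotion, and it finds only the first principal component by power iteration on the sample covariance matrix. The function name, the start vector, and the iteration count are choices made for this sketch.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, explained_variance_ratio) for the first
    principal component of a matrix given as a list of equal-length rows."""
    n = len(rows)
    d = len(rows[0])

    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration. The start vector is deliberately non-uniform; a start
    # vector exactly orthogonal to the top component would not converge to it.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance along the current direction
        v = [x / norm for x in w]

    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    total_variance = sum(cov[i][i] for i in range(d))
    ratio = eigenvalue / total_variance if total_variance else 0.0
    return v, ratio
```

The returned loadings show how strongly each emotion column contributes to the dominant dimension. The overall sign of the loading vector is arbitrary, so only the relative signs and magnitudes are meaningful. A full analysis would typically extract several components, for example with an eigendecomposition or SVD from a numerical library.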
Research Sponsor
Dr. Zhong-Hui Duan
First Reader
Dr. En Cheng
Second Reader
Dr. John C. Hoag
Honors Faculty Advisor
Dr. Zhong-Hui Duan
Proprietary and/or Confidential Information
No
Recommended Citation
Gegick, Luke, "PCA Text Sentiment Analysis Tool" (2025). Williams Honors College, Honors Research Projects. 2064.
https://ideaexchange.uakron.edu/honors_research_projects/2064