Enhancing Stock Market Prediction by Tackling Class Imbalance in Sentiment Analysis

Authors

  • Wenhao Song

DOI:

https://doi.org/10.61173/n7s8q369

Keywords:

Sentiment Analysis, SMOTE, LGBM, Random Forest, Stock Market Prediction

Abstract

This study addresses the challenge of forecasting stock market movements through sentiment analysis of financial news headlines. Predicting the Dow Jones Industrial Average (DJIA) matters to investors and financial institutions because market trends are often influenced by public sentiment and rapidly evolving news cycles. Traditional quantitative models often struggle to capture the nuanced impact of textual information, especially under imbalanced data distributions in which rare but impactful market events are underrepresented. This paper constructs a sentiment analysis framework that uses natural language processing methods to classify the sentiment of publicly available news headlines and examines their relationship with subsequent DJIA price fluctuations. By integrating headline sentiment scores with historical price data, the experiments systematically evaluate the predictive reliability of several machine learning models, with a particular focus on class imbalance mitigation. The results show that LightGBM models with class weighting outperform those using conventional resampling techniques, achieving a recall of 0.45 for downturn prediction versus 0.12 for baseline methods. These findings highlight the value of algorithmic enhancements for rare-event forecasting and suggest that richer, domain-specific representations may further improve predictive accuracy. Future research will explore enhanced features and time-series modeling to improve the robustness of financial sentiment analysis.
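
The two quantities the abstract relies on, class weights and per-class recall, can be illustrated with a minimal sketch. It assumes the common "balanced" heuristic, weight = n_samples / (n_classes * class_count), which is the rule scikit-learn-style estimators (including LightGBM's `LGBMClassifier`) apply when `class_weight="balanced"` is set. The abstract does not state which weighting scheme the paper used, so this choice is an assumption. The labels below are toy values for illustration only, not data from the study, and the helper names `balanced_class_weights` and `recall` are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency:
    n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` samples that were predicted as `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy, made-up labels (NOT from the paper): 1 = up day, 0 = downturn (rare class).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0, 0, 1]

weights = balanced_class_weights(y_true)
downturn_recall = recall(y_true, y_pred, positive=0)

print(weights)          # the rare downturn class receives the larger weight
print(downturn_recall)  # share of true downturns the predictions caught
```

In this toy case the rare downturn class receives four times the weight of the majority class, so a weighted learner is penalized more heavily for missing a downturn. Recall on the downturn class, rather than overall accuracy, is the metric that reveals whether that penalty helped.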

Published

2025-12-19

Section

Articles