The Key Indicators Affecting the Salaries of NBA Players are Analyzed Based on Stepwise Multiple Linear Regression and Random Forest Model
DOI: https://doi.org/10.61173/0hpmcn17
Keywords: NBA players, stepwise multiple linear regression, random forest model
Abstract
Player salaries in the National Basketball Association (NBA), the world's top basketball league, differ widely, but the key factors behind these differences have not been fully clarified. Most existing studies focus on the linear relationship between salary and performance indicators and ignore nonlinear effects and factors such as commercial value and rookie contracts. This study uses NBA data from 2020 to 2025, excluding star players and rookies to reduce bias, and applies stepwise multiple linear regression (SMLR) and random forest (RF) models to explore the determinants of salary. After addressing collinearity among the variables, SMLR identified playing time (MIN), Player Impact Estimate (PIE), and usage percentage (USG%) as the main linear predictors of salary. The model's adjusted R² was 0.614, meaning it explained 61.4% of the variation in salary. The random forest model further revealed the influence of nonlinear factors such as age (AGE), which may be related to special contract provisions such as the Bird clause. Its test-set R² reached 0.664, and its prediction error was lower than that of SMLR, with the largest improvement in the medium and high salary ranges. The results indicate that a player's actual contribution (MIN, PIE) and tactical role (USG%) are the core drivers of salary, and that the random forest model is better at capturing complex relationships. This research gives team management a quantitative basis for evaluating player value and helps optimize roster construction.
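As an illustration of the SMLR step described above, the following is a minimal sketch of forward stepwise selection that uses adjusted R² as the entry criterion. It is not the authors' implementation: the selection criterion, the `min_gain` threshold, the synthetic data, the coefficients, and the filler columns `NOISE_A` and `NOISE_B` are all assumptions made for the example, and only the names MIN, PIE, and USG% come from the abstract.

```python
import numpy as np


def adjusted_r2(X, y):
    """Fit OLS with an intercept and return the adjusted R^2."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares fit
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    # Penalize R^2 for the number of predictors p.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def forward_stepwise(X, y, names, min_gain=1e-3):
    """Greedily add the predictor that most improves adjusted R^2.

    Stops when the best candidate improves adjusted R^2 by less than
    min_gain (a hypothetical threshold chosen for this sketch).
    """
    selected, best = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [(adjusted_r2(X[:, selected + [j]], y), j) for j in remaining]
        score, j = max(scores)
        if score - best < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return [names[j] for j in selected], best


# Synthetic stand-in data (NOT the paper's dataset): three informative
# predictors and two pure-noise columns, all standardized.
rng = np.random.default_rng(0)
names = ["MIN", "PIE", "USG%", "NOISE_A", "NOISE_B"]
X = rng.normal(size=(300, len(names)))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(scale=2.0, size=300)

selected, score = forward_stepwise(X, y, names)
print("selected predictors:", selected)
print("adjusted R^2:", round(score, 3))
```

Stepwise procedures in practice often use p-values or an information criterion such as AIC instead of adjusted R², and a random forest fit would need a separate library; the sketch covers only the linear selection step.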