Traffic Congestion Level Prediction: A Comparative Study Based on Random Forest and XGBoost Models
DOI: https://doi.org/10.61173/adaqya36
Keywords: Random Forest, XGBoost, Traffic Congestion, Hyperparameter Tuning, Machine Learning
Abstract
With the acceleration of global urbanization, traffic congestion has become increasingly severe, with a significant impact on economic development, environmental governance, and the management of residents. To help alleviate congestion, this study compares the performance of random forest and XGBoost models in predicting traffic congestion levels. In the data preprocessing stage, key features such as traffic speed and energy consumption were cleaned and encoded, and interaction variables were constructed through feature engineering to improve the models' ability to capture traffic patterns. In the training stage, several hyperparameter optimization methods (GridSearchCV, RandomizedSearchCV, Optuna, and BayesSearchCV) were applied to tune each model, and their results were compared. The experimental results show that the optimized models improve substantially on the untuned baseline. For random forest with BayesSearchCV, accuracy increased from 79.92% to 97.52%, with an F1-macro score of 0.9664 and an F1-weighted score of 0.9753. For XGBoost with Optuna, accuracy increased from 78.56% to 97.6%, with an F1-macro score of 0.9719 and an F1-weighted score of 0.9761. These results further demonstrate the potential of machine learning methods for traffic prediction and indicate a feasible direction for building smart city transportation systems.
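The two F1 variants reported above differ only in how per-class scores are averaged: F1-macro gives every congestion level equal weight, while F1-weighted weights each class by its number of true samples. The following is a minimal sketch of that distinction in plain Python. The function name `f1_scores`, the class labels, and the toy predictions are illustrative assumptions, not the study's data or code.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, weighted_f1) for multi-class labels."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    pairs = list(zip(y_true, y_pred))
    per_class = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        per_class[label] = 2 * tp / denom if denom else 0.0
    # Macro: unweighted mean over classes
    macro = sum(per_class.values()) / len(labels)
    # Weighted: mean over classes, weighted by class support
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    return macro, weighted


# Hypothetical, imbalanced toy labels (not taken from the study)
y_true = ["low", "low", "low", "low", "medium", "medium", "high"]
y_pred = ["low", "low", "low", "medium", "medium", "medium", "low"]

macro, weighted = f1_scores(y_true, y_pred)
print(f"F1-macro:    {macro:.4f}")     # 0.5167
print(f"F1-weighted: {weighted:.4f}")  # 0.6571
```

In this toy case the rare "high" class is never predicted correctly, which pulls F1-macro well below F1-weighted. When the two scores are close, as in the results reported above, it suggests that performance is fairly even across classes.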