Incorporating Macroeconomic Indicators to Improve P2P Lending Credit Risk Modeling: A Machine Learning-Based Heterogeneity Analysis
DOI: https://doi.org/10.61173/rjf59897
Keywords: P2P lending, Credit risk, Macroeconomic indicators, Machine learning, Heterogeneity analysis
Abstract
This study investigates how much macroeconomic indicators improve credit risk modeling in the P2P lending market, focusing on near-prime borrowers and short-term loans. Using 171,644 loans issued on the Lending Club platform during 2012-2013, the paper compares four machine learning algorithms (logistic regression, random forest, XGBoost, and LightGBM) under a baseline feature set (borrower and loan characteristics only) and an enhanced feature set (which adds regional macroeconomic indicators). The empirical results show that: (1) logistic regression achieves the best balance between performance and stability after macroeconomic indicators are added, with AUC improving by 1.08% to 0.6854 and the KS statistic increasing by 3.82%, while the overfitting degree stays at 1.0%; (2) the performance improvement is significantly heterogeneous, with 36-month short-term loans achieving a 1.41% AUC improvement and near-prime borrowers (FICO 620-680) showing 2-3 times higher sensitivity to unemployment rate fluctuations than other groups; (3) SHAP analysis indicates that macroeconomic indicators contribute 3%-5% of predictive power, and that interaction terms between macro and baseline features significantly outperform single macro indicators. The study provides empirical support for P2P platforms seeking to build group-specific risk pricing models.
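The two headline metrics in the abstract, AUC and the KS statistic, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy labels, and the two score lists are hypothetical, and the definitions used are the standard ones (AUC as the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter, KS as the largest gap between the two groups' cumulative score distributions). The pairwise loop is written for readability rather than speed.

```python
def auc(labels, scores):
    """AUC: share of (defaulter, non-defaulter) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ks_statistic(labels, scores):
    """KS: maximum absolute gap between the empirical score CDFs of the two classes."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


if __name__ == "__main__":
    # Hypothetical toy data: 1 = default, 0 = repaid. Not taken from the study.
    y = [0, 0, 1, 0, 1, 1, 0, 1]
    baseline_scores = [0.20, 0.45, 0.40, 0.30, 0.55, 0.35, 0.50, 0.70]
    enhanced_scores = [0.15, 0.40, 0.50, 0.25, 0.60, 0.45, 0.42, 0.75]
    for name, s in [("baseline", baseline_scores), ("enhanced", enhanced_scores)]:
        print(name, "AUC:", auc(y, s), "KS:", ks_statistic(y, s))
```

In a comparison like the one described, both metrics would be computed on held-out loans for the baseline and the enhanced feature set, and the difference between the two would be reported.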