Rilwan Olaniyi Shanu1, Adam Folohunso Zubair2, Mukaila Alade Rahman3, Micheal Tokunbo Adenibuyan4
1,2 Lecturer, Department of Computer Science, Faculty of Computing and Information Technology, Lagos State University, Ojo, Lagos, Nigeria.
3 Professor, Department of Computer Science, Faculty of Computing and Information Technology, Lagos State University, Ojo, Lagos, Nigeria.
4 Lecturer, Department of Computer Science, College of Computing, Bells University of Technology, Ota, Ogun State, Nigeria.
Abstract
Credit risk evaluation is one of the most difficult responsibilities in today’s banking and financial services. Accurately predicting the probability of default (PD) of loan applicants is vital for financial organizations to manage risk exposure, optimize lending decisions and remain compliant with regulation. The aim of this study is to apply a holistic machine learning methodology to the credit scoring domain. The workflow follows a nine-step pipeline in which two classification models are constructed, trained, tested and compared using MATLAB’s Risk Management Toolbox. The base model is Logistic Regression and the challenger model is a Decision Tree. Both models are trained on a data set of 1,200 customer observations with nine predictor variables and one binary response variable. During the preprocessing stage, data binning is performed using the Monotone Adjacent Pooling Algorithm (MAPA), and both weight-based and impurity-based approaches are used to study the relevance of the predictor variables. Model performance is evaluated on three established metrics: Accuracy Ratio (AR), Area Under the Receiver Operating Characteristic Curve (AUROC) and the Kolmogorov-Smirnov (KS) statistic. The results indicate that, with default binning, the Decision Tree model outperforms the Logistic Regression model on all three measures (AR = 0.389, AUROC = 0.695 and KS = 0.297 vs AR = 0.325, AUROC = 0.663 and KS = 0.232). However, when the binning settings are changed to the Split criterion with the Gini index, the Logistic Regression model gives better results (AUROC = 0.71). This work also addresses predictor importance and hyperparameter tuning, among other topics. The results support the view that the choice of credit scoring model depends on the dataset and setup, and they underline the stringent validation protocols needed for ethical AI applications in finance.
Keywords: Credit scoring, Logistic regression, Decision tree, Probability of default, Machine learning, MATLAB implementation, Credit scorecard, Financial risk, Model validation
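As a rough illustration of the three validation metrics named in the abstract, the following Python sketch computes AUROC, AR and KS from predicted default probabilities. It is not the MATLAB toolbox code used in the study: the function names, the toy scores and the toy default flags are all hypothetical, and the toolbox may compute these quantities differently in detail. The sketch relies on the standard identity AR = 2 × AUROC − 1, with which the reported figures agree up to rounding (2 × 0.695 − 1 = 0.390 against the reported 0.389, and 2 × 0.663 − 1 = 0.326 against 0.325).

```python
# Illustrative sketch only: hypothetical helper names and toy data,
# not the MATLAB implementation or the data set used in the study.

def split_by_outcome(scores, defaults):
    """Separate predicted PDs into defaulters (1) and non-defaulters (0)."""
    pos = [s for s, d in zip(scores, defaults) if d == 1]
    neg = [s for s, d in zip(scores, defaults) if d == 0]
    return pos, neg

def auroc(scores, defaults):
    """AUROC as the probability that a defaulter receives a higher
    predicted PD than a non-defaulter (ties count as one half)."""
    pos, neg = split_by_outcome(scores, defaults)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy_ratio(scores, defaults):
    """Accuracy Ratio (Gini coefficient) via AR = 2 * AUROC - 1."""
    return 2.0 * auroc(scores, defaults) - 1.0

def ks_statistic(scores, defaults):
    """Largest gap between the empirical score distributions of
    defaulters and non-defaulters."""
    pos, neg = split_by_outcome(scores, defaults)
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(p <= t for p in pos) / len(pos)
        cdf_neg = sum(n <= t for n in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

# Toy example: eight hypothetical applicants.
pd_hat = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]   # predicted PDs
defaulted = [1, 1, 0, 1, 0, 0, 1, 0]                # 1 = default observed

print(f"AUROC = {auroc(pd_hat, defaulted):.3f}")          # AUROC = 0.750
print(f"AR    = {accuracy_ratio(pd_hat, defaulted):.3f}")  # AR    = 0.500
print(f"KS    = {ks_statistic(pd_hat, defaulted):.3f}")    # KS    = 0.500
```

The pairwise AUROC computation is quadratic in the number of observations, which is acceptable for a data set of about 1,200 rows; a rank-based formulation would be preferable for larger portfolios.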
