Machine learning algorithms for credit scoring data classification
Backend for performance demonstration of various ML algorithms
Our developers designed and built an R-based system that classifies loans as defaulted or non-defaulted. The system covers the following steps:
- data loading and data encoding for categorical data types,
- data normalization and pre-processing,
- data sampling,
- data classification and cross-validation for robust performance estimation.
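The pre-processing and validation steps listed above can be sketched as follows. This is a minimal illustration in plain Python, not the project's R code; the function names (`one_hot`, `min_max`, `k_fold_indices`) and the choice of one-hot encoding and min-max scaling are assumptions, since the source does not say which encoding or normalization scheme was used.

```python
def one_hot(values):
    """Encode a list of categorical values as 0/1 indicator rows."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def min_max(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def k_fold_indices(n_rows, k):
    """Yield (train, test) index lists for k-fold cross-validation."""
    # Row i goes to fold i % k, so fold sizes differ by at most one.
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test
```

Each fold serves as the test set exactly once, so averaging a metric over the folds gives a steadier performance estimate than a single train/test split. In practice the rows would be shuffled before the folds are assigned.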
The system supports three data sampling variants (oversampling, undersampling and bootstrap sampling) and six classification algorithms (KNN, SVM, logistic regression, stochastic gradient descent, decision tree and random forest).
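To make the three sampling variants concrete, here is a hedged sketch in plain Python (the original system is written in R). It assumes random oversampling and random undersampling that balance the classes; the source does not specify the exact procedures, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def _group_by_label(rows, labels):
    """Collect rows into one list per class label."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return groups


def oversample(rows, labels, rng):
    """Grow every class to the size of the largest one by drawing with replacement."""
    groups = _group_by_label(rows, labels)
    target = max(len(g) for g in groups.values())
    out = []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out.extend((row, label) for row in group + extra)
    return out


def undersample(rows, labels, rng):
    """Shrink every class to the size of the smallest one, drawing without replacement."""
    groups = _group_by_label(rows, labels)
    target = min(len(g) for g in groups.values())
    out = []
    for label, group in groups.items():
        out.extend((row, label) for row in rng.sample(group, target))
    return out


def bootstrap(rows, labels, rng):
    """Draw len(rows) (row, label) pairs with replacement from the full data set."""
    pairs = list(zip(rows, labels))
    return [rng.choice(pairs) for _ in pairs]
```

Passing a seeded generator such as `random.Random(0)` keeps the resampling reproducible. Resampling should be applied to the training folds only, so that duplicated rows never leak into the test data.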
After a run, the user receives a detailed report on classification performance, based on metrics such as MSE, the Kolmogorov-Smirnov statistic, and ROC curves.
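The metrics named above can be computed from the true labels (1 = defaulted, 0 = non-defaulted) and the predicted default probabilities. The following plain-Python sketch is illustrative only and is not the project's R code. It assumes the usual credit-scoring reading of the Kolmogorov-Smirnov statistic, namely the largest gap between the score distributions of the two classes; function names are hypothetical.

```python
def mse(y_true, y_prob):
    """Mean squared error between labels and predicted probabilities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def ks_statistic(y_true, y_prob):
    """Largest gap between the empirical score CDFs of the two classes."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    best = 0.0
    for threshold in sorted(set(y_prob)):
        cdf_pos = sum(p <= threshold for p in pos) / len(pos)
        cdf_neg = sum(p <= threshold for p in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


def roc_points(y_true, y_prob):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; a row is predicted positive if p >= threshold.
    for threshold in sorted(set(y_prob), reverse=True):
        tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
        fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points
```

A Kolmogorov-Smirnov value close to 1 means the scores separate defaulted from non-defaulted loans almost perfectly, while a value close to 0 means the two score distributions overlap. Both `ks_statistic` and `roc_points` assume that each class appears at least once in `y_true`.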
The system was deployed on an AWS server and connected to the web interface via an API. Both the API and the web interface were also developed by our web developers.
The main challenge in deploying the system on the server was the memory consumed by data processing. We overcame it by optimizing the code for memory and computational resource usage.