
Data bias, algorithmic discrimination and the fairness issues of individual credit accessibility

Shenggang Yang

Purpose

This study examines the impact of data bias and algorithmic discrimination on individual credit accessibility in China’s financial system. It aims to align financial inclusion and equity goals with statistical fairness conditions by constructing fairness metrics from multiple dimensions. The paper evaluates the fairness of commonly used credit evaluation models and proposes a novel approach to eliminate data bias in historical datasets.

Design/methodology/approach

We model credit evaluation using Logistic Regression, Random Forest, and XGBoost algorithms, focusing on education level and work experience as sensitive attributes. To mitigate data bias in historical datasets, we employ the Metropolis-Hastings (M-H) algorithm for data preprocessing.

Findings

(1) Machine learning models like Random Forest and XGBoost outperform traditional methods in addressing unfairness arising from multiple sensitive attributes. (2) Sensitive attributes, while excluded from credit scoring models, may indirectly influence outcomes through other indicators. Limiting the gap in credit accessibility between the general population and protected groups is essential for fairness of opportunity. (3) Data bias significantly affects credit ratings, increasing the false positive rate for certain demographic subgroups and reducing their credit accessibility.

Practical implications

The study provides a micro-level examination of individual credit accessibility and fairness in China. It analyzes the fairness of credit evaluation models used by Chinese financial institutions across different population groups and proposes an M-H algorithm-based method to eliminate data bias in historical datasets.

Originality/value

This paper enhances research on fairness in individual credit accessibility in China by introducing three fairness metrics for evaluating credit evaluation models. It offers a micro-level perspective for scholars studying related issues.
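
The abstract describes the fairness metrics only at a high level. As a concrete reference point, here is a minimal sketch (not the paper's code) of three group-fairness measures that match the quantities discussed in the findings: statistical parity, equal opportunity, and the false positive rate gap. It assumes binary labels (1 = creditworthy), binary decisions (1 = credit granted), and a boolean indicator for the protected subgroup; NumPy is the only dependency.

```python
import numpy as np

def group_fairness_metrics(y_true, y_pred, protected):
    """Compare binary credit decisions between the protected group and the rest.

    y_true, y_pred : 0/1 arrays (1 = creditworthy / credit granted)
    protected      : boolean array marking the protected subgroup
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    g = np.asarray(protected).astype(bool)
    o = ~g

    # Statistical parity: gap in overall approval rates.
    spd = y_pred[g].mean() - y_pred[o].mean()

    # Equal opportunity: gap in approval rates among truly creditworthy applicants.
    eod = y_pred[g & (y_true == 1)].mean() - y_pred[o & (y_true == 1)].mean()

    # False positive rate gap: the findings report higher FPRs for some subgroups.
    fpr_gap = y_pred[g & (y_true == 0)].mean() - y_pred[o & (y_true == 0)].mean()

    return {"statistical_parity_diff": spd,
            "equal_opportunity_diff": eod,
            "false_positive_rate_gap": fpr_gap}
```

All three return signed gaps (protected group minus the rest), so values near zero indicate parity under that criterion; the label conventions above are an assumption, since the summary does not state them.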

Executive Summary

This study examines the impact of data bias and algorithmic discrimination on individual credit accessibility in China's financial system. It evaluates the fairness of credit evaluation models and proposes a novel approach to eliminate data bias in historical datasets. The study finds that machine learning models can outperform traditional methods in addressing unfairness and that data bias significantly affects credit ratings. The research provides a micro-level examination of individual credit accessibility and fairness in China, offering a new perspective for scholars studying related issues.

Key Points

  • Data bias and algorithmic discrimination can significantly impact individual credit accessibility
  • Machine learning models can outperform traditional methods in addressing unfairness
  • Sensitive attributes can indirectly influence credit scoring outcomes even when excluded from models (see the proxy-check sketch after this list)
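
The third point can be made concrete with a standard proxy diagnostic, shown here as an illustration rather than the paper's procedure: try to predict the sensitive attribute from the remaining features. The sketch assumes scikit-learn and a binarized sensitive attribute (e.g., degree vs. no degree).

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def proxy_leakage_auc(X_without_sensitive, sensitive_attr, cv=5):
    """Cross-validated AUC for predicting the (binarized) sensitive attribute
    from the remaining features. An AUC well above 0.5 means other indicators
    act as proxies, so merely dropping the attribute does not remove its
    influence on model outcomes."""
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    return cross_val_score(clf, X_without_sensitive, sensitive_attr,
                           cv=cv, scoring="roc_auc").mean()
```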

Merits

Novel Approach to Data Bias Elimination

The study proposes a novel approach to eliminate data bias in historical datasets using the Metropolis-Hastings algorithm, which can help improve the fairness of credit evaluation models.
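
This summary does not specify how the M-H algorithm is applied to the historical data. One plausible reading, sketched below purely as an assumption, is to resample records toward a target distribution in which the credit label is independent of the sensitive attribute, down-weighting over-represented (label, group) combinations. The function name `mh_debias_resample` and the independence target are illustrative choices, not the authors'.

```python
import numpy as np

def mh_debias_resample(y, s, n_samples, seed=None):
    """Metropolis-Hastings resampling of row indices so that the label y
    becomes approximately independent of the sensitive attribute s.

    Target weight per row: P(y)P(s) / P(y, s), estimated from the data,
    which down-weights over-represented (y, s) combinations.
    """
    rng = np.random.default_rng(seed)
    y, s = np.asarray(y), np.asarray(s)
    n = len(y)

    # Empirical marginal and joint frequencies of (y, s).
    p_y = {v: np.mean(y == v) for v in np.unique(y)}
    p_s = {v: np.mean(s == v) for v in np.unique(s)}
    p_ys = {(a, b): np.mean((y == a) & (s == b))
            for a in np.unique(y) for b in np.unique(s)}
    w = np.array([p_y[a] * p_s[b] / p_ys[(a, b)] for a, b in zip(y, s)])

    idx = rng.integers(n)           # current state: one row index
    out = []
    for _ in range(n_samples):
        prop = rng.integers(n)      # symmetric proposal: uniform random row
        # Accept with probability min(1, w[prop] / w[idx]).
        if rng.random() < w[prop] / w[idx]:
            idx = prop
        out.append(idx)
    return np.array(out)
```

Burn-in and thinning are omitted for brevity; the idea is that a model trained on the resampled rows sees approximately equal label rates across groups.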

Comprehensive Evaluation of Credit Evaluation Models

The study evaluates the fairness of commonly used credit evaluation models, providing a comprehensive understanding of their strengths and limitations.
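
The summary names the three model families but not the evaluation harness. A hypothetical harness under common assumptions (scikit-learn, xgboost, binary 0/1 labels), reusing `group_fairness_metrics` from the earlier sketch, might look like this:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

def compare_models(X, y, protected):
    """Fit each model family on the same split and report accuracy alongside
    group-fairness metrics. `protected` marks the sensitive subgroup and is
    kept out of the feature matrix X, used only for evaluation."""
    X_tr, X_te, y_tr, y_te, _, p_te = train_test_split(
        X, y, protected, test_size=0.3, random_state=0, stratify=y)
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
        "xgboost": XGBClassifier(n_estimators=300, eval_metric="logloss"),
    }
    report = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        y_hat = model.predict(X_te)
        report[name] = {"accuracy": float((y_hat == y_te).mean()),
                        **group_fairness_metrics(y_te, y_hat, p_te)}
    return report
```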

Demerits

Limited Generalizability

Because the research focuses on China, the study's findings may not generalize to other countries or financial systems.

Lack of Transparency in Model Development

The study could benefit from more transparency in the development and validation of the proposed approach to eliminate data bias.

Expert Commentary

This study provides a timely and important contribution to the literature on fairness in credit evaluation models. The use of machine learning models, together with the Metropolis-Hastings algorithm for eliminating data bias, is a notable strength of the research. However, the study's limited generalizability and the lack of transparency in model development remain open issues for future work. The findings have significant implications for financial inclusion and regulatory frameworks, highlighting the need for fairer and more transparent credit evaluation models.

Recommendations

  • Future research should aim to develop more transparent and generalizable approaches to eliminating data bias in credit evaluation models
  • Regulators and financial institutions should work together to develop and implement more fair and transparent credit evaluation models
