ROBUSTNESS AND FAIRNESS EVALUATION OF MACHINE LEARNING MODELS IN HIGH-STAKES DECISION SYSTEMS

Authors

Nguyen Leblanc Chen
Research Scientist – Ethical AI, USA.

Keywords:

Machine Learning, Fairness, Robustness, High-Stakes Decision Systems, Evaluation Metrics, Algorithmic Bias

Synopsis

Purpose: This paper examines the importance of evaluating both robustness and fairness in machine learning (ML) models deployed in high-stakes decision environments such as healthcare, criminal justice, and finance.

Design/methodology/approach: We review theoretical and empirical studies on robustness and fairness, categorize key evaluation methods, and compare metrics and testing frameworks.

Findings: Results indicate that while many fairness techniques reduce disparate impact, robustness to distributional shifts and adversarial perturbations remains a critical gap in current evaluations. Fairness interventions can inadvertently affect robustness, highlighting the need for joint evaluation frameworks.

Practical implications: The paper offers guidance for practitioners on selecting appropriate evaluation metrics and testing protocols for high-stakes ML deployments.

Originality/value: This work systematically contrasts robustness and fairness assessments, highlighting trade-offs and proposing integrated evaluation recommendations.
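To make the joint-evaluation idea concrete, the sketch below computes one common fairness metric (the disparate impact ratio, i.e., the ratio of positive-decision rates across groups) alongside a simple robustness proxy (the fraction of decisions that flip under a small score perturbation) for a toy threshold classifier. This is an illustrative assumption on our part, not a method from the paper; the function names, the 0.5 decision threshold, and the eps perturbation size are all hypothetical choices.

```python
def predict(scores, threshold=0.5):
    """Binary decisions from model scores."""
    return [1 if s >= threshold else 0 for s in scores]

def disparate_impact_ratio(preds, groups):
    """Ratio of positive-decision rates: lowest-rate group over
    highest-rate group. Values near 1.0 indicate parity; the common
    'four-fifths rule' flags ratios below 0.8."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        rates[g] = sum(preds[i] for i in idx) / len(idx)
    return min(rates.values()) / max(rates.values())

def flip_rate(scores, eps=0.05, threshold=0.5):
    """Fraction of decisions that flip under a worst-case +/- eps
    perturbation of the score: a crude proxy for local robustness."""
    base = predict(scores, threshold)
    flipped = 0
    for s, b in zip(scores, base):
        lo = 1 if s - eps >= threshold else 0
        hi = 1 if s + eps >= threshold else 0
        if lo != b or hi != b:
            flipped += 1
    return flipped / len(scores)

# Toy data: model scores and a binary group attribute.
scores = [0.9, 0.52, 0.48, 0.2, 0.7, 0.55, 0.45, 0.3]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
preds = predict(scores)
print("disparate impact ratio:", disparate_impact_ratio(preds, groups))
print("decision flip rate:    ", flip_rate(scores, eps=0.05))
```

Reporting the two numbers side by side is the point: a fairness intervention that shifts the decision threshold can look good on the parity metric while moving more individuals close to the boundary, which shows up as a higher flip rate.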



Published

February 18, 2025