Fair Machine Learning in Real-World Healthcare Data

This paper, titled “A scoping review of fair machine learning techniques when using real-world data” by Huang et al. (2024), provides a comprehensive overview of fair machine learning (ML) in healthcare, specifically when utilizing real-world data (RWD). Published in the Journal of Biomedical Informatics, the review addresses the growing concerns about algorithmic bias and fairness as artificial intelligence (AI) and ML increasingly aid clinical decisions.

Objective and Importance

The primary objective of this scoping review was to summarize existing literature and identify gaps in tackling algorithmic bias and optimizing fairness in AI/ML models that use RWD in healthcare domains. The authors highlight that AI tools can have a disparate impact, distributing benefits and drawbacks unevenly across societal groups and subpopulations and potentially exacerbating existing health inequities. Previous research has identified examples of such bias in healthcare, including healthcare cost prediction, atherosclerotic disease detection, heart failure outcomes, and postpartum mental health service utilization across racial and socioeconomic groups. This review fills the gap of a systematic synthesis covering both ML fairness assessment and bias mitigation strategies in the context of RWD in healthcare.

Methodology

The review followed a two-round process: initially evaluating existing relevant review articles, and then assessing individual studies. Researchers conducted comprehensive literature searches in mainstream computer science databases (IEEE Xplore, ACM Digital Library, Web of Science) for review articles, and in biomedical databases (PubMed, Embase) for individual studies. The selection criteria for individual studies focused on peer-reviewed original research published within the last decade (2012–2022) in English, specifically on mitigating bias issues in ML models using RWD. The definition of RWD adhered to FDA guidelines, focusing on data generated from routine healthcare, such as Electronic Health Records (EHRs), administrative claims, and billing data, while excluding data from experimental clinical research or personal devices.

Key Findings

  • Scope of Research: The review identified 35 general review articles on fair ML and 11 distinct studies specifically focusing on fair ML in healthcare applications using RWD.
  • Fairness Assessment Metrics: The paper summarizes ten metrics used to evaluate algorithmic fairness in healthcare, including:
    • Overall accuracy equality
    • Equality of opportunity
    • Predictive parity
    • Predictive equality
    • Statistical or demographic parity
    • Disparate impact
    • Equalized Odds
    • Intergroup standard deviation (IGSD)
    • Conjunctive accuracy improvement (CAIα)
    • Generalized entropy index (GEI)
    Most of the identified studies focused primarily on group fairness metrics.
  • Algorithmic Bias Mitigation Techniques: These techniques are categorized into three groups:
    • Pre-processing: This was the most commonly used approach (82% of studies identified). Examples include reweighing to modify protected attribute weights, addressing missing data impacts, eliminating sensitive attributes and resampling, integrating social determinants of health (SDoH), and using multisource data.
    • Post-processing: Less frequently used, with examples like multi-calibration and post-hoc recalibration to improve model fairness.
    • In-processing: Also less common, with one notable example being a novel debiasing approach called Training and Representation Alteration (TARA) using generative models and adversarial learning.
  • Real-World Data Characteristics: The review notes that the existing datasets used exhibit potential biases due to their specific geographical locations, contexts, and populations represented (e.g., racial demographics, age groups, socioeconomic status). Examples of datasets include MIMIC III, Clalit Health Services (CHS), and the IBM MarketScan Medicaid Database.
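Several of the group-fairness metrics listed above can be computed directly from a model's predictions. The following is a minimal sketch, using purely hypothetical labels and predictions (not data from the reviewed studies), of statistical parity, disparate impact, equalized odds, and equality of opportunity for a binary classifier across two groups:

```python
# Group-fairness metrics for a binary classifier, computed per group.
# All labels/predictions below are hypothetical illustrations.

def rates(y_true, y_pred):
    """Return (positive-prediction rate, TPR, FPR) for one group."""
    n = len(y_true)
    pos_rate = sum(y_pred) / n
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    actual_pos = sum(y_true)
    actual_neg = n - actual_pos
    tpr = tp / actual_pos if actual_pos else 0.0
    fpr = fp / actual_neg if actual_neg else 0.0
    return pos_rate, tpr, fpr

# Hypothetical labels and predictions for two protected groups A and B.
y_true_a = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred_a = [1, 0, 1, 0, 0, 1, 1, 0]
y_true_b = [1, 0, 0, 1, 0, 0, 0, 0]
y_pred_b = [0, 0, 0, 1, 0, 1, 0, 0]

pr_a, tpr_a, fpr_a = rates(y_true_a, y_pred_a)
pr_b, tpr_b, fpr_b = rates(y_true_b, y_pred_b)

# Statistical (demographic) parity: gap in positive-prediction rates.
statistical_parity_gap = abs(pr_a - pr_b)
# Disparate impact: ratio of positive-prediction rates
# (often checked against the 0.8 "four-fifths" rule).
disparate_impact = min(pr_a, pr_b) / max(pr_a, pr_b)
# Equalized odds: both TPR and FPR should match across groups.
equalized_odds_gap = max(abs(tpr_a - tpr_b), abs(fpr_a - fpr_b))
# Equality of opportunity: only the TPRs need to match.
opportunity_gap = abs(tpr_a - tpr_b)

print(statistical_parity_gap, disparate_impact,
      equalized_odds_gap, opportunity_gap)
```

In practice these quantities are usually computed by a fairness toolkit rather than by hand; the sketch is only meant to make the definitions concrete.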
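Among the pre-processing techniques, reweighing is the most easily illustrated. A minimal sketch in the Kamiran-and-Calders style, on hypothetical data (not from the reviewed studies): each (group, label) combination receives the weight w(a, y) = P(A=a)·P(Y=y) / P(A=a, Y=y), so that after weighting the protected attribute and the outcome are statistically independent in the training set:

```python
# Reweighing as a pre-processing bias mitigation step:
# under-represented (group, label) combinations get weights above 1,
# over-represented ones fall below 1. Hypothetical data throughout.

from collections import Counter

def reweigh(groups, labels):
    n = len(labels)
    p_group = Counter(groups)          # counts per protected group
    p_label = Counter(labels)          # counts per outcome label
    p_joint = Counter(zip(groups, labels))  # counts per (group, label) pair
    # Weight per sample: expected frequency under independence
    # divided by the observed joint frequency.
    return [
        (p_group[a] / n) * (p_label[y] / n) / (p_joint[(a, y)] / n)
        for a, y in zip(groups, labels)
    ]

# Hypothetical data: group "A" receives positive labels more often than "B".
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

weights = reweigh(groups, labels)
# The rare combinations (A, 0) and (B, 1) are up-weighted to 2.0;
# the common combinations (A, 1) and (B, 0) are down-weighted to 2/3.
```

The resulting weights would then be passed to any learner that accepts per-sample weights, leaving the model itself unchanged, which is why reweighing counts as pre-processing.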

Gaps and Future Directions

The authors highlight several crucial areas requiring further research:

  • Limited Application Diversity: Although existing work illustrates the potential of fair ML, the range of healthcare applications covered remains limited and needs to be broadened to a wider array of diseases and health-related concerns.
  • Complexities of Fairness Metrics: Achieving different fairness metrics simultaneously is challenging due to inherent trade-offs, underscoring the importance of context-based metric selection. The review suggests that biases are not solely technical issues but often stem from underlying biological and socioeconomic factors (e.g., SDoH).
  • Comparative Evaluation of Mitigation Techniques: There is a lack of evidence demonstrating the superiority of pre-processing over in-processing or post-processing methods, and comprehensive comparative studies are needed.
  • Multi-modality and Unstructured Data: Existing studies predominantly address single-modality data; techniques are needed to evaluate and mitigate bias in multi-modality data and in unstructured data such as clinical narratives, including pipelines built on large language models (LLMs).
  • Model Interpretability: Understanding model interpretation through Explainable AI (XAI) techniques (e.g., SHAP, attention mechanisms) is crucial for identifying bias causes and informing more equitable clinical decision-making.
  • Data Governance and Access: Addressing biases requires attention to data collection, governance, and access, as current health data often comes from individual medical centers that may not represent the national population. Nationwide or international data-sharing initiatives and novel approaches like federated learning are essential to enhance access to representative data.
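The trade-off between fairness metrics noted above can be shown with a toy example. In the hypothetical setup below (not data from the reviewed studies), two groups have different base rates of the outcome; then even a perfect classifier, which satisfies equalized odds exactly, must violate statistical parity:

```python
# When base rates differ across groups, a perfect classifier
# (predictions identical to the true labels) still violates
# statistical parity, illustrating why fairness metrics cannot all
# be satisfied at once and must be chosen per clinical context.

y_true_a = [1, 1, 1, 1, 0, 0, 0, 0]   # base rate 0.50 in group A
y_true_b = [1, 1, 0, 0, 0, 0, 0, 0]   # base rate 0.25 in group B

y_pred_a = list(y_true_a)             # a "perfect" model copies the labels
y_pred_b = list(y_true_b)

# Equalized odds holds trivially: TPR = 1 and FPR = 0 in both groups.
tpr_a = sum(p for t, p in zip(y_true_a, y_pred_a) if t == 1) / sum(y_true_a)
tpr_b = sum(p for t, p in zip(y_true_b, y_pred_b) if t == 1) / sum(y_true_b)

# But statistical parity fails: positive-prediction rates track base rates.
parity_gap = abs(sum(y_pred_a) / len(y_pred_a)
                 - sum(y_pred_b) / len(y_pred_b))
print(tpr_a, tpr_b, parity_gap)
```

Which gap matters more depends on the application, which is exactly the context-based metric selection the review calls for.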

Conclusion

The paper concludes that fair AI/ML in healthcare is a burgeoning field that needs heightened research focus to cover diverse applications and various types of RWD. It offers valuable reference material and insights for researchers, emphasizing that fair ML can lead to equitable, precise, and patient-centered healthcare practices by ensuring unbiased clinical decision support and fostering trust among all stakeholders.

Reference: Huang, Y., Guo, J., Chen, W.-H., Lin, H.-Y., Tang, H., Wang, F., Xu, H., & Bian, J. (2024). A scoping review of fair machine learning techniques when using real-world data. Journal of Biomedical Informatics, 151, 104622. https://doi.org/10.1016/j.jbi.2024.104622

Podcast Link

https://notebooklm.google.com/notebook/03033438-63a2-431b-86b5-2576aa8c6c1f/audio
