Fair Machine Learning in Real-World Healthcare Data

This paper, titled “A scoping review of fair machine learning techniques when using real-world data” by Huang et al. (2024), provides a comprehensive overview of fair machine learning (ML) in healthcare, specifically when utilizing real-world data (RWD). Published in the Journal of Biomedical Informatics, the review addresses the growing concerns about algorithmic bias and fairness as artificial intelligence (AI) and ML increasingly aid clinical decisions.

Objective and Importance

The primary objective of this scoping review was to summarize existing literature and identify gaps in tackling algorithmic bias and optimizing fairness in AI/ML models that use RWD in healthcare domains. The authors highlight that AI tools can have a disparate impact, distributing benefits and drawbacks unevenly across societal groups and subpopulations and potentially exacerbating existing health inequities. Previous research has identified examples of such bias in healthcare, including healthcare cost prediction, atherosclerotic disease detection, heart failure outcomes, and postpartum mental health service utilization across racial and socioeconomic groups. This review fills the gap of a systematic synthesis covering both ML fairness assessment and bias mitigation strategies in the context of RWD in healthcare.

Methodology

The review followed a two-round process: initially evaluating existing relevant review articles, and then assessing individual studies. Researchers conducted comprehensive literature searches in mainstream computer science databases (IEEE Xplore, ACM Digital Library, Web of Science) for review articles, and in biomedical databases (PubMed, Embase) for individual studies. The selection criteria for individual studies focused on peer-reviewed original research published within the last decade (2012–2022) in English, specifically on mitigating bias issues in ML models using RWD. The definition of RWD adhered to FDA guidelines, focusing on data generated from routine healthcare, such as Electronic Health Records (EHRs), administrative claims, and billing data, while excluding data from experimental clinical research or personal devices.

Key Findings

  • Scope of Research: The review identified 35 general review articles on fair ML and 11 distinct studies specifically focusing on fair ML in healthcare applications using RWD.
  • Fairness Assessment Metrics: The paper summarizes ten metrics used to evaluate algorithmic fairness in healthcare, including:
    • Overall accuracy equality
    • Equality of opportunity
    • Predictive parity
    • Predictive equality
    • Statistical or demographic parity
    • Disparate impact
    • Equalized Odds
    • Intergroup standard deviation (IGSD)
    • Conjunctive accuracy improvement (CAIα)
    • Generalized entropy index (GEI)
    Most of the identified studies focused primarily on group fairness metrics.
  • Algorithmic Bias Mitigation Techniques: These techniques are categorized into three groups:
    • Pre-processing: This was the most commonly used approach (82% of studies identified). Examples include reweighing to modify protected attribute weights, addressing missing data impacts, eliminating sensitive attributes and resampling, integrating social determinants of health (SDoH), and using multisource data.
    • Post-processing: Less frequently used, with examples like multi-calibration and post-hoc recalibration to improve model fairness.
    • In-processing: Also less common, with one notable example being a novel debiasing approach called Training and Representation Alteration (TARA) using generative models and adversarial learning.
  • Real-World Data Characteristics: The review notes that the existing datasets used exhibit potential biases due to their specific geographical locations, contexts, and populations represented (e.g., racial demographics, age groups, socioeconomic status). Examples of datasets include MIMIC III, Clalit Health Services (CHS), and the IBM MarketScan Medicaid Database.
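Several of the group-fairness metrics listed above can be computed directly from a model's predictions. The following is a minimal sketch, using purely hypothetical labels and predictions (not data from the reviewed studies), of statistical parity, disparate impact, equalized odds, and equality of opportunity for a binary classifier across two groups:

```python
# Group-fairness metrics for a binary classifier, computed per group.
# All labels/predictions below are hypothetical illustrations.

def rates(y_true, y_pred):
    """Return (positive-prediction rate, TPR, FPR) for one group."""
    n = len(y_true)
    pos_rate = sum(y_pred) / n
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    actual_pos = sum(y_true)
    actual_neg = n - actual_pos
    tpr = tp / actual_pos if actual_pos else 0.0
    fpr = fp / actual_neg if actual_neg else 0.0
    return pos_rate, tpr, fpr

# Hypothetical labels and predictions for two protected groups A and B.
y_true_a = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred_a = [1, 0, 1, 0, 0, 1, 1, 0]
y_true_b = [1, 0, 0, 1, 0, 0, 0, 0]
y_pred_b = [0, 0, 0, 1, 0, 1, 0, 0]

pr_a, tpr_a, fpr_a = rates(y_true_a, y_pred_a)
pr_b, tpr_b, fpr_b = rates(y_true_b, y_pred_b)

# Statistical (demographic) parity: gap in positive-prediction rates.
statistical_parity_gap = abs(pr_a - pr_b)
# Disparate impact: ratio of positive-prediction rates
# (often checked against the 0.8 "four-fifths" rule).
disparate_impact = min(pr_a, pr_b) / max(pr_a, pr_b)
# Equalized odds: both TPR and FPR should match across groups.
equalized_odds_gap = max(abs(tpr_a - tpr_b), abs(fpr_a - fpr_b))
# Equality of opportunity: only the TPRs need to match.
opportunity_gap = abs(tpr_a - tpr_b)

print(statistical_parity_gap, disparate_impact,
      equalized_odds_gap, opportunity_gap)
```

In practice these quantities are usually computed by a fairness toolkit rather than by hand; the sketch is only meant to make the definitions concrete.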
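Among the pre-processing techniques, reweighing is the most easily illustrated. A minimal sketch in the Kamiran-and-Calders style, on hypothetical data (not from the reviewed studies): each (group, label) combination receives the weight w(a, y) = P(A=a)·P(Y=y) / P(A=a, Y=y), so that after weighting the protected attribute and the outcome are statistically independent in the training set:

```python
# Reweighing as a pre-processing bias mitigation step:
# under-represented (group, label) combinations get weights above 1,
# over-represented ones fall below 1. Hypothetical data throughout.

from collections import Counter

def reweigh(groups, labels):
    n = len(labels)
    p_group = Counter(groups)          # counts per protected group
    p_label = Counter(labels)          # counts per outcome label
    p_joint = Counter(zip(groups, labels))  # counts per (group, label) pair
    # Weight per sample: expected frequency under independence
    # divided by the observed joint frequency.
    return [
        (p_group[a] / n) * (p_label[y] / n) / (p_joint[(a, y)] / n)
        for a, y in zip(groups, labels)
    ]

# Hypothetical data: group "A" receives positive labels more often than "B".
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

weights = reweigh(groups, labels)
# The rare combinations (A, 0) and (B, 1) are up-weighted to 2.0;
# the common combinations (A, 1) and (B, 0) are down-weighted to 2/3.
```

The resulting weights would then be passed to any learner that accepts per-sample weights, leaving the model itself unchanged, which is why reweighing counts as pre-processing.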

Gaps and Future Directions

The authors highlight several crucial areas requiring further research:

  • Limited Application Diversity: Although existing work illustrates the potential of fair ML, the range of healthcare applications covered remains limited and needs to be broadened to a wider array of diseases and health-related concerns.
  • Complexities of Fairness Metrics: Achieving different fairness metrics simultaneously is challenging due to inherent trade-offs, underscoring the importance of context-based metric selection. The review suggests that biases are not solely technical issues but often stem from underlying biological and socioeconomic factors (e.g., SDoH).
  • Comparative Evaluation of Mitigation Techniques: There is a lack of evidence demonstrating the superiority of pre-processing over in-processing or post-processing methods, and comprehensive comparative studies are needed.
  • Multi-modality and Unstructured Data: Existing studies predominantly address single-modality data; techniques are needed to evaluate and mitigate bias in multi-modality data and in unstructured data such as clinical narratives, including pipelines built on large language models (LLMs).
  • Model Interpretability: Understanding model interpretation through Explainable AI (XAI) techniques (e.g., SHAP, attention mechanisms) is crucial for identifying bias causes and informing more equitable clinical decision-making.
  • Data Governance and Access: Addressing biases requires attention to data collection, governance, and access, as current health data often comes from individual medical centers that may not represent the national population. Nationwide or international data-sharing initiatives and novel approaches like federated learning are essential to enhance access to representative data.
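The trade-off between fairness metrics noted above can be shown with a toy example. In the hypothetical setup below (not data from the reviewed studies), two groups have different base rates of the outcome; then even a perfect classifier, which satisfies equalized odds exactly, must violate statistical parity:

```python
# When base rates differ across groups, a perfect classifier
# (predictions identical to the true labels) still violates
# statistical parity, illustrating why fairness metrics cannot all
# be satisfied at once and must be chosen per clinical context.

y_true_a = [1, 1, 1, 1, 0, 0, 0, 0]   # base rate 0.50 in group A
y_true_b = [1, 1, 0, 0, 0, 0, 0, 0]   # base rate 0.25 in group B

y_pred_a = list(y_true_a)             # a "perfect" model copies the labels
y_pred_b = list(y_true_b)

# Equalized odds holds trivially: TPR = 1 and FPR = 0 in both groups.
tpr_a = sum(p for t, p in zip(y_true_a, y_pred_a) if t == 1) / sum(y_true_a)
tpr_b = sum(p for t, p in zip(y_true_b, y_pred_b) if t == 1) / sum(y_true_b)

# But statistical parity fails: positive-prediction rates track base rates.
parity_gap = abs(sum(y_pred_a) / len(y_pred_a)
                 - sum(y_pred_b) / len(y_pred_b))
print(tpr_a, tpr_b, parity_gap)
```

Which gap matters more depends on the application, which is exactly the context-based metric selection the review calls for.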

Conclusion

The paper concludes that fair AI/ML in healthcare is a burgeoning field that needs heightened research focus to cover diverse applications and various types of RWD. It offers valuable reference material and insights for researchers, emphasizing that fair ML can lead to equitable, precise, and patient-centered healthcare practices by ensuring unbiased clinical decision support and fostering trust among all stakeholders.

Reference: Huang, Y., Guo, J., Chen, W.-H., Lin, H.-Y., Tang, H., Wang, F., Xu, H., & Bian, J. (2024). A scoping review of fair machine learning techniques when using real-world data. Journal of Biomedical Informatics, 151, 104622. https://doi.org/10.1016/j.jbi.2024.104622

Podcast Link

https://notebooklm.google.com/notebook/03033438-63a2-431b-86b5-2576aa8c6c1f/audio
