The research by Li et al. (2025) addresses a critical challenge in healthcare: accurately predicting in-hospital Length of Stay (LOS). This metric, defined as the duration between a patient’s admission and discharge, is a straightforward and uniformly applicable indicator for assessing the quality and effectiveness of healthcare services across diverse systems. Precise LOS prediction is vital for hospital leaders and health policy decision-makers, as it directly supports decision-making processes and informs hospital operations, including patient flow management, planning for elective cases, and human resource allocation. Furthermore, analyzing LOS across different demographic and socioeconomic groups can reveal disparities in quality of care and access to services, contributing to efforts for improved health equity.
This study formulates LOS prediction as a survival analysis problem. Survival analysis is a family of statistical methods for predicting the time until a specific event of interest occurs. A key advantage is its ability to handle “censored instances,” where the event (e.g., discharge) is not fully observed during the study period. In this research, the primary event of interest is discharge to home, chosen because it signifies a successful patient outcome where inpatient care is no longer required. Other discharge categories, such as death or transfer to another facility, were treated as censored because they do not represent the ideal outcome for assessing quality and effectiveness.
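To make the censoring rule concrete, the following is a minimal sketch of how an admission might be turned into a (duration, event) pair for survival analysis. The `Admission` class, its field names, and the disposition labels "home", "death", and "transfer" are hypothetical and chosen for illustration; they are not taken from the GEMINI schema or the authors' code.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Admission:
    admitted: datetime
    discharged: datetime
    disposition: str  # hypothetical labels: "home", "death", "transfer"


def to_survival_record(adm: Admission) -> tuple[float, bool]:
    """Return (duration in days, event observed) for one admission.

    Only discharge to home counts as the event of interest; any other
    disposition is treated as a censored observation.
    """
    duration_days = (adm.discharged - adm.admitted).total_seconds() / 86400
    event_observed = adm.disposition == "home"
    return duration_days, event_observed


# Made-up admissions, for illustration only.
admissions = [
    Admission(datetime(2020, 1, 1, 8), datetime(2020, 1, 4, 8), "home"),
    Admission(datetime(2020, 1, 2, 12), datetime(2020, 1, 3, 0), "transfer"),
    Admission(datetime(2020, 1, 5, 0), datetime(2020, 1, 12, 0), "death"),
]
records = [to_survival_record(a) for a in admissions]
print(records)  # [(3.0, True), (0.5, False), (7.0, False)]
```

A censored record still contributes information: the patient is known not to have been discharged home before the recorded duration, which is what lets survival models use all admissions rather than only the home discharges.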
The study utilizes a comprehensive dataset from the General Medicine Inpatient Initiative (GEMINI) data repository, which comprises detailed clinical and administrative data from over 30 participating hospitals across Ontario, Canada. The dataset includes 118,357 unique admissions for ten distinct General Internal Medicine (GIM) disease types, such as cerebral infarction, heart failure, and urinary tract infections. A significant portion of patients (69.1%) were discharged home, while the remaining 30.9% were considered censored.
To achieve robust predictions, the researchers implemented and compared five different survival analysis models: the Standard Cox model, the XGBoost-enhanced Cox model, the Random Survival Forest (RSF), DeepSurv, and CoxTime. A central aspect of this research is its emphasis on interpretable machine learning, also known as explainable AI (XAI), specifically employing the Shapley Additive Explanations (SHAP) method. SHAP values are based on game theory and measure each feature’s contribution to the final prediction, providing an unbiased estimation of feature importance and enhancing the model’s trustworthiness for healthcare providers.
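The game-theoretic idea behind SHAP can be illustrated with a brute-force computation of exact Shapley values on a toy scoring function. This is a sketch of the underlying definition only, not the algorithm the authors or the SHAP library use in practice (exact enumeration is only feasible for a handful of features). The function `toy_risk` and its coefficients are invented for illustration; the feature names are borrowed from the predictors the study reports.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of features.

    value_fn maps a frozenset of "present" features to a model output.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_risk(present):
    """Invented scoring function with one interaction term."""
    score = 0.0
    if "age" in present:
        score += 2.0
    if "admit_frailty_score" in present:
        score += 1.0
    if "duration_er_stay_hours" in present:
        score += 0.5
    if "age" in present and "admit_frailty_score" in present:
        score += 1.0  # interaction, split evenly between the two features
    return score


features = ["age", "admit_frailty_score", "duration_er_stay_hours"]
phi = shapley_values(toy_risk, features)
for name, value in phi.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the attributions come out as roughly 2.5 for age, 1.5 for the frailty score, and 0.5 for emergency room duration. They sum to the difference between the full-model score and the empty-model score, which is the additivity property that gives SHAP its name.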
Key findings indicate that the XGBoost-enhanced Cox model consistently outperformed the other models across all ten diseases. The models achieved an average Concordance Index (C-index) of approximately 0.7, meaning that for roughly 70% of comparable patient pairs the model correctly ordered which patient would be discharged sooner (a C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking). The interpretability provided by SHAP values identified the most influential predictors of LOS:
- Disease Diagnosis: Six specific disease indicators were identified as top features, including cerebral infarction, urinary tract infection, and heart failure.
- Patient Demographics: Age was a significant factor.
- Derived Healthcare Scores: Important scores included “admit_charlson_index,” “admit_elixhauser_index,” “admit_frailty_score,” and “modified_laps”.
- Pre-admission Variables: The duration of emergency room stay (“duration_er_stay_hours”) and prior admission within 30 days (“prev_admission_gim_30d”) were also highly influential.
- Socioeconomic Variables: Notably, “census_dependency” (average number of dependents) and “census_lab_part_rate” (employment rate) from the patient’s neighborhood census data significantly impacted LOS, suggesting that socioeconomic determinants play a key role and addressing them could help reduce disparities in hospital care.
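The C-index cited in the key findings can be made concrete with a small sketch of Harrell's concordance index under censoring. The function name, the convention that a higher score means earlier discharge home, and the sample data are all assumptions made for illustration; tied durations are ignored for simplicity, and this is not the authors' evaluation code.

```python
def concordance_index(durations, events, scores):
    """Harrell's C-index for right-censored data (tied durations ignored).

    A pair (i, j) is comparable when patient i had the event and a shorter
    duration than patient j. The pair is concordant when the model gave
    patient i the higher score, i.e. predicted the earlier discharge.
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    return concordant / comparable


# Made-up example: days in hospital, whether discharge home was observed,
# and a model score where higher means earlier expected discharge.
durations = [2, 4, 5, 7, 9]
events = [True, True, False, True, False]
scores = [0.9, 0.4, 0.6, 0.5, 0.1]

c_index = concordance_index(durations, events, scores)
print(c_index)  # 0.75 (6 of 8 comparable pairs ranked correctly)
```

Here 8 pairs are comparable and 6 are ranked correctly, giving 0.75. A value near 0.7, as reported in the study, reads the same way: about 7 in 10 comparable pairs are ordered correctly.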
The practical applications of this research are extensive:
- Personalized Patient Care: Clinicians can use these insights to identify patients at high risk for extended stays and implement early interventions to prevent complications.
- Improved Hospital Operations and Resource Allocation: Accurate LOS forecasts enable better patient flow management, reduce overcrowding, and help optimize staffing, bed availability, and medical supplies.
- Enhanced Healthcare Equity: The inclusion and analysis of socioeconomic variables allow hospitals to identify and address disparities in care, ensuring timely and appropriate care for all patients, regardless of their background.
- Better Patient Engagement: Providing patients and their families with more accurate expectations regarding their hospital stay duration can reduce uncertainty and facilitate better discharge planning, leading to improved post-discharge outcomes.
In conclusion, this research demonstrates how survival analysis, combined with interpretable AI techniques, can provide valuable insights into LOS prediction, supporting crucial decision-making for hospital management, operational efficiency, and the promotion of health equity.
Reference: Li, Y., Hall, T., Razak, F., Verma, A., Chignell, M., & Wang, L. (2025). Using interpretable survival analysis to assess hospital length of stay. BMC Health Services Research, 25, 741. https://doi.org/10.1186/s12913-025-12852-0

