Background: The recently developed deep learning (DL)-based early warning score (DEWS) has shown potential for predicting patient deterioration. We aimed to validate DEWS in multiple centers and to compare its predictive, alarming, and timeliness performance with that of the modified early warning score (MEWS) in identifying patients at risk of in-hospital cardiac arrest (IHCA).

Methods: This retrospective cohort study included adult patients admitted to the general wards of five hospitals during a 12-month period. We validated DEWS internally at two hospitals and externally at the other three hospitals. The outcome of interest was the occurrence of IHCA within 24 hours of a vital sign observation. We used the area under the receiver operating characteristic curve (AUROC) as the main performance metric.

Results: The study population consisted of 173,368 patients (224 IHCAs). The predictive performance of DEWS was superior to that of MEWS in both the internal (AUROC: 0.860 vs. 0.754, respectively) and external (AUROC: 0.905 vs. 0.785, respectively) validation cohorts. At the same specificity, DEWS had a higher sensitivity than MEWS; at the same sensitivity, DEWS had a lower mean alarm count, with nearly half the alarm rate of MEWS. Additionally, DEWS predicted more IHCA patients in the period from 24 to 0.5 hours before the outcome.

Conclusion: DEWS was superior to MEWS in three key aspects: IHCA predictive, alarming, and timeliness performance. This study demonstrates the potential of DEWS as an effective and efficient screening tool in rapid response systems (RRSs) for identifying high-risk patients.
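To make the two comparisons reported above concrete (AUROC, and sensitivity at a matched specificity), the following is a minimal illustrative sketch in Python. It is not the study's code: the function names `auroc` and `sensitivity_at_specificity`, the rule that an alarm fires when a score is at or above a threshold, and the toy scores and labels are all assumptions made for illustration, and the numbers it produces are unrelated to the study results.

```python
from typing import Sequence


def _split(scores: Sequence[float], labels: Sequence[int]):
    """Separate scores of event (label 1) and non-event (label 0) observations."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative observation")
    return pos, neg


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a random event observation outscores
    a random non-event observation; ties count as one half."""
    pos, neg = _split(scores, labels)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(
    scores: Sequence[float], labels: Sequence[int], min_specificity: float
) -> float:
    """Sensitivity at the lowest threshold whose specificity reaches
    min_specificity. Assumption: an alarm fires when score >= threshold."""
    pos, neg = _split(scores, labels)
    for threshold in sorted(set(scores)):
        specificity = sum(n < threshold for n in neg) / len(neg)
        if specificity >= min_specificity:
            return sum(p >= threshold for p in pos) / len(pos)
    return 0.0  # no observed threshold qualifies: raise no alarms at all


# Toy data (hypothetical): 1 = event within the prediction window, 0 = no event.
labels = [0, 0, 0, 0, 1, 1]
score_a = [0.1, 0.2, 0.3, 0.8, 0.7, 0.9]  # continuous score, as a DL model might emit
score_b = [1, 2, 4, 3, 3, 5]              # integer point score, as a MEWS-like tool might emit

print(auroc(score_a, labels), auroc(score_b, labels))
print(sensitivity_at_specificity(score_a, labels, 0.75),
      sensitivity_at_specificity(score_b, labels, 0.75))
```

Matching the two scores on specificity before comparing sensitivity keeps the false-alarm burden equal, which is the sense in which the abstract compares the scores "at the same specificity". The pairwise AUROC loop is quadratic and chosen for readability; a rank-based computation would be preferable on a cohort of realistic size.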