
Real-world evidence (RWE) and real-world data (RWD) have gained prominence in recent years, opening new avenues in clinical research, especially in settings where conventional randomized controlled trials (RCTs) are infeasible or ethically problematic. One such innovation is the external control arm (ECA), which uses retrospective patient data as a comparator for a single-arm clinical study. This approach is particularly valuable in rare diseases, oncology, and other therapeutic areas with high unmet need, where recruiting patients for a conventional control arm may be difficult or impossible. However, developing an effective and reliable ECA requires careful matching of baseline characteristics, outcomes, and treatment pathways. These tasks are increasingly supported by artificial intelligence (AI).(1, 2)
AI supports the development of ECAs by enabling quicker and more efficient extraction of relevant data from large, heterogeneous RWD sources, such as electronic health records (EHRs), registries, claims databases, and even clinical notes. Through techniques such as natural language processing (NLP) and machine learning (ML), AI tools can harmonize disparate data points, impute missing values, and identify eligible patient populations. This helps the selected external cohort closely resemble the treatment population in terms of demographics, disease features, and prognostic factors, reducing bias and increasing the reliability of the comparison.(1-4)
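To make the three steps named above (harmonizing, imputing, selecting) concrete, here is a minimal sketch in plain Python. It is only an illustration: the field names, the code mapping, the age window, and the sample records are all hypothetical, and simple median imputation stands in for the NLP and ML methods a real pipeline would use.

```python
from statistics import median

# Hypothetical mapping from source-specific sex codes to one shared vocabulary.
SEX_MAP = {"m": "male", "male": "male", "f": "female", "female": "female"}

def harmonize(record):
    """Map one raw record (hypothetical fields) onto a common schema."""
    sex = SEX_MAP.get(str(record.get("sex", "")).strip().lower())
    weight = record.get("weight_kg")
    if weight is None and record.get("weight_lb") is not None:
        weight = record["weight_lb"] * 0.45359237  # pounds to kilograms
    return {"id": record["id"], "sex": sex,
            "age": record.get("age"), "weight_kg": weight}

def impute_median(records, field):
    """Fill missing values of `field` with the median of the observed values.

    A deliberately simple stand-in for model-based imputation.
    """
    observed = [r[field] for r in records if r[field] is not None]
    fill = median(observed)
    return [dict(r, **{field: fill if r[field] is None else r[field]})
            for r in records]

def eligible(record, min_age=18, max_age=75):
    """Hypothetical eligibility rule: age within the trial's window."""
    return record["age"] is not None and min_age <= record["age"] <= max_age

# Toy records from three imagined sources with inconsistent coding.
raw = [
    {"id": "ehr-1", "sex": "M", "age": 54, "weight_kg": 80.0},
    {"id": "claims-7", "sex": "female", "age": 81, "weight_lb": 150.0},
    {"id": "reg-3", "sex": "F", "age": 47},  # weight missing
]

harmonized = [harmonize(r) for r in raw]
completed = impute_median(harmonized, "weight_kg")
cohort = [r for r in completed if eligible(r)]

for row in cohort:
    print(row)
```

In this toy data the 81-year-old patient falls outside the assumed age window, and the registry patient's missing weight is filled with the median of the two observed values. Imputing before filtering, as done here, is one possible ordering and not a recommendation.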
The benefits of AI extend beyond data selection to supporting rigorous statistical approaches in ECA development. Methods such as propensity score matching and inverse probability of treatment weighting, with weights that can be estimated by ML, as well as more flexible models such as generative adversarial networks (GANs), can be applied to balance baseline covariates and estimate counterfactual outcomes.(5) These methods enable researchers to estimate treatment effects more precisely, improve causal inference, and reduce confounding, all of which are crucial to making ECA findings credible to regulators, clinicians, and payers. Moreover, AI can help re-examine the validity of an ECA over time as clinical practice evolves, supporting dynamic updates and adaptive approaches.(2, 3, 5)
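As an illustration of how weighting can balance a baseline covariate, the sketch below applies the ATT ("odds") variant of inverse probability of treatment weighting to toy data. Everything in it is an assumption made for the example: the single covariate (`stage`), the patient counts, and the choice to estimate the propensity score from stratum proportions instead of a fitted ML model. Trial patients keep a weight of 1, and each external control is weighted by e/(1 - e), where e is the estimated probability of being in the trial arm given the covariate.

```python
from collections import defaultdict

def stratum_propensity(rows):
    """Estimate P(trial arm | stage) as the share of trial patients per stage."""
    counts = defaultdict(lambda: [0, 0])  # stage -> [n_external, n_trial]
    for r in rows:
        counts[r["stage"]][r["trial"]] += 1
    return {s: n_trial / (n_ext + n_trial)
            for s, (n_ext, n_trial) in counts.items()}

def att_weights(rows, ps):
    """ATT weights: 1 for trial patients, e / (1 - e) for external controls.

    Requires 0 < e < 1 in every stratum that contains external controls.
    """
    weights = []
    for r in rows:
        e = ps[r["stage"]]
        weights.append(1.0 if r["trial"] == 1 else e / (1.0 - e))
    return weights

def weighted_share(rows, weights, arm, stage):
    """Weighted proportion of patients in `arm` who have the given stage."""
    num = sum(w for r, w in zip(rows, weights)
              if r["trial"] == arm and r["stage"] == stage)
    den = sum(w for r, w in zip(rows, weights) if r["trial"] == arm)
    return num / den

# Toy data: the trial arm is mostly late-stage, the external pool mostly early-stage.
rows = (
    [{"trial": 1, "stage": "late"}] * 6
    + [{"trial": 1, "stage": "early"}] * 2
    + [{"trial": 0, "stage": "late"}] * 4
    + [{"trial": 0, "stage": "early"}] * 12
)

ps = stratum_propensity(rows)
weights = att_weights(rows, ps)
unweighted = [1.0] * len(rows)

print("late-stage share, trial arm:", round(weighted_share(rows, weights, 1, "late"), 2))
print("late-stage share, external (unweighted):", round(weighted_share(rows, unweighted, 0, "late"), 2))
print("late-stage share, external (weighted):", round(weighted_share(rows, weights, 0, "late"), 2))
```

In this toy data, a quarter of the external patients are late-stage before weighting, while the weighted external cohort matches the trial arm's three quarters. Real analyses involve many covariates, a fitted propensity model, and balance and overlap diagnostics that this sketch omits.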
However, the incorporation of AI in ECA development has several limitations. Data quality and completeness remain major challenges, as even the most innovative algorithms cannot compensate for systematically missing or biased data. The interpretability and transparency of AI models are also crucial for regulatory acceptance and reproducibility.(2, 4) To address these concerns, stakeholders are promoting explainable AI (XAI) and standardized validation frameworks to make model outputs easier to understand and more trustworthy for both scientific and non-scientific audiences.(6)
As research continues to explore the role of ECAs in drug development, the combination of AI with RWD analytics is expected to transform evidence generation. By automating complex analyses and uncovering hidden patterns in large-scale datasets, AI is turning ECAs from a promising concept into a scalable, robust, and efficient solution that complements conventional clinical trial approaches. With appropriate safeguards and collaboration across disciplines, AI-driven ECAs have the potential to expedite treatment access while maintaining scientific integrity and patient-centricity.
References
1. Elvatun S, Knoors D, Brant S, et al. Synthetic data as external control arms in scarce single-arm clinical trials. PLOS Digit Health. 2025 Jan 23;4(1):e0000581.
2. Pasculli G, Virgolin M, Myles P, et al. Synthetic Data in Healthcare and Drug Development: Definitions, Regulatory Frameworks, Issues. CPT Pharmacometrics Syst Pharmacol. 2025 May;14(5):840-852.
3. American Statistical Association. External Control Arms: Key Elements. Amstat News. June 2022. [Accessed online on 30th July 2025]. Available at: https://magazine.amstat.org/blog/2022/06/01/external-control-arms-key-elements/
4. Singh A. Medical Data Imputation: Using Generative AI to Impute Missing Values in Medical Datasets. March 4, 2025. Available at SSRN: https://ssrn.com/abstract=5196881
5. Kwak D, Liang Y, Shi X, et al. Comparing Machine Learning and Advanced Methods with Traditional Methods to Generate Weights in Inverse Probability of Treatment Weighting: The INFORM Study. Pragmat Obs Res. 2024 Oct 4;15:173-183.
6. Ali S, Abuhmed T, El-Sappagh S, et al. Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence. Information Fusion. 2023;99:101805.










The cost of prescription drugs is a significant burden on patients and the healthcare system, especially in countries such as the USA. High drug prices can strain government programs, such as Medicare and Medicaid, as well as private insurers, which can lead to higher premiums for consumers. High drug prices also increase out-of-pocket expenses for patients, which can lead to medication non-adherence and, in turn, poorer health outcomes. On the other hand, research and development in the pharmaceutical industry depends on profits from sales, and an excessively steep reduction in drug prices could weaken the incentive for innovation.(1)
The global healthcare ecosystem has been shifting from volume-based to value-based payment models, driven by a surge in data availability, interoperability, advancing health technologies, cost and competitive pressures, scientific advances, and the increasing adoption of personalized medicine. The resulting availability of large quantities of real-world data (RWD) has made it possible to continually observe disease epidemiology, treatment patterns, and outcomes in the real world. Analysing robust RWD generates robust real-world evidence (RWE), and stakeholders are increasingly recognising the power of RWE in the drug approval process, including in prioritizing and streamlining drug development. RWE is especially important because randomized controlled trials (RCTs) cannot cover the entire patient population of a specific disease. In parallel, the value, usage, and acceptance of RWE in the pharmaceutical and biotechnology industries have also increased in recent years.(1)