• The PALISADE Checklist: A Framework for Trustworthy Machine Learning in HEOR

    The PALISADE Checklist: A Framework for Trustworthy Machine Learning in HEOR

    Machine learning (ML) is revolutionizing healthcare by supporting smarter decisions and deeper insights, particularly in Health Economics and Outcomes Research (HEOR), where data-driven findings shape real-world healthcare policies.[1] However, growing ML adoption raises concerns about reliability, transparency, and ethical use. To address these concerns, the PALISADE Checklist provides a well-defined framework for implementing ML responsibly and reliably in HEOR.[2, 3] The framework promotes responsible adoption of ML by assessing its Purpose, Appropriateness, Limitations, Implementation, Sensitivity and Specificity, Algorithm characteristics, Data characteristics, and Explainability, from which it takes its name: the PALISADE Checklist.[3]
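
    Teams that want to track these prompts alongside an analysis can capture the checklist as a simple structured record. The sketch below is illustrative only: the field names mirror the components listed above, the example answers are hypothetical, and it is not an official ISPOR artifact.

        # Illustrative template for recording PALISADE responses with an ML analysis.
        # Field names mirror the checklist components; example answers are hypothetical.
        from dataclasses import dataclass, asdict
        import json

        @dataclass
        class PalisadeRecord:
            purpose: str                      # why ML is used in this HEOR study
            appropriateness: str              # why ML rather than conventional statistics
            limitations: str                  # known uncertainty, overfitting, bias, data-quality risks
            implementation: str               # how the model will operate in real-world settings
            sensitivity_and_specificity: str  # reported performance and error trade-offs
            algorithm_characteristics: str    # structure, parameters, tuning decisions
            data_characteristics: str         # origin, quality, representativeness of datasets
            explainability: str               # how outputs are made interpretable to stakeholders

        record = PalisadeRecord(
            purpose="Predict 1-year hospitalization risk to feed a cost-effectiveness model",
            appropriateness="Non-linear interactions in claims data are poorly captured by logistic regression",
            limitations="Trained on a single payer's claims; calibration elsewhere is unknown",
            implementation="Batch scoring inside the payer's existing data warehouse",
            sensitivity_and_specificity="Sensitivity 0.81, specificity 0.74 on a held-out test set",
            algorithm_characteristics="Gradient-boosted trees, 500 estimators, tuned by 5-fold CV",
            data_characteristics="2018-2022 claims; 4% missing lab values imputed",
            explainability="Feature-importance summaries shared with the review board",
        )
        print(json.dumps(asdict(record), indent=2))  # archive with the study protocol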

    The PALISADE Checklist is an innovative framework presented by the ISPOR Machine Learning Task Force to guide the responsible and dependable use of ML in HEOR. It covers five applications of ML methods that are crucial to HEOR, namely 1) ML-assisted cohort selection, 2) feature selection, 3) predictive analytics, 4) causal inference, and 5) health economic evaluation, together with reflection on transparency and explainability.[3] The rapid adoption of ML techniques in healthcare has necessitated a well-defined methodology to ensure the transparency, consistency, and ethical foundation of these methods. PALISADE helps address ML-related challenges by providing an extensive, standardized set of factors that researchers, analysts, and stakeholders can use to assess ML applications in HEOR contexts.[2, 3]

    Fundamentally, the PALISADE Checklist offers prompts that ML developers can use to structure their thinking about how the suitability of ML methods can be conveyed to stakeholders and healthcare decision makers. The checklist supports comprehensibility by urging practitioners to clearly define the objectives for using ML in a given HEOR study: asking whether the ML model serves prediction, categorization, or extrapolation, and whether its objective aligns with the wider goals of the underlying health research. This clarity of intention is crucial for accountability and also allows regulators, payers, and policy makers to interpret results meaningfully and apply them appropriately in decision-making.[3]

    Another basic element of the framework is its focus on methodological relevance. ML is not a one-size-fits-all approach, and the checklist requires a thorough justification for choosing ML over conventional statistical methods. Researchers must weigh the apparent advantages of ML in terms of accuracy, scalability, or insight against the suitability of the available data. This promotes the selection of ML not as a trend but as a carefully chosen instrument for enhancing the quality and applicability of HEOR studies.[3]

    The PALISADE Checklist also underscores the inherent limitations and risks of ML applications. It emphasizes the importance of recognizing model uncertainty, overfitting, bias, and data quality concerns that could affect the end results. The checklist urges users to disclose these limitations openly, which reinforces the integrity of the research and equips stakeholders to interpret results with caution. Such transparency is crucial in HEOR, where policy decisions influence patient care and overall public health.[3]

    Implementation considerations also play a key role in the PALISADE Checklist. It encourages researchers to consider how ML models will work in real-world settings, beyond controlled environments or highly curated datasets. This includes evaluating whether the model’s predictions generalize across populations and time periods, and whether the operational infrastructure is in place to incorporate the model into routine healthcare decision-making. Practical feasibility is as important as theoretical performance when applying ML in HEOR.[3]
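
    One concrete way to probe generalizability across time periods is a temporal holdout: fit the model on earlier data and evaluate it on a later, unseen period. The sketch below assumes a hypothetical cohort file with a year column and a binary outcome, and uses scikit-learn; it illustrates the idea rather than a procedure prescribed by PALISADE.

        # Temporal-validation sketch (file name and column names are hypothetical).
        import pandas as pd
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import roc_auc_score

        df = pd.read_csv("cohort.csv")            # assumed columns: year, outcome, predictors...
        train = df[df["year"] <= 2021]            # fit on earlier periods
        test = df[df["year"] >= 2022]             # evaluate on a later, unseen period

        features = [c for c in df.columns if c not in ("year", "outcome")]
        model = RandomForestClassifier(n_estimators=300, random_state=0)
        model.fit(train[features], train["outcome"])

        auc = roc_auc_score(test["outcome"], model.predict_proba(test[features])[:, 1])
        print(f"Temporal holdout AUC: {auc:.3f}")  # a large drop versus in-period CV flags drift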

    Alongside performance metrics such as accuracy or precision, the framework explicitly includes explainability as an important element. Healthcare stakeholders, from clinicians to patients to policymakers, should be able to understand the predictions made by a given model. PALISADE favors models that produce interpretable outputs, while also calling for additional tools and documentation to translate complex algorithms into coherent logic. This helps lower the risk of misinterpreted findings and builds trust in ML tools.[3]
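
    In practice, interpretable outputs are often produced by pairing a model with a model-agnostic explanation method. The sketch below uses permutation importance from scikit-learn on a synthetic dataset purely for illustration; it is one of several options (SHAP-style attributions are a common alternative) and is not prescribed by the checklist.

        # Rank features by how much shuffling each one degrades performance
        # (permutation importance). The data are synthetic, for demonstration only.
        from sklearn.datasets import make_classification
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.inspection import permutation_importance
        from sklearn.model_selection import train_test_split

        X, y = make_classification(n_samples=2000, n_features=8, n_informative=4, random_state=0)
        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

        model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
        result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                        random_state=0, scoring="roc_auc")

        for idx in result.importances_mean.argsort()[::-1]:
            print(f"feature_{idx}: {result.importances_mean[idx]:+.4f}")  # larger = more influential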

    The final constituents of the checklist address data and algorithm characteristics. Researchers are encouraged to document the origin, quality, and representativeness of the training, validation, and test datasets. Similarly, PALISADE calls for a thorough description of the algorithm’s structure, parameters, and tuning decisions. These factors are essential for reproducibility, a cornerstone of scientific rigor, ensuring that findings can be corroborated or improved upon by future researchers.[3]
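
    A lightweight way to meet these documentation expectations is to write a machine-readable record of the datasets and algorithm settings next to the analysis itself. The sketch below is a minimal illustration; the file paths, field names, and hyperparameters are hypothetical and not prescribed by PALISADE.

        # Persist dataset provenance and algorithm settings for reproducibility.
        # File paths and settings are hypothetical.
        import hashlib, json, platform
        from datetime import datetime, timezone

        def file_sha256(path):
            """Fingerprint a dataset file so later re-runs can confirm the exact inputs."""
            with open(path, "rb") as f:
                return hashlib.sha256(f.read()).hexdigest()

        metadata = {
            "created": datetime.now(timezone.utc).isoformat(),
            "python_version": platform.python_version(),
            "datasets": {
                "training": {"path": "data/train.csv", "sha256": file_sha256("data/train.csv")},
                "validation": {"path": "data/valid.csv", "sha256": file_sha256("data/valid.csv")},
                "test": {"path": "data/test.csv", "sha256": file_sha256("data/test.csv")},
            },
            "algorithm": {
                "family": "gradient-boosted trees",
                "hyperparameters": {"n_estimators": 500, "learning_rate": 0.05, "max_depth": 3},
                "tuning": "5-fold cross-validation on the training split",
                "random_seed": 0,
            },
        }

        with open("analysis_metadata.json", "w") as f:
            json.dump(metadata, f, indent=2)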

    Finally, the PALISADE Checklist is more than a framework: it embeds accountability into the application of ML in HEOR. By integrating ethical principles, scientific robustness, and practical insight into each phase of ML execution, it helps ensure that ML is used both innovatively and responsibly.


    References

    1. Dasari M, Dasari P, Fossati S, et al. Applications of Artificial Intelligence and Machine Learning in Health Economics and Outcomes Research: A Targeted Literature Review. Value in Health. 2024; 27(12). (ISPOR Europe 2024, Barcelona, Spain).
    2. ISPOR. Global Expert Panel Identifies 5 Areas Where Machine Learning Could Enhance Health Economics and Outcomes Research – A Good Practices Report of the ISPOR Machine Learning Task Force. July 2022. Available online at: https://www.ispor.org/heor-resources/news-top/news/view/2022/07/05/global-expert-panel-identifies-5-areas-where-machine-learning-could-enhance-health-economics-and-outcomes-research
    3. Padula WV, Kreif N, Vanness DJ, et al. Machine Learning Methods in Health Economics and Outcomes Research-The PALISADE Checklist: A Good Practices Report of an ISPOR Task Force. Value Health. 2022 Jul;25(7):1063-1080.

  • Using AI to Accurately Assess Risk of Bias in Published Articles: Are We There Yet?

    Using AI to Accurately Assess Risk of Bias in Published Articles: Are We There Yet?

    In the realm of medical research, the credibility and accuracy of published articles are paramount. Healthcare professionals rely on these articles to make informed decisions regarding patient care, treatment modalities, and the development of clinical guidelines. However, the presence of bias in scientific studies can significantly undermine the validity and trustworthiness of their findings, potentially leading to misguided conclusions and inappropriate healthcare practices. The emergence of artificial intelligence (AI) has sparked considerable interest in utilizing its capabilities to assist in assessing bias in published articles.(1)

    Bias can occur at various stages of the research process, including study design, data collection, analysis, and interpretation. Identifying and minimizing bias is crucial to ensure that research findings are reliable and can be translated effectively into clinical practice. Typically, risk of bias assessment involves thoroughly examining various aspects of a study, such as its design, methodology, data collection, analysis, and reporting. Experts evaluate factors that may introduce bias, including conflicts of interest, selective reporting, inadequate blinding or randomization, and other potential sources of bias. This manual process requires expertise and can be time-consuming, especially when analyzing a large number of articles. The introduction of AI in risk of bias assessment offers several advantages over traditional manual methods. By leveraging machine learning algorithms, AI tools can identify patterns and indicators of bias in titles, abstracts, and full-text articles. This technology accelerates the screening process, increases consistency in assessments, and provides additional insights into potential biases.(2)

    While AI cannot replace human expertise, it serves as a valuable tool for initial screening and prioritization, enabling researchers and clinicians to focus their attention on articles with a lower risk of bias and facilitating evidence-based decision-making in a more timely and efficient manner. The integration of AI in assessing the risk of bias in published articles marks a significant advancement, promising enhanced reliability and objectivity in evaluating scientific literature.(3)

    AI algorithms can analyze vast amounts of data and identify patterns that might be challenging for humans to detect. In recent years, researchers have developed AI-based tools and techniques to assist in assessing the risk of bias in scientific studies. These tools utilize machine learning algorithms to evaluate published articles based on predefined criteria and indicators of bias.(3)
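
    As a toy illustration of how such tools work under the hood, a text classifier can be trained on labelled sentences from trial reports (for example, "low" versus "high or unclear" risk for the randomization domain). The sketch below uses scikit-learn with a handful of invented sentences; production systems such as RobotReviewer are trained on far larger annotated corpora.

        # Toy example: flag text suggestive of adequate vs. unclear randomization.
        # The training sentences and labels are invented for demonstration only.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        sentences = [
            "Participants were randomized using a computer-generated sequence.",
            "Allocation was concealed with sequentially numbered opaque envelopes.",
            "Patients were assigned to treatment at the discretion of the physician.",
            "The method of sequence generation was not described.",
        ]
        labels = ["low", "low", "high_or_unclear", "high_or_unclear"]

        classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
        classifier.fit(sentences, labels)

        new_text = ["Randomization was performed centrally by an independent statistician."]
        print(classifier.predict(new_text))  # a human reviewer would still verify this call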

    While AI has shown promise in assessing bias in scientific literature, it is crucial to emphasize the need for collaboration between researchers, clinicians, and AI experts. By combining domain expertise and technical knowledge, interdisciplinary teams can develop more accurate and reliable AI models. Ongoing research and development are necessary to refine AI models, improve their performance in detecting various types of bias, and address the limitations, such as the reliance on limited text information.(4, 5)

    The application of AI in assessing the risk of bias (ROB) in scientific articles is not without its challenges. AI models may lack contextual understanding, struggle with interpretation and identifying subtle bias, and have limited adaptability to evolving research practices. They can also perpetuate biases present in training data, raise ethical concerns, and create accountability challenges. Further, despite reasonably accurate predictions, the imperfections of AI models highlight the need for manual verification to ensure comprehensive and reliable assessments.(5)



    References:

    1. Jardim PS, Rose CJ, Ames HM, et al. Automating risk of bias assessment in systematic reviews: a real-time mixed methods comparison of human researchers to a machine learning system. BMC Medical Research Methodology. 2022 Jun 8;22(1):167.
    2. Arno A, Elliott J, Wallace B, Turner T, Thomas J. The views of health guideline developers on the use of automation in health evidence synthesis. Systematic Reviews. 2021;10.
    3. Soboczenski F, Trikalinos TA, Kuiper J, et al. Machine learning to help researchers evaluate biases in clinical trials: a prospective, randomized user study. BMC Medical Informatics and Decision Making. 2019;19.
    4. Marshall IJ, Kuiper J, Wallace BC. Automating risk of bias assessment for clinical trials. IEEE J Biomed Health Inform. 2015 Jul;19(4):1406-12.
    5. Marshall IJ, Kuiper J, Wallace BC. RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. J Am Med Inform Assoc. 2016 Jan;23(1):193-201.

  • Living SLRs: an Approach to Enhance Accuracy and Recency of SLRs

    Living SLRs: an Approach to Enhance Accuracy and Recency of SLRs

    Living systematic literature reviews (SLRs) are a type of SLR that is continually updated by periodically incorporating relevant new evidence as it becomes available. SLRs are often considered to occupy the top of the evidence pyramid because they synthesize evidence from different sources and present a summary of the evidence, thus enabling clinical and policy-level decision-making.

    Thus, it becomes essential that SLRs are of high quality and are updated to include the latest available information.(1) Traditional SLRs published in high-quality journals can be expected to be of high quality, but they lag on the ‘updated’ aspect because they represent static snapshots of the evidence at the time the research was published.(2) As new evidence emerges in the field, some of the recommendations in a previously published SLR may become outdated, challenging the validity of the guidelines that were developed from it.(3) Yet, while it is difficult to update an SLR, failure to do so reduces the accuracy and recency of the review.(4)

    Living SLRs are an approach that tries to resolve this problem. Living SLRs are high-quality, up-to-date, sometimes online, evidence summaries that help identify new trends and developments in the field. A living SLR involves regular literature screening (e.g., monthly), through which newly identified studies are added to the review. Accordingly, summary measures such as meta-analyses are also updated with the new study results, leading to updated findings and conclusions.(5)
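
    Conceptually, each update cycle re-runs the search, screens only the newly retrieved records, and re-computes the pooled estimate over the enlarged study set. The sketch below illustrates one such cycle with a fixed-effect inverse-variance meta-analysis of log odds ratios; the study data and eligibility flag are hypothetical, and real living reviews typically rely on dedicated meta-analysis software.

        # One living-SLR update cycle (study data are hypothetical).
        import math

        def pooled_log_or(studies):
            """Fixed-effect inverse-variance pooling of per-study log odds ratios."""
            weights = [1.0 / s["variance"] for s in studies]
            pooled = sum(w * s["log_or"] for w, s in zip(weights, studies)) / sum(weights)
            se = math.sqrt(1.0 / sum(weights))
            return pooled, se

        # Studies already included in the review.
        included = [
            {"id": "Trial-A", "log_or": -0.35, "variance": 0.04},
            {"id": "Trial-B", "log_or": -0.20, "variance": 0.09},
        ]

        # Update cycle: screen newly retrieved records, add eligible studies, re-pool.
        newly_retrieved = [{"id": "Trial-C", "log_or": -0.28, "variance": 0.06, "eligible": True}]
        included += [s for s in newly_retrieved if s["eligible"]]

        estimate, se = pooled_log_or(included)
        low, high = estimate - 1.96 * se, estimate + 1.96 * se
        print(f"Updated pooled OR: {math.exp(estimate):.2f} "
              f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})")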

    Living SLRs are prepared following a review process similar to that of regular SLRs; however, after the initial publication, the literature is monitored continuously and new results are incorporated as they become available. Continuous monitoring makes it possible to offer the most recent data at all times and supports the validation of previous conclusions against the latest findings in the field. This ensures that clinical recommendations, which are largely based on SLRs, benefit from the most recent clinical data.(5)

    Since living SLRs necessitate a continuous workflow, the effort required is moderate, coordinated over long periods, and involves a gradual evolution in the review team, as opposed to the intensive, sporadic effort of standard SLRs and traditionally updated SLRs. Approaches such as machine learning (RCT classifier) and citizen science (Cochrane Crowd) are often utilized to expedite the evidence-screening process.(6)

    Recently, especially for living SLRs that are available online, there have been efforts to use AI to improve data visualization and relevance, thereby enhancing the user experience. Recent innovations allow users to select the outcome of interest through features such as interactive portals, user-friendly platforms, customizable inclusion criteria, and automatically scheduled updates.(7)

    Living SLRs have certain challenges as well; probably the most important relate to workload, since they require a larger investment than traditional SLRs. An equally challenging concern is the need to engage a large and dedicated team to work constantly on the updates, including tracking ongoing studies, locating full-text articles, chasing trial authors for data, screening articles, managing data, and updating the PRISMA flow diagram and results tables. With offline (published) living SLRs, editors need to set up peer reviews in advance to prevent delays, which can also be challenging.(8) Republishing reviews and triggering a new DOI may also negatively affect citation counts and impact factors. Finally, the continuous workflow involves frequent statistical analyses, which can lead to an inflated false-positive rate.(8)
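
    To see where that inflation comes from: under the simplifying assumption of independent analyses, each performed at a nominal significance level of 0.05, the probability of at least one false-positive finding across k updates is 1 - (1 - 0.05)^k, roughly 23% after five updates and about 40% after ten. This is one reason approaches such as pre-specifying when and how formal analyses will be re-run are often recommended for living reviews.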

    With more research published in the scientific literature over the past few years, the pool of eligible studies for any particular SLR is expected to grow over time. With the advent of new technology to improve the healthcare research process, living SLRs are proving to be a realistic approach for keeping SLRs up to date.(9) Although automation in the form of AI is increasingly being used to speed up study screening, human oversight remains essential to ensure high-quality screening and data extraction from relevant studies.


    References

    1. Garner P, Hopewell S, Chandler J, et al. When and how to update systematic reviews: consensus and checklist. BMJ. 2016 Jul 20;354:i3507. doi: 10.1136/bmj.i3507. Erratum in: BMJ. 2016 Sep 06;354:i4853.
    2. White A, Schmidt K. Systematic literature reviews. Complement Ther Med. 2005 Mar;13(1):54-60. doi: 10.1016/j.ctim.2004.12.003.
    3. Shojania KG, Sampson M, Ansari MT, et al. How quickly do systematic reviews go out of date? A survival analysis. Ann Intern Med. 2007 Aug 21;147(4):224-33. doi: 10.7326/0003-4819-147-4-200708210-00179.
    4. Simmonds M, Elliott JH, Synnot A, Turner T. Living Systematic Reviews. Methods Mol Biol. 2022;2345:121-134. doi: 10.1007/978-1-0716-1566-9_7.
    5. Akl EA, Meerpohl JJ, Elliott J, et al. Living Systematic Review Network. Living systematic reviews: 4. Living guideline recommendations. J Clin Epidemiol. 2017 Nov;91:47-53. doi: 10.1016/j.jclinepi.2017.08.009. Epub 2017 Sep 11. PMID: 28911999.
    6. Noel-Storr A. Working with a new kind of team: harnessing the wisdom of the crowd in trial identification. EFSA J. 2019 Jul 8;17(Suppl 1):e170715.
    7. Cytel. Live SLR. Available online at: https://www.cytel.com/live-slr
    8. Millard T, Synnot A, Elliott J, et al. Feasibility and acceptability of living systematic reviews: results from a mixed-methods evaluation. Syst Rev. 2019 Dec 14;8(1):325. doi: 10.1186/s13643-019-1248-5. PMID: 31837703; PMCID: PMC6911272.
    9. Thomas J, Noel-Storr A, Marshall I, et al. Living Systematic Review Network. Living systematic reviews: 2. Combining human and machine effort. J Clin Epidemiol. 2017 Nov;91:31-37. doi: 10.1016/j.jclinepi.2017.08.011. Epub 2017 Sep 11. PMID: 28912003.