Risk of Bias Archives - Marksman Healthcare

Assessing The Risk of Bias in Diagnostic Studies
Assessing the risk of bias (RoB) in diagnostic studies is crucial for determining the reliability of their findings. Unlike interventional trials, which assess treatment outcomes, diagnostic studies examine how well a test identifies the presence or absence of a disease. Bias in such studies can lead to either overestimation or underestimation of a test's true diagnostic accuracy, which may mislead clinical practice. To avoid this, reviewers must carefully assess study design, execution, and reporting to identify any factors that could distort the results.(1-4)

One of the most widely used and recommended tools for assessing RoB in diagnostic studies is the “Quality Assessment of Diagnostic Accuracy Studies” tool (QUADAS-2), which systematically evaluates RoB across four domains: patient selection, index test, reference standard, and flow and timing. Each domain corresponds to a stage of the study at which bias may be introduced. Along with bias, QUADAS-2 also asks reviewers to consider the applicability of the study results to the clinical context. The strength of QUADAS-2 lies in its structured approach and its flexibility: it can be tailored to the specific review setting.(1, 4)

For studies that directly compare two or more diagnostic tests, QUADAS-C, an extension of QUADAS-2, is the applicable tool.(5) Comparative diagnostic accuracy studies carry their own risks of bias, including differential verification, incorporation bias, and differences in how the tests are applied. QUADAS-C addresses these risks through domains that parallel those of QUADAS-2 but use signalling questions specific to the comparative context. This enables a more appropriate assessment of methodological quality and internal validity in head-to-head test comparisons. Like QUADAS-2, the tool supports transparent, reproducible judgements about bias, and it is increasingly used in comparative diagnostic systematic reviews.(5)

The selection of participants is among the most common sources of bias. If the study population is not representative of the patients who would receive the test in practice (for instance, if it consists of healthy volunteers and patients with obvious disease), the findings may not apply to real-world clinical settings. Such spectrum bias can distort the estimates of sensitivity and specificity. Ideally, diagnostic studies should enrol a consecutive or random sample of patients and avoid case-control designs unless necessary. Inappropriate exclusions or selective enrolment can also bias the accuracy estimates, further limiting interpretability.(1, 2, 4)
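To make the effect on sensitivity and specificity concrete, the following minimal sketch computes both measures from a 2x2 table. The function name and all counts are hypothetical and chosen only for illustration; they do not come from any study cited here.

```python
def accuracy_from_counts(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the cells of a 2x2 table."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased participant")
    sensitivity = tp / (tp + fn)  # share of diseased participants the test detects
    specificity = tn / (tn + fp)  # share of non-diseased participants the test clears
    return sensitivity, specificity


# Hypothetical case-control sample: obvious disease cases versus healthy volunteers.
case_control = accuracy_from_counts(tp=95, fp=2, fn=5, tn=98)

# Hypothetical consecutive sample: the mixed spectrum seen in routine practice.
consecutive = accuracy_from_counts(tp=70, fp=30, fn=30, tn=170)

print("case-control (sensitivity, specificity):", case_control)
print("consecutive  (sensitivity, specificity):", consecutive)
```

In this invented example the same test looks markedly more accurate in the case-control sample, which is the pattern that spectrum bias would produce.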

The conduct and interpretation of the index test under evaluation is another significant domain in QUADAS-2. If the test is interpreted with knowledge of the reference standard result, observer bias may affect the assessment, which is why blinding is needed. Diagnostic thresholds (cut-offs) should also be specified before the study begins. Changing these thresholds post hoc to improve accuracy measures can introduce bias, because it tailors the results to the sample rather than reflecting test performance in an independent population.(2, 4)

The reference standard, the method used to establish true disease status, should ideally be the most accurate one available and should be applied consistently. A flawed reference standard, or one interpreted with knowledge of the index test results, can bias the accuracy measures. Moreover, using different reference standards for different groups within the study, known as differential verification, can distort the results. Therefore, both the choice of reference standard and its consistent application are crucial to the validity of a diagnostic accuracy study.(1, 2, 4)

Another subtle but crucial factor is timing. A long delay between the index test and the reference standard allows the disease to progress or regress, so the true disease status may change between the two tests. If patients are lost to follow-up, or if not all patients receive the reference standard, attrition and verification bias can result. Therefore, keeping the interval between tests short and clinically appropriate, and including all enrolled patients in the final analysis, helps preserve internal validity.(3, 4)

QUADAS-2 guides reviewers through each of these domains using signalling questions, i.e., targeted prompts that direct attention to possible risks or methodological weaknesses. These questions are tailored to the specific context of a review and require reviewers to judge the risk of bias in each domain as low, high, or unclear. Ideally, the assessment is performed by two independent reviewers, with disagreements resolved by consensus to improve objectivity and reproducibility.(4)
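The two-reviewer step can be sketched as a simple comparison of domain-level judgements. This is only an illustrative sketch: the data layout, the function name, and the example judgements are assumptions and are not part of the QUADAS-2 tool itself.

```python
# The four QUADAS-2 risk-of-bias domains named in the text.
DOMAINS = ("patient selection", "index test", "reference standard", "flow and timing")
VALID_JUDGEMENTS = {"low", "high", "unclear"}


def find_disagreements(reviewer_a, reviewer_b):
    """Return {domain: (judgement_a, judgement_b)} for domains where reviewers differ."""
    disagreements = {}
    for domain in DOMAINS:
        a, b = reviewer_a[domain], reviewer_b[domain]
        if a not in VALID_JUDGEMENTS or b not in VALID_JUDGEMENTS:
            raise ValueError(f"invalid judgement for domain {domain!r}")
        if a != b:
            disagreements[domain] = (a, b)
    return disagreements


# Hypothetical judgements for one included study.
reviewer_a = {"patient selection": "high", "index test": "low",
              "reference standard": "low", "flow and timing": "low"}
reviewer_b = {"patient selection": "high", "index test": "low",
              "reference standard": "low", "flow and timing": "unclear"}

for domain, (a, b) in find_disagreements(reviewer_a, reviewer_b).items():
    print(f"Resolve by consensus: {domain} (reviewer A: {a}, reviewer B: {b})")
```

Only the domains flagged here would need discussion; identical judgements can be carried forward unchanged.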

In conclusion, evaluating RoB in diagnostic studies is a detailed, context-specific assessment of study design, execution, and transparency. Tools like QUADAS-2 and QUADAS-C provide a structured and systematic method for this assessment, helping reviewers identify and report biases that may compromise the accuracy of test performance estimates. Critically appraising aspects such as patient selection, test blinding, reference standard reliability, and study timing helps reviewers make better judgements about the integrity of diagnostic evidence and its generalizability to clinical decision-making.

References
1. Hall MK, Kea B, Wang R. Recognising Bias in Studies of Diagnostic Tests Part 1: Patient Selection. Emerg Med J. 2019 Jul;36(7):431-434.
2. Schmidt R, Factor RW. Understanding Sources of Bias in Diagnostic Accuracy Studies. Archives of Pathology & Laboratory Medicine. 2013; 137(4):558-65.
3. Di Girolamo N, Winter A, Meursinge Reynders R. High and unclear risk of bias assessments are predominant in diagnostic accuracy studies included in Cochrane reviews. J Clin Epidemiol. 2018 Sep;101:73-78.
4. Whiting PF, et al; QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011 Oct 18;155(8):529-36.
5. Yang B, Mallett S, Takwoingi Y, et al; QUADAS-C Group. QUADAS-C: A Tool for Assessing Risk of Bias in Comparative Diagnostic Accuracy Studies. Ann Intern Med. 2021 Nov;174(11):1592-1599.
6. Bezerra CT, Grande AJ, Galvão VK, et al. Assessment of the strength of recommendation and quality of evidence: GRADE checklist. A descriptive study. Sao Paulo Med J. 2022 Nov-Dec;140(6):829-836.
MarksMan Healthcare

June 19, 2025

Risk of Bias

Diagnostic Accuracy, Evidence Appraisal, QUADAS-2 Tool, QUADAS-C Tool
Using AI to Accurately Assess Risk of Bias in Published Articles: Are We There Yet?

In the realm of medical research, the credibility and accuracy of published articles are paramount. Healthcare professionals rely on these articles to make informed decisions regarding patient care, treatment modalities, and developing clinical guidelines. However, the presence of bias in scientific studies can significantly undermine the validity and trustworthiness of their findings, potentially leading to misguided conclusions and inappropriate healthcare practices. The emergence of artificial intelligence (AI) has sparked considerable interest in utilizing its capabilities to assist in assessing bias in published articles.(1)

Bias can occur at various stages of the research process, including study design, data collection, analysis, and interpretation. Identifying and minimizing bias is crucial to ensure that research findings are reliable and can be translated effectively into clinical practice. Typically, this assessment involves thoroughly examining various aspects of a study, such as its design, methodology, data collection, analysis, and reporting. Experts evaluate factors that may introduce bias, including conflicts of interest, selective reporting, and inadequate blinding or randomization. This manual process requires expertise and can be time-consuming, especially when a large number of articles must be analyzed.

Using AI for risk of bias assessment offers several advantages over the traditional approach. By leveraging machine learning algorithms, AI tools can identify patterns and indicators of bias in titles, abstracts, and full-text articles. This accelerates the screening process, increases consistency in assessments, and provides additional insights into potential biases.(2)

While AI cannot replace human expertise, it serves as a valuable tool for initial screening and prioritization, enabling researchers and clinicians to focus their attention on articles with a lower risk of bias and supporting more timely and efficient evidence-based decision-making. The integration of AI into risk of bias assessment of published articles marks a significant advance, promising greater reliability and objectivity in evaluating scientific literature.(3)

AI algorithms can analyze vast amounts of data and identify patterns that might be challenging for humans to detect. In recent years, researchers have developed AI-based tools and techniques to assist in assessing the risk of bias in scientific studies. These tools utilize machine learning algorithms to evaluate published articles based on predefined criteria and indicators of bias.(3)

While AI has shown promise in assessing bias in scientific literature, it is crucial to emphasize the need for collaboration between researchers, clinicians, and AI experts. By combining domain expertise and technical knowledge, interdisciplinary teams can develop more accurate and reliable AI models. Ongoing research and development are necessary to refine AI models, improve their performance in detecting various types of bias, and address the limitations, such as the reliance on limited text information.(4, 5)

The application of AI in assessing the risk of bias (ROB) in scientific articles is not without its challenges. AI models may lack contextual understanding, struggle to interpret and identify subtle bias, and adapt poorly to evolving research practices. They can also perpetuate biases present in their training data, raise ethical concerns, and create accountability challenges. Further, although their predictions are reasonably accurate, AI models remain imperfect, which highlights the need for manual verification to ensure comprehensive and reliable assessments.(5)
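One way to check how closely automated judgements track manual ones during such verification is a chance-corrected agreement statistic. The sketch below uses Cohen's kappa, which is an illustrative choice and not a method described in the sources cited here; the function name and both lists of judgements are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who each assign one category per item."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical category; kappa is undefined,
        # so this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical ROB judgements for eight articles.
human = ["low", "low", "high", "unclear", "high", "low", "low", "high"]
model = ["low", "high", "high", "unclear", "low", "low", "low", "high"]

print(f"kappa = {cohens_kappa(human, model):.2f}")
```

Articles on which the two sets of judgements differ are natural candidates for the manual verification described above.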

In conclusion, AI shows promise for assessing bias in the scientific literature, but it is not yet a substitute for manual assessment. Further progress depends on interdisciplinary teams of researchers, clinicians, and AI experts refining existing models, improving their detection of diverse types of bias, and addressing inherent limitations such as the dependence on limited textual information.

References:

1. Jardim PS, Rose CJ, Ames HM, et al. Automating risk of bias assessment in systematic reviews: a real-time mixed methods comparison of human researchers to a machine learning system. BMC Medical Research Methodology. 2022 Jun 8;22(1):167.
2. Arno A, Elliott J, Wallace B, Turner T, Thomas J. The views of health guideline developers on the use of automation in health evidence synthesis. Systematic Reviews. 2021 Dec;10:1-0.
3. Soboczenski F, Trikalinos TA, Kuiper J, et al. Machine learning to help researchers evaluate biases in clinical trials: a prospective, randomized user study. BMC medical informatics and decision making. 2019 Dec;19:1-2.
4. Marshall IJ, Kuiper J, Wallace BC. Automating risk of bias assessment for clinical trials. IEEE J Biomed Health Inform. 2015 Jul;19(4):1406-12.
5. Marshall IJ, Kuiper J, Wallace BC. RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. J Am Med Inform Assoc. 2016 Jan;23(1):193-201.

MarksMan Healthcare

July 17, 2023

Artificial Intelligence, Machine Learning, Risk of Bias
Risk of Bias Assessment During Systematic Literature Reviews: Why and How?
A systematic literature review (SLR) is considered the highest form of evidence because of its rigorous approach: all relevant published and unpublished literature currently available on a topic is pooled and analyzed to answer a specific research question. For this reason, it is essential that the risk of bias (ROB) is assessed for all studies included in an SLR. ROB assessment supports the transparency of the outcomes, the validity of the evidence, and the confidence in and reproducibility of the SLR process. It is typically done by identifying systematic errors or limitations in the design, conduct, or analysis of each included study. Further, ROB assessment of the included studies is required for optimal reporting of SLRs, as recommended by the PRISMA 2020 statement. (1,2)

The tool for ROB assessment depends on the design of the included studies. For SLRs of RCTs, several tools are available; one is version 2 of the Cochrane risk-of-bias tool (RoB 2), which was published in 2019 as an update to the previous Cochrane RoB tool. According to the developers, RoB 2 is suitable for assessing ROB in individually randomized, parallel-group, and cluster-randomized trials. (3) Other tools used for assessing ROB in RCTs include the EPOC RoB tool for randomized trials of complex interventions, the Critical Appraisal Skills Programme (CASP) checklist, the Joanna Briggs Institute (JBI) critical appraisal checklist, and the Scottish Intercollegiate Guidelines Network (SIGN) critical appraisal checklists, which assess the methodological quality of different study types, including RCTs. (4-6) The choice of the most appropriate tool depends on the research question, the domains the tool covers, its availability, the type of scoring system it uses, and any specific regulatory requirement.

For SLRs including non-randomized studies, a frequently used tool is the ROBINS-I (Risk Of Bias In Non-randomized Studies – of Interventions) tool. Apart from this, other popular tools include the JBI critical appraisal checklist for non-randomized experimental studies, the EPOC RoB tool, and the methodological index for non-randomized studies (MINORS) tool. (1,5,6)

Several tools are available for observational studies. To name a few, the CASP cohort study checklist, the SIGN critical appraisal checklists, the NIH quality assessment tool, the Newcastle-Ottawa Scale, and the JBI critical appraisal checklist are recommended for cohort and case-control studies. The Appraisal tool for Cross-Sectional Studies (AXIS) is recommended for cross-sectional studies. Additionally, MINORS and the JBI critical appraisal tool can be used for ROB assessment of case series and case reports. (4-7) If an SLR includes more than one type of study design, the ROB assessment must use multiple tools, one for each study design. Alternatively, a recently developed mixed methods appraisal tool (MMAT) can be used in such a situation.(8) Finally, for SLRs of SLRs, tools such as ROBIS and AMSTAR-2 are available, which assess the risk of bias or methodological quality of the included SLRs.(9,10)
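The mapping from study design to candidate tools described above can be sketched as a lookup table. The dictionary keys and the function name are hypothetical; the tool names are those mentioned in this post, and the sketch does not rank them or replace the selection criteria discussed earlier.

```python
# Candidate ROB tools per study design, as named in the text.
TOOLS_BY_DESIGN = {
    "rct": ["RoB 2", "EPOC RoB tool", "CASP checklist", "JBI checklist", "SIGN checklist"],
    "non-randomized": ["ROBINS-I", "JBI checklist", "EPOC RoB tool", "MINORS"],
    "cohort": ["CASP cohort study checklist", "SIGN checklist", "NIH quality assessment tool",
               "Newcastle-Ottawa Scale", "JBI checklist"],
    "case-control": ["SIGN checklist", "NIH quality assessment tool",
                     "Newcastle-Ottawa Scale", "JBI checklist"],
    "cross-sectional": ["AXIS"],
    "case series": ["MINORS", "JBI checklist"],
    "systematic review": ["ROBIS", "AMSTAR-2"],
}


def tools_for(designs):
    """Return {design: candidate tools} for every design included in an SLR."""
    unknown = [d for d in designs if d not in TOOLS_BY_DESIGN]
    if unknown:
        raise KeyError(f"no tools listed for: {unknown}")
    return {d: TOOLS_BY_DESIGN[d] for d in designs}


# Hypothetical SLR that includes both RCTs and cross-sectional studies.
for design, tools in tools_for(["rct", "cross-sectional"]).items():
    print(f"{design}: {', '.join(tools)}")
```

An SLR with several designs gets one list per design, which mirrors the requirement to use a separate tool for each design unless a mixed methods tool such as MMAT is chosen instead.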

While assessing ROB of the included studies in an SLR, it is essential to ensure that the latest version of the appropriate tool is used, and that the tool is validated and reliable. It is also recommended that independent assessment of ROB is carried out, and the results consolidated so that a comprehensive ROB assessment is performed. Finally, the reporting of ROB of the included studies must follow the recommendations of the tool.

While systematic reviews are the gold standard of evidence synthesis, their findings depend heavily on the validity of the studies they include. Researchers must therefore assess whether these studies are reliable and free from bias, so that the review leads to more accurate conclusions and evidence-based recommendations.

References
1. Zeng X, Zhang Y, Kwong JS, et al. The methodological quality assessment tools for preclinical and clinical studies, systematic review and meta-analysis, and clinical practice guideline: a systematic review. J Evid Based Med. 2015 Feb;8(1):2-10.
2. Jüni P, Altman DG, Egger M. Systematic reviews in health care: Assessing the quality of controlled clinical trials. BMJ. 2001 Jul 7;323(7303):42-6.
3. Sterne JAC, Savović J, Page MJ, et al. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ. 2019 Aug 28;366:l4898
4. Nadelson S, Nadelson LS. Evidence‐based practice article reviews using CASP tools: a method for teaching EBP. Worldviews on Evidence‐Based Nursing. 2014 Oct;11(5):344-6.
5. Baker A, Young K, Potter J, Madan I. A review of grading systems for evidence-based guidelines produced by medical specialties. Clinical medicine. 2010 Aug;10(4):358.
6. Munn Z, Moola S, Lisy K, Riitano D, Tufanaru C. Methodological guidance for systematic reviews of observational epidemiological studies reporting prevalence and cumulative incidence data. Int J Evid Based Healthc. 2015;13(3):147–153.
7. Downes MJ, Brennan ML, Williams HC, Dean RS. Development of a critical appraisal tool to assess the quality of cross-sectional studies (AXIS). BMJ Open. 2016 Dec 8;6(12):e011458. doi: 10.1136/bmjopen-2016-011458.
8. Hong QN, Gonzalez-Reyes A, Pluye P. Improving the usefulness of a tool for appraising the quality of qualitative, quantitative and mixed methods studies, the Mixed Methods Appraisal Tool (MMAT). J Eval Clin Pract. 2018 Jun;24(3):459-467.
9. Whiting P, Savović J, Higgins JP, Caldwell DM, Reeves BC, Shea B, Davies P, Kleijnen J, Churchill R; ROBIS group. ROBIS: A new tool to assess risk of bias in systematic reviews was developed. J Clin Epidemiol. 2016 Jan;69:225-34.
10. Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, Moher D, Tugwell P, Welch V, Kristjansson E, Henry DA. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017 Sep 21;358:j4008.
MarksMan Healthcare

May 8, 2023

Risk of Bias, Systematic Literature Reviews
The Importance of CHARMS Checklist in the SLRs of Clinical Prediction Models
Clinical prediction models (CPMs) are statistical models that use patient characteristics and clinical variables to estimate the probability of a particular health outcome, such as a disease or adverse event. CPMs can be diagnostic prediction models, which aid diagnosis by predicting the likelihood that a person currently has a particular health condition (for example, the Wells score for pulmonary embolism), or prognostic prediction models, which aid prognosis by predicting the likelihood that a person will experience a particular health outcome over a specific period (for example, the Framingham Risk Score for cardiovascular disease). (1) CPMs have become an essential part of evidence-based clinical practice, so it is important that they provide accurate estimates for the conditions for which they are used. (2, 3) To this end, systematic literature reviews (SLRs) are often conducted to determine the quality and validity of CPMs, as well as to identify gaps in the literature to inform the development of new CPMs. (4)

The CHARMS (Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies) checklist is a tool developed to facilitate critical appraisal and data extraction in SLRs of CPMs. First published in 2014, the checklist provides explicit guidance to help reviewers and users frame the right review question. It also provides a data extraction list with explicit guidance on which items to extract from CPM studies for evaluating the risk of bias and applicability. It can be used to critically appraise all types of primary prediction model studies for all kinds of target populations, outcomes, and predictors, regardless of the statistical techniques used. (5) Thus, the CHARMS checklist helps researchers in two critical areas of conducting an SLR: framing the review question and critically appraising the included articles.

The contents of the CHARMS checklist are arranged in two sections. First, there is a list of 7 key items to help the researcher frame a well-defined and focused review question; these key items include the intended scope of the review, the type of prediction model studies, prognostic versus diagnostic prediction model, the target population to whom the prediction model applies, the outcome to be predicted, the period of the prediction, and the intended moment of using the model. Next, the checklist provides guidance for data extraction and critical appraisal of the articles included in the SLR by means of 35 key items organized in 11 domains: source of data, participants, outcome to be predicted, candidate predictors, sample size, missing data, model development, model performance, model evaluation, results, and interpretation and discussion. (4, 5)
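The 11 domains listed above can be used as the skeleton of a per-study extraction record. The sketch below only checks which domains still lack an entry; the function name, the record layout, and the example entries are hypothetical, and the individual key items within each domain are not reproduced here.

```python
# The 11 CHARMS domains as listed in the text.
CHARMS_DOMAINS = (
    "source of data", "participants", "outcome to be predicted",
    "candidate predictors", "sample size", "missing data",
    "model development", "model performance", "model evaluation",
    "results", "interpretation and discussion",
)


def missing_domains(extraction):
    """Return the CHARMS domains that have no extracted text yet."""
    missing = []
    for domain in CHARMS_DOMAINS:
        value = extraction.get(domain)
        if value is None or not str(value).strip():
            missing.append(domain)
    return missing


# Hypothetical, partially completed extraction record for one study.
record = {
    "source of data": "prospective cohort",
    "participants": "adults with suspected pulmonary embolism",
    "sample size": "",
}

print("Still to extract:", missing_domains(record))
```

A record is ready for appraisal in this sketch only once the returned list is empty.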

Since its publication in 2014, the CHARMS checklist has been used in many SLRs of CPMs. (6-10) To make the CHARMS tool easier to apply, an Excel template for data extraction and risk of bias assessment of clinical prediction models was recently published; it lets the user apply both CHARMS and PROBAST (Prediction model Risk Of Bias Assessment Tool) when critically appraising studies in SLRs of CPMs. It should also encourage more accurate and thorough reporting of these systematic reviews. (11)

Critical appraisal of included studies is an essential part of any SLR, and SLRs of CPMs are no exception. It is crucial that the tool used for critical appraisal is validated, asks the right questions, is user-friendly, and gives accurate results. The CHARMS checklist has stood the test of time and meets these requirements; together with the PROBAST tool, it has become an invaluable resource for conducting SLRs of CPMs.

References:
1. Vogenberg FR. Predictive and prognostic models: implications for healthcare decision-making in a modern recession. Am Health Drug Benefits. 2009 Sep;2(6):218-22.
2. Van Smeden M, Reitsma JB, Riley RD, et al. Clinical prediction models: diagnosis versus prognosis. Journal of clinical epidemiology. 2021 Apr 1;132:142-5.
3. Hendriksen JM, Geersing GJ, Moons KG, de Groot JA. Diagnostic and prognostic prediction models. J Thromb Haemost. 2013 Jun;11 Suppl 1:129-41.
4. Damen JAA, Moons KGM, van Smeden M, Hooft L. How to conduct a systematic review and meta-analysis of prognostic model studies. Clin Microbiol Infect. 2023 Apr;29(4):434-440. doi: 10.1016/j.cmi.2022.07.019. Epub 2022 Aug 4.
5. Moons KG, de Groot JA, Bouwmeester W, et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med. 2014 Oct 14;11(10):e1001744.
6. Wynants L, Van Calster B, Collins GS, et al. Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal. BMJ. 2020 Apr 7;369.
7. Viswanathan M, Patnode CD, Berkman ND, et al. Assessing the risk of bias in systematic reviews of health care interventions. Methods guide for effectiveness and comparative effectiveness reviews [Internet]. 2017 Dec 13.
8. Damen JA, Hooft L, Schuit E, et al. Prediction models for cardiovascular disease risk in the general population: systematic review. BMJ. 2016 May 16;353.
9. Smith EE, Kent DM, Bulsara KR, et al. Accuracy of prediction instruments for diagnosing large vessel occlusion in individuals with suspected stroke: a systematic review for the 2018 guidelines for the early management of patients with acute ischemic stroke. Stroke. 2018 Mar;49(3):e111-22.
10. Meehan AJ, Lewis SJ, Fazel S, et al. Clinical prediction models in psychiatry: a systematic review of two decades of progress and challenges. Molecular Psychiatry. 2022 Jun;27(6):2700-8.
11. Fernandez-Felix BM, López-Alcalde J, Roqué M, Muriel A, Zamora J. CHARMS and PROBAST at your fingertips: a template for data extraction and risk of bias assessment in systematic reviews of predictive models. BMC Medical Research Methodology. 2023 Dec;23(1):1-8.
MarksMan Healthcare

April 10, 2023

CHARMS, Clinical Prediction Models, Risk of Bias, Systematic Literature Reviews