• Federated Data Networks (FDNs): Enhancing the Quality of RWE Research

    Real-World Data (RWD), from which Real-World Evidence (RWE) is generated, has the unique capability of depicting real-world outcomes. RWD can also shorten research and development timelines and yield deeper insights into disease processes. However, RWD from a single source often suffer from bias related to equipment, lack of phenotypic diversity, limited training models, and insufficiently diverse cohorts. RWD are also scattered across sources and structured in diverse formats, which makes it difficult to unlock their full value. Furthermore, health data are personal, highly sensitive, and subject to data privacy rights and regulations.[1] Given these barriers, new methods are needed to unlock the full potential of RWD, and Federated Data Networks (FDNs) are one such attempt.

    FDNs are networks of decentralized, interconnected nodes that allow data to be queried and analysed by other nodes in the network without the data ever leaving the parent node.[1] The member nodes are governed by a common framework that provides harmonized standards and tools for data access. Each node remains semi-independent, deciding for itself whether to grant data access. Because shared data are masked, blocked, or anonymized, member nodes have little insight into the identities behind the data held by other nodes; as a result, data ownership is preserved. Algorithms are trained collaboratively, without exchanging data, through an approach called federated learning. FDNs thus enable safe data mining with regulated access to diverse data, without breaching legal barriers.
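
    As a minimal sketch of the federated learning idea (plain Python with NumPy; the three node datasets and the logistic model are invented stand-ins, not any specific FDN's method), each node computes a gradient update on its own records, and only the model weights, never the patient data, travel back to the coordinator:

        # Minimal federated averaging (FedAvg) sketch -- illustrative only.
        # Each "node" holds its own private data; only model weights travel.
        import numpy as np

        rng = np.random.default_rng(0)

        # Hypothetical local datasets at three member nodes (features X, labels y).
        nodes = [(rng.normal(size=(100, 5)), rng.integers(0, 2, 100)) for _ in range(3)]

        def local_update(w, X, y, lr=0.1):
            """One gradient step of logistic regression on a node's private data."""
            p = 1 / (1 + np.exp(-X @ w))      # predicted probabilities
            grad = X.T @ (p - y) / len(y)     # logistic-loss gradient
            return w - lr * grad

        w_global = np.zeros(5)
        for _ in range(50):
            # Each node trains locally; raw records never leave the node.
            local_weights = [local_update(w_global.copy(), X, y) for X, y in nodes]
            # The coordinator averages the weights (the only values shared).
            w_global = np.mean(local_weights, axis=0)

        print("Trained global weights:", w_global)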

    In contrast to data sharing, data transfer, or data pooling, an FDN is essentially "data visiting": the analysis travels to the data, and the results are then applied to, or used to refine, existing guidelines and practice.[1] To illustrate, consider two people on a phone call: ideas are exchanged without either party sharing their identity documents or billing address. Similarly, FDNs share only mathematical values and metadata, never confidential patient identities. An example of an FDN is the TIES (Text Information Extraction System) Cancer Research Network (TCRN), which has 4 active nodes that search across 5.8 million cases and 2.5 million patients to assemble cohorts with rare phenotypes.[2]
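
    To make "data visiting" concrete, here is a hedged sketch (the node registries, field names, and query are all hypothetical) of a federated cohort count: the query visits each node, and only an aggregate number returns to the coordinator.

        # "Data visiting" sketch: a query visits each node; only aggregates return.
        # Node registries and the query parameters below are hypothetical.
        node_registries = {
            "node_a": [{"age": 67, "diagnosis": "stroke"}, {"age": 54, "diagnosis": "copd"}],
            "node_b": [{"age": 71, "diagnosis": "stroke"}, {"age": 49, "diagnosis": "stroke"}],
        }

        def local_cohort_count(records, diagnosis, min_age):
            """Runs inside the node; raw records never leave it."""
            return sum(1 for r in records if r["diagnosis"] == diagnosis and r["age"] >= min_age)

        # The coordinator sees only per-node counts (metadata), never patient rows.
        counts = {name: local_cohort_count(recs, "stroke", 60)
                  for name, recs in node_registries.items()}
        print(counts, "total:", sum(counts.values()))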

    FDNs can have a substantial impact on stakeholders such as physicians, hospitals, insurance companies, researchers, and patients. With the surge in digital health devices, the federated model offers physicians and hospitals strong model-training options. FDNs are useful in disease classification, mortality forecasting, and predicting treatment outcomes. Proven applications include stroke prevention, improving patient pathways in cancer and coronary artery disease, classification of EEG recordings, brain tumor classification, breast density classification, multi-disease chest X-ray classification, adverse drug reaction prediction, recognition of human activity and emotion, and prediction of oxygen requirements in COVID-19 patients.[3,4] A federated approach using diverse datasets from different institutions achieved 98.3% accuracy in COVID-19 detection, 95.4% accuracy in recognizing human activity and emotions, and 97.7% accuracy in mortality prediction.[3]

    In clinical research, FDNs support protocol optimization, patient selection, and adverse effect monitoring, and they facilitate translational research. A federated approach also opens the way to research on rare diseases, where incidence rates are low and datasets small. FDNs save time and resources by rapidly identifying target patients for recruitment into clinical trials, and they aid disease surveillance by pooling data across geographies.[5] For manufacturers, FDNs facilitate continuous product validation and improvement.

    To ensure data privacy, FDNs operate in line with regulatory provisions such as Europe's General Data Protection Regulation (GDPR), the UK's Data Protection Act (DPA), the United States' Health Insurance Portability and Accountability Act (HIPAA), the California Consumer Privacy Act (CCPA), India's CDSCO requirements, China's Personal Information Protection Law (PIPL), Cybersecurity Law (CSL), and Data Security Law (DSL), and Russia's law On Personal Data (OPD).[6]

    Despite their utility in data networking, FDNs face challenges in creating robust RWE. Insufficient and inconsistent data, uneven data quality, bias, and lack of data standardization can all lead to inconsistent conclusions. Cloud infrastructure, which is crucial for developing FDNs, is not available at every healthcare institution. In practice, debugging and optimizing FDNs is strenuous because hardware and networking differ across sites, making the learning environments heterogeneous.[7] Another important challenge is the discrepancy in research grant funding: larger hospitals may contribute more datasets and may expect correspondingly larger research grants. Funding, however, should reflect the value of contributions rather than the size of datasets, and the core difficulty lies in accurately scaling the value of those contributions.[7]

    The success of FDNs rests on strong, consistent governance coupled with open lines of communication among partners. An incentive-based approach can further boost the quality and quantity of data contributions from member nodes. With these steps, FDNs can significantly increase external validity and enhance the robustness and quality of RWD and the resulting RWE, without the need to centralize datasets, thereby helping realize the promise of precision medicine.

    References

    1. Hallock H et al. Federated networks for distributed analysis of health data. Frontiers in Public Health. 2021;9.
    2. Jacobson R et al. A federated network for translational cancer research using clinical data and biospecimens. Cancer Research. 2015;75(24):5194-5201.
    3. Prayitno et al. A systematic review of federated learning in the healthcare area: from the perspective of data properties and applications. Applied Sciences. 2021;11(23):11191.
    4. Joshi M et al. Federated learning for healthcare domain – pipeline, applications and challenges. ACM Transactions on Computing for Healthcare. 2022. https://dl.acm.org/doi/10.1145/3533708
    5. Au F. Aggregated data or federated data: is one better than the other? https://blog.orionhealth.com/aggregated-data-or-federated-data-is-one-better-than-the-other/
    6. The best of both worlds: benefits of applying AI/ML in a federated data network. https://www.bcplatforms.com/the-best-of-both-worlds-benefits-of-applying-ai-ml-in-a-federated-data-network/
    7. Ng D et al. Federated learning: a collaborative effort to achieve better medical imaging models for individual sites that have small labelled datasets. Quantitative Imaging in Medicine and Surgery. 2021;11(2):852-857.
  • Text Mining for Search Term Development Aiding in Conduct of Better SLRs

    Systematic literature reviews (SLRs) are widely used to pool and present the findings of multiple studies in a dependable way and often inform policy and practice guidelines. (1) A key feature of SLRs is the application of scientific methods to identify and minimise bias and error in the selection and treatment of studies. (2) However, the growing number of published studies, and the rate at which they appear, makes it ever more complicated and time-consuming to identify relevant studies in an unbiased way. (3)

    To reduce the impact of publication bias, reviewers usually try to identify all relevant research for inclusion in SLRs. This is challenging and laborious, and the challenge is growing with the number of databases to search and the volume of papers and journals being published. Furthermore, evidence suggests an inherent North American bias in several major bibliographic databases (e.g. PubMed), so a range of smaller databases also needs to be searched to identify research for reviews aiming to maximise external validity. (4) This in turn requires a multi-layered approach: extensive Boolean searches of electronic bibliographic databases, specialised registers, and websites. (5)

    Unfortunately, sensitive electronic searches of bibliographic databases have low specificity. Consequently, reviewers often end up manually sifting through many thousands of irrelevant titles and abstracts to identify the much smaller number of relevant ones, a process known as 'screening'. (6) An experienced reviewer takes roughly between 30 seconds and a few minutes to evaluate a citation, so a screening load of 10,000 citations is considerable. (7) At the same time, reviews intended to inform policy and practice must be completed within (often short) timetables and limited budgets, while remaining comprehensive enough to accurately reflect the state of knowledge in a given area. (5)

    Text mining has been suggested as a prospective solution to these practical issues, since automating part of the screening process can save time. (5) Text mining is defined as 'the process of discovering knowledge and structure from unstructured data (i.e., text)'. (8,9) There are two particularly promising ways in which text mining can support screening in SLRs: i) by prioritising the list of items for manual screening, so that the studies most likely to be relevant appear at the top of the list; and ii) by using manually assigned include/exclude decisions to learn categorisations that can then be applied automatically. (10) Prioritisation may not lessen the total workload, but identifying most of the relevant studies first enables other members of the team to proceed with the next stages of the review while the remaining, largely irrelevant citations are still being screened. This reduces turnaround time even if the overall workload does not. (5)
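
    A minimal sketch of both ideas, assuming scikit-learn is available (the abstracts and include/exclude labels below are invented): a classifier learns from manual screening decisions (ii) and its predicted probabilities are used to rank the unscreened citations so likely includes surface first (i).

        # Screening-prioritisation sketch: rank unscreened abstracts by predicted
        # relevance, using a handful of manually screened examples as training data.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression

        screened_texts = [
            "randomised trial of drug X for hypertension",
            "cohort study of drug X cardiovascular outcomes",
            "editorial on conference logistics",
            "news item about hospital funding",
        ]
        screened_labels = [1, 1, 0, 0]  # 1 = include, 0 = exclude (manual decisions)

        unscreened = [
            "pragmatic trial of drug X in primary care hypertension",
            "letter to the editor about journal policy",
        ]

        vec = TfidfVectorizer()
        clf = LogisticRegression().fit(vec.fit_transform(screened_texts), screened_labels)

        # Rank unscreened citations: most likely relevant first.
        scores = clf.predict_proba(vec.transform(unscreened))[:, 1]
        for score, text in sorted(zip(scores, unscreened), reverse=True):
            print(f"{score:.2f}  {text}")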

    The benefits of text mining for SLRs are especially clear when developing database search strings for topics described by diverse terminology. Stansfield et al. have suggested five ways in which text mining tools can aid in developing the search strategy: (11,12)

    • Improving the precision of searches – Framing more precise phrases instead of single-word terms
    • Identifying search terms to improve search sensitivity – Using additional search terms
    • Aiding the translation of search strategies across databases
    • Searching and screening within an integrated system
    • Developing objectively derived search strategies

    The utility of these tools depends on their different competencies, the way they are used, and the text analysed. (11)
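
    As a simple illustration of the second item in the list above, the sketch below (plain Python; the records and stopword list are invented) counts frequent terms in a set of known relevant titles to surface candidate keywords a reviewer might add to the search string:

        # Candidate search-term mining sketch: frequent terms in known relevant
        # records suggest additional keywords for the Boolean search string.
        import re
        from collections import Counter

        relevant_records = [
            "Telehealth consultations for chronic heart failure management",
            "Remote monitoring and telemedicine in heart failure care",
            "Virtual care pathways for cardiac patients",
        ]
        stopwords = {"for", "and", "in", "the", "of"}

        terms = Counter(
            word
            for record in relevant_records
            for word in re.findall(r"[a-z]+", record.lower())
            if word not in stopwords
        )
        print(terms.most_common(5))  # most frequent content words = candidate terms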

    Moreover, Li et al. have proposed a text mining framework to reduce the abstract-screening burden and to provide a high-level information summary while conducting SLRs. The framework includes three self-defined, semantics-based ranking metrics covering keyword, indexed-term, and topic relevance, and it has been reported to reduce the labour of SLRs substantially while maintaining comparatively high recall. (13)

    An array of issues concerning text mining makes it difficult to identify a single most effective approach for its use in SLRs. There are, however, key messages and toolsets for applying text mining in the SLR context. Future research in this area should address the duplication of evaluations as well as the feasibility of the toolsets across a range of subject-matter areas. (5)

    References 

    1. Gough D, Oliver S, Thomas J. An Introduction to Systematic Reviews. London: Sage; 2012.
    2. Gough D, Thomas J, Oliver S. Clarifying differences between review designs and methods. Syst Rev 2012; 1(28).
    3. Bastian H, Glasziou P, Chalmers I. Seventy-five trials and eleven systematic reviews a day: how will we ever keep up? PLoS Med 2010; 7(9).
    4. Gomersall A, Cooper C. Database selection bias and its effect on systematic reviews: a United Kingdom perspective. Joint Colloquium of the Cochrane and Campbell Collaborations. Keystone, Colorado: The Campbell Collaboration; 2010.
    5. O’Mara-Eves A, Thomas J, McNaught J, et al. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev 2015; 4(1):5.
    6. Lefebvre C, Manheimer E, Glanville J. Searching for studies (chapter 6). In: Higgins J, Green S, editors. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. Oxford: The Cochrane Collaboration; 2011.
    7. Allen I, Olkin I. Estimating time to conduct a meta-analysis from number of citations retrieved. JAMA 1999; 282(7):634-5.
    8. Ananiadou S, McNaught J. Text Mining for Biology and Biomedicine. Boston/London: Artech House; 2006.
    9. Hearst M. Untangling text data mining. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL 1999); 1999. pp. 3–10.
    10. Thomas J, McNaught J, Ananiadou S. Applications of text mining within systematic reviews. Res Synth Methods 2011; 2(1):1–14.
    11. Stansfield C, O’Mara-Eves A, Thomas J. Text mining for search term development in systematic reviewing: A discussion of some methods and challenges. Res Synth Methods 2017; 8(3):355-365.
    12. Gore G. Text mining for searching and screening the literature. McGill. April, 2019. 
    13. Li D, Wang Z, Wang L, et al. A Text-Mining Framework for Supporting Systematic Reviews. Am J Inf Manag 2016; 1(1):1-9.

    Written by: Ms. Tanvi Laghate

  • How to Improve Healthcare Outcomes with Key Analytic Tools?

    Healthcare outcomes are defined as the changes observed and/or recorded in the health status of an individual or a population, usually attributable to an intervention, measure, or specific healthcare investment. (1) The goal is to save lives, shorten hospital stays, and build healthier communities through preventative measures. (2) The fundamental steps in improving outcomes are measuring, reporting, and analysing them. Efficient synthesis, organization, and analysis of healthcare data give healthcare providers and other stakeholders systematic, insightful support for treatment, preventive measures, and diagnosis, which may lead to higher-quality patient care and better outcomes at lower costs.

    Healthcare industries generate a huge amount of information, known as 'big data', driven by record keeping, compliance and regulatory requirements, the potential to improve healthcare delivery, and the digitalization of historical data. (3) This includes clinical data from hospitals, clinics, pharmacies, pathology laboratories, diagnostic/imaging reports, health insurers, and administrative sources; individual patient data in electronic patient records (EPR) across the phases of clinical trials; pre-clinical data; hospitalization-frequency data; research articles and reviews in scientific and medical journals; information from various healthcare data resources; social media posts on different platforms; and less patient-specific information such as emergency care records, news feeds, and healthcare magazines. (4) By some reports, data from the U.S. alone may soon reach the yottabyte (10^24 gigabytes) scale. (3) There is a pressing need to rapidly transform these volumes of aggregated healthcare data into value-based healthcare.

    The analysis and assessment of huge healthcare datasets can be performed using advanced platforms and tools able to handle structured, semi-structured, and unstructured data. Data from disparate sources must be connected, matched, cleansed, and prepared for processing via the three main steps of extract, transform, and load (ETL). (4) Key platforms and tools for handling 'big data' include the Hadoop Distributed File System, MapReduce, Pig and Pig Latin, Hive, Jaql, ZooKeeper, HBase, Cassandra, Oozie, Lucene, Avro, and Mahout. (3) Analytic tools combine knowledge- and data-driven insights for risk-factor identification and care augmentation, with important applications in queries, reports, online analytical processing (OLAP), and data mining. (3) They can search and analyse massive quantities of information from past treatments, newly published research, and healthcare databases to predict outcomes for an individual patient. (5)
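
    A hedged sketch of the extract-transform-load step described above, using pandas (the file names, column names, and cleaning rules are all hypothetical):

        # Minimal extract-transform-load (ETL) sketch with pandas.
        # File names, columns, and cleaning rules below are hypothetical.
        import pandas as pd

        # Extract: read raw records from two disparate sources.
        labs = pd.read_csv("lab_results.csv")      # e.g. patient_id, glucose_mgdl
        visits = pd.read_csv("clinic_visits.csv")  # e.g. patient_id, visit_date

        # Transform: cleanse, match, and standardise.
        labs = labs.dropna(subset=["glucose_mgdl"])
        labs["glucose_mmol"] = labs["glucose_mgdl"] * 0.0555  # mg/dL -> mmol/L
        merged = visits.merge(labs, on="patient_id", how="inner")

        # Load: write the prepared table to the analytic store.
        merged.to_csv("analytic_table.csv", index=False)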

    Data analytic tools benefit all components of the healthcare system, improving outcomes for healthcare service providers, patients, payers, stakeholders, and management. (6) Healthcare providers can develop new strategies and care plans, for example to reduce unnecessary hospitalizations and expenses. Patients at greatest risk of readmission can be identified and given guidance on follow-up, making resource use more efficient and saving a large share of the money spent each year on unnecessary hospitalization.

    A time gap always exists between a clinical event and the moment its information reaches healthcare decision makers, during which opportunities for positive outcomes can be missed. Near real-time health surveillance can be performed using information from blogs, micro-blogging on social networking sites such as Twitter and Facebook, and newspaper articles. (7) These social media networks provide location information through geo-tagged alerts. Real-time analytic tools bring disparate information from various sources together at the point of patient care, where the benefit can genuinely be life-saving. They give the healthcare system access to the most up-to-date information; realign tasks according to the priorities of healthcare providers, stakeholders, and insurers; and address gaps in care, quality, risk, utilization, and regulatory requirements to support improvements in clinical, quality, and financial performance. They can also provide a real-time report of a patient's actual health status, with suggestions for improving quality, achieving compliance, and realizing full reimbursement for services. (8)
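
    As a toy illustration of near real-time surveillance, the sketch below (generic Python; the message feed, locations, and keywords are invented stand-ins for a live social media stream) counts symptom mentions per location as geo-tagged posts arrive:

        # Near-real-time surveillance sketch: count symptom keywords per location
        # as geo-tagged posts stream in. The feed below is an invented stand-in.
        from collections import Counter

        incoming_posts = [
            {"text": "terrible flu this week", "geo": "Chicago"},
            {"text": "flu shots available downtown", "geo": "Chicago"},
            {"text": "lovely weather today", "geo": "Austin"},
        ]
        KEYWORDS = {"flu", "fever", "cough"}

        alerts = Counter()
        for post in incoming_posts:  # in practice, a live stream
            words = set(post["text"].lower().split())
            if words & KEYWORDS:
                alerts[post["geo"]] += 1

        print(alerts)  # e.g. Counter({'Chicago': 2})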

    It is often difficult for patients and clinicians to keep track of the various organization-specific healthcare programs. Analytic tools can show clinicians the right program in which an eligible patient may enrol at the right time, helping improve care and decrease costs. (8) Healthcare providers can assess patient-specific eligibility, gaps in care, risk scores, and historical medical information at the point of care, all of which can be integrated into their existing operational model.
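
    A minimal rules-based sketch of point-of-care eligibility checking (the program names, rules, thresholds, and patient fields are all hypothetical):

        # Point-of-care eligibility sketch: match a patient record against
        # hypothetical program rules and flag gaps in care.
        patient = {"age": 68, "conditions": {"diabetes"}, "last_hba1c_days_ago": 400}

        programs = {
            "diabetes_management": lambda p: "diabetes" in p["conditions"],
            "senior_wellness": lambda p: p["age"] >= 65,
        }

        eligible = [name for name, rule in programs.items() if rule(patient)]
        care_gaps = ["HbA1c overdue"] if patient["last_hba1c_days_ago"] > 365 else []

        print("Eligible programs:", eligible)
        print("Gaps in care:", care_gaps)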

    Analytic tools improve healthcare outcomes by reducing the effort and time required to handle 'big data' and by converting volume into value-based information. They help deliver quality care to patients while benefitting payers and investors, and they will significantly support the advancement of medical and health science.

    References

    1. Velentgas P., Dreyer N.A., and Wu W. A. (eds) Outcome Definition and Measurement. In ‘Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide’. Rockville, MD: Agency for Healthcare Research and Quality; AHRQ Publication No. 12(13)-EHC099, 2013.
    2. Kumar P. How real time analytics improves outcomes in healthcare. Published online on ‘IBM Cloud Blog’ dated June 19, 2017.
    3. Raghupathi W. and Raghupathi V. (2014). Big data analytics in healthcare: promise and potential. Health Information Science and Systems 2, 3
    4. Gandomi A., and Haider M. (2015). Beyond the hype: Big data concepts, methods, and analytics. International Journal of information Management 35, 137-144.
    5. Winters-Miner L.A. (2014) Seven ways predictive analytics can improve healthcare. Medical predictive analytics have the potential to revolutionize healthcare around the world. Published online on ‘Elsevier’s Daily stories for the science, Technology and health communities’ on Oct 06, 2014.
    6. Sun J. and Reddy C.K. (2013). Big data analytics for healthcare. Published in ‘KDD 2013 Proceedings of the 19th ACM SIAM International Conference on Knowledge Discovery and Data Mining’ held at Austin, TX, pg 1525-1525.
    7. Lee K., Agrawal A., and Choudhary A. (2013) Real-time disease surveillance using Twitter data: demonstration on flu and cancer. Published in ‘KDD 2013 proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining’, held at Chicago, Illinois, USA, pg 1474-1477.
    8. Rizzo D. The power of real-time analytics at the point of care. Published online on ‘Health IT Outcomes: Guest Column’ dated Dec 14, 2015.
  • How Patient Records Abstraction Can Help in Healthcare Decision Making?

    Patient Records Abstraction (PRA) is a manual process of searching through a medical record to identify data required for a particular secondary use. It consists of directly matching information found in the record to the data required, but also includes operations on the data such as categorizing, coding, transforming, interpreting, summarizing, and calculating. The abstraction ultimately summarizes information about a patient for a specific secondary data use. (1) PRA typically involves reviewing patient files and abstracting (i.e., extracting) key data, which are then entered into electronic files. (2) Depending on the measure or purpose, data may be collected from different sources such as paper medical records, electronic medical records (EMR), patient surveys, and administrative databases.
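
    To illustrate the kinds of operations PRA involves (matching, coding, categorizing, calculating), here is a hedged sketch in Python; the note text, extraction patterns, and code mappings are invented for illustration, not a real abstraction tool:

        # Record-abstraction sketch: pull required data elements out of a free-text
        # note, then code and categorise them. Note text and mappings are invented.
        import re

        note = "58 y/o male, BP 142/91, current smoker, discharged 2024-03-02"

        abstracted = {
            "age": int(re.search(r"(\d+)\s*y/o", note).group(1)),
            "systolic_bp": int(re.search(r"BP\s*(\d+)/\d+", note).group(1)),
            "smoking_code": "Z72.0" if "smoker" in note else None,  # coding step
        }
        # Categorising/calculating step: derive a category from a raw value.
        abstracted["bp_category"] = "elevated" if abstracted["systolic_bp"] >= 130 else "normal"

        print(abstracted)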

    PRA helps in reviewing large or small datasets and documents for information that can support future decision making. (3) It often involves collecting organizationally defined, clinically relevant data elements that do not convert electronically from the legacy system into the new target system. The process therefore makes detailed patient data instantly available in the electronic chart in a fast, accurate, and cost-effective manner, (4) facilitating access to care without reference to a paper chart or an EMR. (5)

    To support informed decision making with PRA, a tool referred to as an 'abstraction hierarchy' (AH) is often used to facilitate cognitive work analysis (CWA). (6) These hierarchies can be used to build depictions of patient care that are consistent with biomedical knowledge, making medical problem solving easier and acting as a frame of reference. (7)

    Studies exploring different aspects of the AH also suggest it can be useful in implementing shared decision making (SDM) to improve patient care through active patient engagement. SDM is an advantageous approach to care, as patient involvement in decision making can improve health outcomes, provide an enhanced ethical framework for clinicians to deliver appropriate care, and improve the efficiency of the health system. (7)

    With the widespread use of EMRs, researchers have found ways to mine huge amounts of patient data to identify the best predictors of health outcomes. Yet any single EMR system likely holds only a subset of the information needed for a particular clinical trial. This is where PRA can come into play, providing the necessary overall data and substantially saving time and effort. (8) Consistent improvement in healthcare data quality plays a vital role in the planning, development, and maintenance of healthcare services; it can affect clinical and administrative decision making in many ways, increasing patient safety and improving the efficacy of clinical care pathways. (9)

    Alongside these advances in healthcare technology, the concept of 'Big Data' is emerging, a term apparently first derived from an IT strategic consulting group's approach to managing the volume, velocity, and variety of data. (10) Researchers argue that EMRs, which contain huge volumes of patient data across a variety of domains, can themselves be considered 'Big Data': reports suggest the United States alone will soon see one billion patient visits documented per year in EMR systems, (11) and the amount of additional data available on medical conditions, underlying genetics, medications, and treatment approaches is large. (12)

    Furthermore, the use of 'real-world data' (RWD), which contributes to 'real-world evidence' (RWE), is also on the rise. RWD and the associated RWE may constitute valid scientific evidence, depending on the characteristics of the data. Making better choices about health and healthcare requires the best possible evidence, yet many decisions made today lack the high-quality evidence derived from randomized controlled trials or well-designed observational studies. Rich, diverse sources of digital data, such as EMRs, claims data, consumer data, and chart reviews, cumulatively referred to as 'Big Data', are therefore becoming widely available for research, facilitating data extraction with the help of robust abstraction tools. In addition, chart abstraction methodologies that integrate physician insights are emerging. These will ensure a better understanding of i) ways to address common obstacles and limitations to RWD collection, and ii) the importance of community physician relationships for study implementation and success. The research and healthcare communities consequently have the opportunity to support improved healthcare decision making. (13)

    References

    1. Nahm M. Data accuracy in medical record abstraction. The University of Texas School of Health Information Sciences at Houston.
    2. Half R. What skills do you need to be an electronic medical records abstractor/auditor? July, 2016.
    3. Rasmussen D. How data abstraction is going to change healthcare forever. September, 2016.
    4. Improving patient records: Conclusions and Recommendations. The computer-based patient record: An essential technology for health care. 1997.
    5. Nahm M. Data accuracy in medical record abstraction. The University of Texas School of Health Information Sciences at Houston.
    6. St-Maurice JD, Burns CM. Modeling Patient Treatment With Medical Records: An Abstraction Hierarchy to Understand User Competencies and Needs. Eysenbach G, ed. JMIR Human Factors 2017; 4(3):e16.
    7. Hajdukiewicz JR, Vicente KJ, Doyle DJ, et al. Modeling a medical environment: an ontology for integrated medical informatics design. Int J Med Inform. 2001; 62(1):79–99.
    8. Jones G. EMR to EDC for RWE. May, 2017.
    9. Adeleke IT, Adekanye AO, Onawola KA, et al. Data quality assessment in healthcare: a 365-day chart review of inpatients’ health records at a Nigerian tertiary hospital. Journal of the American Medical Informatics Association : JAMIA. 2012; 19(6):1039-1042.
    10. Laney D. 3D Data Management: Controlling Data Volume, Velocity, and Variety. META Group; 2001.
    11. Hripcsak G, Albers DJ. Next-generation phenotyping of electronic health records. J Am Med Inform Assoc 2013; 20(1):117–21.
    12. Ross MK, Wei W, Ohno-Machado L. “Big Data” and the Electronic Health Record. Yearbook of Medical Informatics 2014; 9(1):97-104.
    13. Califf RM, et al. Transforming evidence generation to support health and health care decisions. N Engl J Med 2016; 375:2395-2400