Systematic literature reviews (SLRs) are widely used to pool and present the findings from multiple studies in a dependable way, and often inform policy and practice guidelines. (1) An important feature of SLRs is the application of scientific methods to identify and minimise bias and error in the selection and treatment of studies. (2) However, the growing number and pace of published studies make it increasingly complicated and time-consuming to identify relevant studies in an unbiased way. (3)

To reduce the impact of publication bias, reviewers usually attempt to identify all relevant research for inclusion in SLRs. This has always been challenging and laborious, and the challenge is growing with the number of databases to search and the volume of papers and journals being published. Furthermore, evidence suggests an inherent North American bias in several major bibliographic databases (e.g. PubMed), so reviews aiming to maximise external validity must also search a range of smaller databases. (4) This requires a multi-layered approach: extensive Boolean searches of electronic bibliographic databases, supplemented by specialised registers and websites. (5)

Unfortunately, sensitive electronic searches of bibliographic databases have low specificity. Consequently, reviewers often end up manually sifting through many thousands of irrelevant titles and abstracts to identify the much smaller number of relevant ones, a process known as ‘screening’. (6) An experienced reviewer can take between 30 seconds and several minutes to evaluate a single citation, so a screening set of 10,000 citations represents a considerable workload. (7) At the same time, reviews intended to inform policy and practice must be completed to (often short) timetables and within limited budgets, yet must remain comprehensive if they are to reflect the state of knowledge in a given area accurately. (5)

Text mining has been suggested as a potential solution to these practical issues, as automating part of the screening process can save considerable time. (5) Text mining is defined as ‘the process of discovering knowledge and structure from unstructured data (i.e., text)’. (8,9) There are two particularly promising ways in which text mining can support screening in SLRs: i) by prioritising the list of items for manual screening, so that the studies most likely to be relevant appear at the top of the list; and ii) by learning from manually assigned include/exclude decisions and then applying such categorisations automatically. (10) Prioritising relevant items may not lessen the total workload, but identifying most of the relevant studies early allows some members of the team to proceed to the next stages of the review while the remaining, largely irrelevant citations are screened by others. This reduces the turnaround time, even if the total workload does not. (5)
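To make the prioritisation idea concrete, here is a minimal, purely illustrative sketch in Python (the function names and example citations are invented for illustration): it builds a term profile from abstracts already judged relevant and ranks unscreened citations by cosine similarity to that profile. Real screening tools use far richer features and trained classifiers.

```python
from collections import Counter
import math

def tokens(text):
    # Lowercase, split on non-alphabetic characters, drop very short words
    cleaned = ''.join(c if c.isalpha() else ' ' for c in text.lower())
    return [t for t in cleaned.split() if len(t) > 2]

def cosine(a, b):
    # Cosine similarity between two term-frequency Counters
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def prioritise(unscreened, included_abstracts):
    # Build a relevance profile from abstracts already judged relevant,
    # then rank unscreened citations by similarity, most relevant first
    profile = Counter()
    for text in included_abstracts:
        profile.update(tokens(text))
    return sorted(unscreened, key=lambda c: cosine(Counter(tokens(c)), profile), reverse=True)

included = ["Randomised trial of text mining to reduce screening workload in systematic reviews"]
queue = [
    "Survey of garden bird feeding habits",
    "Machine learning to prioritise citation screening for systematic reviews",
]
ranked = prioritise(queue, included)
print(ranked[0])
```

Screening the top of such a ranked list first is what allows the relevant studies to surface early, even though every citation is still eventually checked by a human.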

Text mining is also clearly beneficial for developing database search strings for topics described by diverse terminology. Stansfield et al. have recently suggested five ways in which text mining tools can aid in developing the search strategy: (11,12)

  • Improving the precision of searches – Framing more precise phrases instead of single-word terms
  • Identifying search terms to improve search sensitivity – Using additional search terms
  • Aiding the translation of search strategies across databases
  • Searching and screening within an integrated system
  • Developing objectively derived search strategies
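As a rough illustration of the second point above (identifying additional terms to improve sensitivity), the sketch below counts frequent content words in abstracts of known relevant "seed" studies; the most frequent terms become candidate additions to the search string. All names and example data are hypothetical, and production term-extraction tools are considerably more sophisticated.

```python
from collections import Counter

STOPWORDS = {"the", "and", "of", "in", "for", "with", "a", "to", "on", "using"}

def suggest_terms(abstracts, top_n=5):
    # Count content words across abstracts of known relevant studies;
    # the most frequent ones are candidate additional search terms
    counts = Counter()
    for text in abstracts:
        words = [w.strip('.,;:()').lower() for w in text.split()]
        counts.update(w for w in words if len(w) > 3 and w not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_n)]

seed_abstracts = [
    "Telehealth interventions for diabetes self-management in adults",
    "Remote monitoring and telehealth for diabetes care",
]
terms = suggest_terms(seed_abstracts, top_n=3)
print(terms)
```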

The utility of these tools depends on their individual capabilities, the way they are used, and the text being analysed. (11)

Moreover, Li et al. have recently proposed a text-mining framework to reduce the burden of abstract screening and to provide a high-level information summary while conducting SLRs. The framework includes three self-defined semantics-based ranking metrics covering keyword, indexed-term, and topic relevance, and has been reported to reduce the labour of SLRs substantially while maintaining comparably high recall. (13)
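The keyword-relevance idea can be illustrated with a toy metric (this is not Li et al.'s actual formulation; the function name and data are invented): each abstract is scored by the fraction of reviewer-supplied keywords it contains, and abstracts can then be ranked by that score.

```python
def keyword_relevance(abstract, keywords):
    # Fraction of reviewer-supplied keywords that appear in the abstract
    text = abstract.lower()
    hits = sum(1 for k in keywords if k.lower() in text)
    return hits / len(keywords)

keywords = ["systematic review", "screening", "text mining"]
abstracts = {
    "A": "Text mining methods for screening abstracts in a systematic review",
    "B": "Clinical outcomes of a new surgical technique",
}
scores = {label: keyword_relevance(text, keywords) for label, text in abstracts.items()}
print(scores)
```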

An array of different issues makes it difficult to identify a single, most effective approach to using text mining in SLRs. There are, however, key messages and toolsets for applying text mining in the SLR context. Future research in this area should address the duplication of evaluations as well as the feasibility of the toolsets for use across a range of subject-matter areas. (5)



  1. Gough D, Oliver S, Thomas J. An Introduction to Systematic Reviews. London: Sage; 2012.
  2. Gough D, Thomas J, Oliver S. Clarifying differences between review designs and methods. Syst Rev 2012; 1(28).
  3. Bastian H, Glasziou P, Chalmers I. Seventy-five trials and eleven systematic reviews a day: how will we ever keep up? PLoS Med 2010; 7(9).
  4. Gomersall A, Cooper C. Database selection bias and its effect on systematic reviews: a United Kingdom perspective. Joint Colloquium of the Cochrane and Campbell Collaborations. Keystone, Colorado: The Campbell Collaboration; 2010.
  5. O’Mara-Eves A, Thomas J, McNaught J, et al. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev 2015; 4(1):5.
  6. Lefebvre C, Manheimer E, Glanville J. Searching for studies (chapter 6). In: Higgins J, Green S, editors. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. Oxford: The Cochrane Collaboration; 2011.
  7. Allen I, Olkin I. Estimating time to conduct a meta-analysis from number of citations retrieved. JAMA 1999; 282(7):634-5.
  8. Ananiadou S, McNaught J. Text Mining for Biology and Biomedicine. Boston/London: Artech House; 2006.
  9. Hearst M. Untangling Text Data Mining. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL 1999); 1999. pp. 3–10.
  10. Thomas J, McNaught J, Ananiadou S. Applications of text mining within systematic reviews. Res Synth Methods 2011; 2(1):1–14.
  11. Stansfield C, O’Mara-Eves A, Thomas J. Text mining for search term development in systematic reviewing: A discussion of some methods and challenges. Res Synth Methods 2017; 8(3):355-365.
  12. Gore G. Text mining for searching and screening the literature. McGill; April 2019.
  13. Li D, Wang Z, Wang L, et al. A Text-Mining Framework for Supporting Systematic Reviews. Am J Inf Manag 2016; 1(1):1-9.

Written by: Ms. Tanvi Laghate
