In the first research paper of this series on AI in systematic literature reviews (SLRs), we shared our methodology for testing the performance of AI models in title/abstract (ti/ab) screening, achieving high sensitivity (82% to 96% across five different categories of SLR projects). We then advanced the program to evaluate AI models on full-text screening of a large sample (~2,000 full texts) from the same five projects, obtaining a sensitivity of ≥99%. Here, we share our methodology and results from this test of AI-enabled full-text screening (FTS).
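For readers unfamiliar with the metric: sensitivity (recall) is the proportion of truly eligible records, as judged by human reviewers, that the screening step correctly retains. The sketch below illustrates the calculation only; the record IDs and variable names are hypothetical and not taken from the study.

```python
# Minimal sketch of the sensitivity metric reported above.
# Record IDs and sets are hypothetical examples, not study data.

def sensitivity(ai_included: set[str], gold_included: set[str]) -> float:
    """Sensitivity (recall): share of truly eligible records the AI retained."""
    true_positives = len(ai_included & gold_included)
    false_negatives = len(gold_included - ai_included)
    return true_positives / (true_positives + false_negatives)

# Example: the AI retains all 3 records that human reviewers judged eligible,
# plus one extra inclusion (which affects specificity, not sensitivity).
ai = {"rec-101", "rec-204", "rec-307", "rec-550"}
gold = {"rec-101", "rec-204", "rec-307"}
print(f"Sensitivity: {sensitivity(ai, gold):.0%}")  # Sensitivity: 100%
```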