Software platforms dedicated to literature search and review
The common theme for popular commercial AI applications that explicitly pitch themselves towards ‘research’ is that they integrate LLM technology with search technology to accelerate and, to a limited extent, enhance literature searching and (limited) reviewing. This reflects the misleading lay perception of research, which is often thought of as primarily just ‘reading’. There are also literature mapping tools (some free, others with free and paid tiers) that do not use LLMs, relying instead on limited semantic NLP to enhance searches, but are nonetheless very useful for building up literature collections. In fact, they are arguably better for systematic reviews, where time saving is less important than breadth of coverage.
The most popular tools that use advanced generative AI to support literature review are Elicit, Scite AI and, to a lesser extent (as it’s designed for global searching, not just scholarly databases), Perplexity AI. As with any software application, while they fundamentally perform similar functions, they each have distinct value propositions and interfaces that will be more useful for some researchers and contexts than others. Elicit is perhaps the best starting point for developing not only a collection of relevant papers but also succinct AI-generated summaries, key findings and limitations. Scite’s main value is in using a form of sentiment analysis to classify citations according to whether they support, contrast with or merely mention the cited paper, which can speed up evaluating papers as part of the review process. Perplexity is less explicitly dedicated to scholarly research but is extraordinarily fast and useful for finding any live information on the web, along with enabling a continual dialogue based on its findings.
As with all generative AI based on LLMs, any generated content must be verified independently, as the non-deterministic outputs of these models naturally create so-called ‘hallucinations’, and this can happen even when it looks like the tool is citing a direct source. Scite extracts direct quotes from individual papers rather than producing a generative AI summary, which for most researchers may be more helpful. At most, these tools can help speed up identifying relevant scholarly resources with snapshot summaries of key information, but they do not even begin to substitute for engaged direct reading. In fact, one of the most significant risks as AI quality improves is researchers outsourcing the ‘reading’ and simply accepting summaries as faithful representations, which not only loses context and misses information but also risks weakening critical and deep engagement over time.
Elicit can help speed up exploratory (but not systematic) literature reviews thanks to using advanced AI semantic search to identify relevant papers without the need for comprehensive or exact keywords. Like any tool that relies on searching data, it is limited by what it can access, and Elicit cannot access scholarly work behind a paywall, which includes many books.
Elicit also includes LLM functionality to extract key information from, and/or summarise, retrieved articles as well as PDFs the user uploads, and to present the results in a table. Columns include the title, abstract summary, main findings, limitations etc. As with any LLM, the results of such summaries or information retrieval cannot be relied upon and should only be seen as a starting point. The time saving may well be worth it, at least for exploring a new research area, compared to brute-force keyword searching in Google Scholar.
As of 2024, Elicit comes with a limited free option which doesn’t include paper summarisation or exporting.
Scite is similar to Elicit in terms of LLM-enhanced semantic searching to identify papers (again, limited to the scholarly databases it has access to). It uses LLM technology to summarise a collection of papers from the search results but, unlike Elicit, it does not summarise key findings, limitations etc. Instead, it provides a view with short direct quotations from each paper along with direct links to the paper itself. In some cases this may be more useful than a potentially flawed LLM summary.
The main value offering of Scite is that it incorporates a limited evaluation of the nature of citation statements, to help a researcher get a quicker feel for the extent to which authors who cite a particular paper are supporting or critiquing it.
In reality, the overwhelming majority of citations are neutral and so the benefit is marginal, although it could still be worth it in certain contexts to help accelerate discovery.
As with Elicit, the value is far higher when exploring a new research area to speed up a researcher’s engagement with the literature. Identifying contrasting citations in particular can be extremely helpful to accelerate critical engagement with past studies or authors.
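To illustrate why a mostly neutral citation profile still leaves something useful to act on, here is a minimal sketch in Python. It assumes you have already noted down, by hand or from an export, a label for each citation statement; the function names, the data structure and the example entries are all hypothetical and do not reflect Scite's actual output format or API.

```python
from collections import Counter

# The three labels follow the text: supporting, contrasting, mentioning.
LABELS = ("supporting", "contrasting", "mentioning")


def label_shares(citations):
    """Return the fraction of citation statements carrying each label."""
    counts = Counter(label for _, label in citations)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


def contrasting_first(citations):
    """Order statements so contrasting ones are read first (stable sort)."""
    priority = {"contrasting": 0, "supporting": 1, "mentioning": 2}
    return sorted(citations, key=lambda item: priority.get(item[1], 3))


# Made-up example data for illustration only, not real Scite output.
example = [
    ("Paper A", "mentioning"),
    ("Paper B", "supporting"),
    ("Paper C", "mentioning"),
    ("Paper D", "contrasting"),
    ("Paper E", "mentioning"),
]

print(label_shares(example))
print(contrasting_first(example)[0])
```

Even when most statements are merely mentions, sorting the few contrasting ones to the top gives a short reading list for critical engagement, which is where the text suggests the real value lies.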
Scite does not offer a free version but it does offer a free trial for 7 days, after which it’s paid.
While these tools are great as a starting point, especially for a new topic, they cannot be more than a starting point. They do not have access to the entire corpus of academic work, and the search results are only as good as the synonyms the AI can come up with. Moreover, the value-added features like summarising or extracting key information from papers are always at risk of inaccuracy. When using Elicit, for instance, creating a table of the initial search results that summarises the abstract, methodology and limitations can be extremely helpful for identifying which papers are more likely to be worth investigating further. But this is simply productivity-enhancing assistance and cannot hope to substitute for deep engagement with the texts.
The more explicit and targeted your prompts when ‘interrogating’ any specific academic text, the better. And this can be done just as well, if not better, with a dedicated tool like ChatGPT Plus / Team than with the commercial literature review applications. Simply asking, “Please summarise the attached paper” will inevitably result in some loss of information, which may or may not be a problem depending on what you’re after. Something more targeted like “Please summarise the methodology in the attached paper along with stated limitations” is much more helpful because the amount of text that needs to be processed is much smaller, which also leaves you more tokens to play with for follow-up dialogue. Remember to check Publisher Policies about whether publications are permitted to be shared with a generative AI application. The guidance on Prompting includes an example template with 4 stages of reflection and quality improvement for a highly specific task on a given research paper: extracting practical research project ideas based on the paper's content.
A useful way to experiment is to use an advanced LLM to interrogate one of your own papers. This is an output you personally spent a lot of time on and know inside out, so you can easily verify how good its summaries or extractions are. This can then help you create prompt templates that make your interrogations of other papers more useful in future.
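As a rough illustration of what such a prompt template might look like, the sketch below uses Python's standard `string.Template`. The template wording mirrors the targeted example given earlier; the names `TARGETED_PROMPT`, `build_prompt` and `check_against_known_facts` are hypothetical, and the check is a crude substring match intended only as a first pass when testing a tool on a paper you know well. It does not call any AI service.

```python
from string import Template

# Hypothetical template; the wording mirrors the targeted example in the text.
TARGETED_PROMPT = Template(
    "Please summarise the $focus in the attached paper along with $extras."
)


def build_prompt(focus: str, extras: str) -> str:
    """Fill the template with the specific aspect of the paper to interrogate."""
    return TARGETED_PROMPT.substitute(focus=focus, extras=extras)


def check_against_known_facts(summary: str, known_facts: list[str]) -> list[str]:
    """Return the expected facts that the summary does not mention.

    This is a case-insensitive substring check, so it will miss paraphrases.
    It is an assumption-laden first pass, not a substitute for reading.
    """
    lowered = summary.lower()
    return [fact for fact in known_facts if fact.lower() not in lowered]


prompt = build_prompt("methodology", "stated limitations")
print(prompt)

# Made-up summary and facts, standing in for a response about your own paper.
summary = "The study used Semi-Structured Interviews with a small sample."
missing = check_against_known_facts(
    summary, ["semi-structured interviews", "thematic analysis"]
)
print(missing)
```

Keeping the variable parts (`focus`, `extras`) separate from the fixed wording makes it easy to reuse a prompt that worked well on your own paper when interrogating others.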