IN THIS LESSON

This lesson serves as a primer on performing evidence reviews.

Evidence Reviews

An evidence review is a catch-all term for a somewhat structured process of reviewing existing evidence in relation to a research question. As practitioners of impact-oriented research, we treat evidence reviews as an intrinsic part of virtually all the research projects we undertake, because we need to know what existing research can tell us about the decisions we face.

In an evidence review, researchers assess the relevant existing evidence pertaining to a research question. There is no catch-all definition of “good evidence,” as this depends on the specific research question being asked. However, there are conventions around the hierarchy of evidence for attributing cause-and-effect relationships.

Evidence reviews are usually structured in steps:

Set a research question
Search for published evidence about the research question
Assess the studies collected (this is what you will practice in this week’s at-home activity)
Draw conclusions based on the studies as evidence

What makes good evidence?

This is the question that keeps researchers up at night. The best answer is “it depends,” which can be a terribly useless (albeit often used) answer. The message we want you to take home is that the quality of evidence depends, to a large extent, on the question you are asking. Do you want to learn how people feel about a certain disease plaguing a community? A participatory research exercise may give you the richest detail. Do you want to make a causal inference about the quality of a vaccine? There’s no question about it: a randomized trial or collection of studies will be your go-to.

Quality of evidence is also relative and often a matter of practicality. While randomized controlled trials may be the “gold standard” for causal inference, they are often impractical to run or cannot be run because of ethical concerns. Sometimes it is simply impossible to randomize the treatment, and in those cases a quasi-experimental approximation may be the best possible evidence for a particular question.

The hierarchy of evidence is a framework used in evidence-based medicine and social science research to assess the quality and reliability of different types of evidence for causal attribution. It organizes various study designs based on their methodological rigor, potential for bias, and ability to provide reliable answers to causal attribution research questions. The hierarchy allows researchers and healthcare professionals to determine the strength of evidence supporting a particular intervention or treatment. The hierarchy typically consists of several levels, with higher levels representing stronger evidence.

While the specific levels may vary slightly depending on the source or field of study, a common hierarchy includes the following:

(Note that the hierarchy of evidence is a rough heuristic applied specifically to causal attribution. Different study designs are better suited to different research questions. One excellent RCT may be better than four badly designed ones, a quasi-experimental study may be more informative than a meta-analysis of observational studies, and so on.)

At-home Activity (45 minutes)

The goal of this activity is to get hands-on practice reviewing research papers for an evidence review. Since this is just to get an idea of what this process looks like, you will use an LLM to expedite the exercise.

In general, the steps for gathering literature on a question can be summarized as follows:

Think of search terms pertinent to the research question (we’ve done this for you)
Identify where you will search for published literature, and decide which papers you will review, for example by prioritizing the most rigorous methodologies or excluding sources published before a certain year (we’ve done this for you)
Use a spreadsheet or literature review assistance software to collect all relevant data from the studies (this is what you’ll be doing, with the help of an LLM!)
Once you have identified which studies you will review and in what relative order, you can proceed with the analysis. (You won’t be performing an analysis; this activity is just for practicing a literature review.)
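
To make the second and last steps more concrete, here is a minimal Python sketch of screening a candidate list: it drops anything published before a cutoff year and then orders what remains by study design. The `DESIGN_PRIORITY` ranking, the `Candidate` class, the `screen` function, and the placeholder records are all assumptions made up for illustration. They are not part of the course materials, and a real review would define its own inclusion criteria.

```python
from dataclasses import dataclass

# Hypothetical ranking for illustration only (lower number = reviewed first).
# This is an assumption of the sketch, not an official hierarchy.
DESIGN_PRIORITY = {
    "randomized controlled trial": 0,
    "quasi-experimental": 1,
    "observational": 2,
}


@dataclass
class Candidate:
    title: str
    year: int
    design: str


def screen(candidates, earliest_year):
    """Drop papers older than the cutoff, then sort by design priority, newest first."""
    kept = [c for c in candidates if c.year >= earliest_year]
    # Designs missing from the ranking are placed after all ranked designs.
    fallback = len(DESIGN_PRIORITY)
    return sorted(
        kept,
        key=lambda c: (DESIGN_PRIORITY.get(c.design, fallback), -c.year),
    )


# Placeholder records, not real studies.
candidates = [
    Candidate("Placeholder study A", 1998, "randomized controlled trial"),
    Candidate("Placeholder study B", 2015, "observational"),
    Candidate("Placeholder study C", 2012, "randomized controlled trial"),
    Candidate("Placeholder study D", 2020, "quasi-experimental"),
]

review_order = screen(candidates, earliest_year=2005)
for rank, paper in enumerate(review_order, start=1):
    print(rank, paper.title, paper.year, paper.design)
```

Sorting by design is only a starting point. As noted earlier, a well-run study lower in the ranking can be more informative than a poorly run one above it, so the order is a reading plan rather than a verdict on quality.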

The Activity

We’ll be gathering evidence to answer the following question: What is the expected effect on learning outcomes of implementing a Teaching at the Right Level (TaRL) intervention in a school?
Make a copy of this spreadsheet to use as a template. You’ll be inputting data into this spreadsheet.
Open an LLM (we recommend Claude, ChatGPT, or Gemini; the free version should be fine).
Paste this prompt into the LLM:
- “I’m conducting a literature review to answer the following question: What is the expected effect on learning outcomes of implementing a Teaching at the Right Level (TaRL) intervention in a school? I am going to give you research papers, and for each paper, please give me the following for me to paste into my literature review spreadsheet: Title of paper, Authors, Year of publication, Link to source, Type of study, Intervention details, Study Period, Location of study, Context, Outcome variable, Control or Comparison Group, Study size, Description of the population, How is the Effect size measured?, Follow up period between end of intervention and main outcome, Effect size of treatment/change, Baseline, P-value, Confidence interval, Statistical power, Was the study pre-registered?, Do you perceive a risk of researcher or funder bias?, Comments on External validity to RQ/Context, Other Comments, Reference in APA.”
Pick 3 of the papers listed below. YOU DO NOT HAVE TO READ THE PAPERS (of course, you would if this were a real review, but this is just an exercise to learn the general process). For each paper, give the paper to the LLM, and put the data the LLM extracts into your Google Sheet.
- Papers to choose from:
  - Mainstreaming an Effective Intervention: Evidence from Randomized Evaluations of “Teaching at the Right Level” in India (Banerjee et al., 2016).
  - Failure of Frequent Assessment: An Evaluation of India’s Continuous and Comprehensive Evaluation Program (Berry et al., 2018)
  - Supporting Learning In and Out of School: Experimental Evidence from India (Björkman & Guariso, 2022)
  - Pitfalls of Participatory Programs: Evidence from a Randomized Evaluation in Education in India (Banerjee et al., 2010)
  - Remedying Education: Evidence From Two Randomized Experiments In India (Banerjee et al., 2007)
Ask the LLM to explain any concepts that you don’t understand from the outputs.
Once you’re done, paste these follow-up questions into the LLM, and read the answers for your own learning.
- 3 questions:
  - What overall conclusion can you draw from these papers about the effectiveness of TaRL? What does it seem most useful for?
  - Can you make any comments as to the external validity of the studies? Pick one or two to comment on, and perhaps remark on the external validity of these studies as a group.
  - Which study is the weakest in terms of quality? Which is the strongest?
Watch this video to get an idea of what this process leads to and why it’s helpful (16 minutes)
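
If you would rather keep your extraction table in a file than in a spreadsheet app, the same structure can be sketched in a few lines of Python. The column names below are copied from the prompt above; whether they match the template spreadsheet exactly is an assumption, and `new_row` and `to_csv` are hypothetical helper names. The example row fills in only the title, authors, and year given in the paper list and leaves everything else blank for you to complete from the LLM's output.

```python
import csv
import io

# Column names copied from the extraction prompt in this activity.
COLUMNS = [
    "Title of paper", "Authors", "Year of publication", "Link to source",
    "Type of study", "Intervention details", "Study Period",
    "Location of study", "Context", "Outcome variable",
    "Control or Comparison Group", "Study size",
    "Description of the population", "How is the Effect size measured?",
    "Follow up period between end of intervention and main outcome",
    "Effect size of treatment/change", "Baseline", "P-value",
    "Confidence interval", "Statistical power",
    "Was the study pre-registered?",
    "Do you perceive a risk of researcher or funder bias?",
    "Comments on External validity to RQ/Context", "Other Comments",
    "Reference in APA",
]


def new_row(values):
    """Return a full row; fields that were not supplied stay blank."""
    unknown = set(values) - set(COLUMNS)
    if unknown:
        raise KeyError(f"Not a template column: {sorted(unknown)}")
    return {column: values.get(column, "") for column in COLUMNS}


def to_csv(rows):
    """Serialize rows to CSV text with the template header on the first line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Only details stated in the paper list are filled in here.
rows = [
    new_row({
        "Title of paper": "Mainstreaming an Effective Intervention: Evidence "
                          "from Randomized Evaluations of “Teaching at the "
                          "Right Level” in India",
        "Authors": "Banerjee et al.",
        "Year of publication": "2016",
    })
]

csv_text = to_csv(rows)
print(csv_text)
```

Rejecting unknown column names in `new_row` is a deliberate choice: it catches a mistyped field early, before one paper's data ends up in a column the other papers don't share.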

(Based on a research exercise designed by AIM/Charity Entrepreneurship)
