
How to read the Guidebook

The EIF Guidebook provides information for policy-makers, commissioners, professionals and practitioners to use in their work to support, expand or improve early intervention.

Our assessment of the evidence for a programme’s effectiveness can inform and support certain parts of a commissioning decision, but it is not a substitute for professional judgment. Evidence about what has worked in the past offers no guarantee that an approach will work in all circumstances.

Crucially, the Guidebook is not a market comparison website: ratings and other information should not be interpreted as a specific recommendation, kite mark or endorsement for any programme.

What is the evidence rating?

The first rating we provide summarises the strength of the evidence that the programme has been shown to benefit child outcomes that are important to improving lives and reducing the demand for acute services or ‘late intervention’ later in life. This evidence assessment is based on the programme’s best or strongest evidence.

For more on our standards of evidence, see: EIF evidence standards

The evidence rating reflects how confident we can be that there is a causal relationship between implementing the programme and achieving outcomes for children, together with an indication of whether positive benefits actually resulted. It is not a rating of the scale of impact.

Changes in outcomes for children and young people may occur for a wide range of reasons, many of which have nothing to do with a programme or intervention. The evidence rating summarises the strength of evidence that a given set of changes can be attributed to the use of a programme, as opposed to external factors or randomness.

What do the evidence ratings mean?

The EIF Guidebook includes information on programmes that are rated 2 or higher, according to the EIF standards of evidence.

  • Level 4 recognises programmes with evidence of a long-term positive impact through multiple rigorous evaluations.
  • Level 3 recognises programmes with evidence of a short-term positive impact from at least one rigorous evaluation.
  • Level 2 recognises programmes with preliminary evidence of improving a child outcome, but where an assumption of causal impact cannot be drawn.

The term ‘evidence-based’ is frequently applied to programmes with level 3 evidence or higher, because this is the point at which there is sufficient confidence that a causal relationship can be assumed.

The term ‘preliminary’ is applied to programmes at level 2 to indicate that causal assumptions are not yet possible.

The Guidebook also includes programmes rated NE.

  • NE, or ‘No effect’, indicates that a rigorous programme evaluation (equivalent to level 3) has found no evidence of the programme improving one of our child outcomes or providing significant benefits to other participants.
  • This rating should not be interpreted to mean that the programme will never work, but it does suggest that there are key aspects of the programme’s logic model which require respecification and re-evaluation.

The Guidebook does not include information on programmes rated below level 2. For a list of programmes for which EIF has conducted an evaluation assessment and which have received a rating of NL2 – or ‘not level 2’ – see: Other Programmes

What do the ratings tell us about commissioning?

  • At level 4, programmes have evidence of working in more than one place and providing benefits to children lasting one year or longer. This does not mean that the intervention will provide benefits in all circumstances, however. The effectiveness of a programme at the local level is determined by a variety of factors, including the characteristics of the participants, the quality of implementation, and the extent to which the intervention provides value over and above the information or services that are already available.
  • At level 3, programmes have evidence of providing short-term benefits for children from an evaluation conducted under ideal conditions. This means that causality has been attributed to the programme’s model, but further testing is required to determine whether its benefits last over time or can be replicated in differing contexts.
  • At level 2, commissioners should recognise that these programmes have potential but do not yet have evidence of impact. They offer the opportunity to innovate and to make use of a range of different approaches, but this must be based on a careful assessment of how a programme fits with local circumstances, and a commitment to monitor, test, evaluate and adapt.
  • A rating of NE, or ‘No effect’, does not necessarily mean that the intervention should be decommissioned. However, it does highlight the need for careful monitoring to determine the extent to which the intervention is providing value at the local level.


What does a plus rating mean?

Programmes may be given an additional ‘plus’ rating where the best evidence substantively exceeds the criteria for one evidence rating but does not satisfy the criteria for the next.

  • A rating of 4+ indicates that a programme’s best evidence is level 4 standard, and there is at least one other study at level 4, and at least one of the level 4 studies has been conducted independently of the programme provider.
  • A rating of 3+ indicates that a programme’s best available evidence is level 3 standard and it has at least one other study at level 2.
  • A rating of 2+ indicates that a programme’s best available evidence is based on an evaluation that is more rigorous than a level 2 standard but does not meet the threshold for level 3.
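As an illustration of how these ‘plus’ criteria combine, the 4+ rule can be sketched in code. This is a simplified model using hypothetical field and function names, not EIF’s own assessment tooling:

```python
from dataclasses import dataclass

@dataclass
class Study:
    level: int         # EIF evidence level of this study (2, 3 or 4)
    independent: bool  # conducted independently of the programme provider?

def is_four_plus(studies: list[Study]) -> bool:
    """A programme is rated 4+ when its best evidence is level 4 standard,
    there is at least one other level 4 study, and at least one of the
    level 4 studies was conducted independently of the provider."""
    level4 = [s for s in studies if s.level == 4]
    return len(level4) >= 2 and any(s.independent for s in level4)

print(is_four_plus([Study(4, False), Study(4, True)]))   # True
print(is_four_plus([Study(4, False), Study(3, True)]))   # False: only one level 4 study
```

The 3+ and 2+ rules could be expressed in the same style, as checks over the set of a programme’s studies and their levels.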

What does an asterisk mean?

The asterisk indicates that a programme’s evidence base includes mixed findings: that is, studies suggesting positive impact alongside studies that on balance indicate no effect or negative impact. For more information, see: What happens when the evidence is mixed?

How does EIF produce an evidence rating?

For more information about the programme assessment process, see: Getting your programme assessed

What is the cost rating?

The second rating we provide is an estimate of an intervention’s relative costs, based on the inputs required to deliver it. These inputs include the amount of time required to deliver the intervention, the number of families it attempts to reach, practitioner qualifications, and training fees.

This rating is based on information that programme providers have supplied about the components and requirements of their programme. Based on this information, EIF rates programmes on a scale from 1 to 5, where 1 indicates the least resource-intensive programmes and 5 the most resource-intensive. When consistently applied, this scale allows for comparison between programmes in terms of the resources required for delivery.

The cost rating is not the same as the market price of an intervention, which will be negotiated and agreed commercially between providers and commissioners.

Each cost rating is associated with an indicative range of unit costs, on a per-recipient basis. These are not actual unit costs for any individual programme or delivery in any particular place.

What do the cost ratings mean?

  • A rating of 1 indicates that a programme has a low cost to set up and deliver, compared with other interventions reviewed by EIF. This is equivalent to an estimated unit cost of less than £100.
  • A rating of 2 indicates that a programme has a medium-low cost to set up and deliver, compared with other interventions reviewed by EIF. This is equivalent to an estimated unit cost of £100–£499.
  • A rating of 3 indicates that a programme has a medium cost to set up and deliver, compared with other interventions reviewed by EIF. This is equivalent to an estimated unit cost of £500–£999.
  • A rating of 4 indicates that a programme has a medium-high cost to set up and deliver, compared with other interventions reviewed by EIF. This is equivalent to an estimated unit cost of £1,000–£2,000.
  • A rating of 5 indicates that a programme has a high cost to set up and deliver, compared with other interventions reviewed by EIF. This is equivalent to an estimated unit cost of more than £2,000.
  • A rating of NA indicates that the information required to generate a cost rating is not available at this time.
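The unit-cost bands above can be expressed as a simple lookup. This is an illustrative sketch of the banding described in this section, not EIF’s own costing tooling; the function name and signature are our own:

```python
def cost_rating(unit_cost_gbp: float) -> int:
    """Map an estimated per-recipient unit cost (in £) to the 1-5
    EIF cost rating band described above."""
    if unit_cost_gbp < 100:
        return 1   # low cost: under £100
    elif unit_cost_gbp < 500:
        return 2   # medium-low cost: £100-£499
    elif unit_cost_gbp < 1000:
        return 3   # medium cost: £500-£999
    elif unit_cost_gbp <= 2000:
        return 4   # medium-high cost: £1,000-£2,000
    else:
        return 5   # high cost: more than £2,000

print(cost_rating(250))   # 2
print(cost_rating(1500))  # 4
```

Remember that these bands are indicative per-recipient ranges, not the market price of any particular programme.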

How does EIF produce a cost rating?

For more information about the cost estimation process, please see annex 3 of our report, Foundations for Life.

Child outcomes and impact

The child outcomes listed on the Guidebook are descriptions of what a programme has evidence of improving. These are based on the findings of a programme’s best available evidence. They describe specific benefits or changes that, according to this evidence, a programme has been shown to provide in past evaluations.

By impact, we mean the size of the improvements that a programme has generated for children and young people in the past. In other words, by how much has a programme improved child outcomes – has it made a big change or a small one? Impact is not the same as strength of evidence, which helps us to understand how confident we can be that the change is caused by the programme.

The Guidebook provides information on impact for programmes rated at least level 3, and within that, for studies rated at least level 3. These are the programmes and studies where we can be sufficiently confident that the impact scores are reliable and meaningful. Even within this, there will be particular cases where we are unable to provide information on impact due to limitations in how a particular study has been designed or reported.

Two kinds of impact information

Click ‘More detail’ next to each child outcome to see impact information based on past evaluations. Where possible, we include two main pieces of information on impact, although in some cases only one of these may be available.

  • In one box, we provide the effects as they were originally measured in the evaluation. This number describes the difference between the average outcomes of those who have received the programme, and the average outcomes for those who did not receive it – the difference between these outcomes is the improvement that we can attribute to the programme. Because this describes effects as they were measured in the original evaluation, this information can range from readily interpretable statistics such as ‘a 20% reduction in smoking’ or ‘a 15-percentage point reduction in the proportion of participants who have developed a major depressive disorder’, to statistics expressed in scales that may be unfamiliar to some users, such as ‘a 5-point improvement on the Problem Behaviour Scale’.
  • In another box, we provide an Improvement Index score. This is a number between 0 and 50 that captures the magnitude of an effect, and allows you to compare effects that were originally measured on different scales. This metric is sometimes called ‘percentile growth’ or ‘percentile rank improvement’. Click on the question mark icon next to each improvement index score to find out what it means.

    This approach is identified as a useful way of describing effect sizes by a number of methodologists in the field, and is also used by colleagues at the What Works Clearinghouse for Education in the US, and the Best Evidence Encyclopaedia.

Alongside each description of impact we also report the time points at which these improvements were observed, highlighting any improvements that have persisted over a longer period of time. Typically, outcomes are observed immediately after the intervention has been delivered, but sometimes they are observed months or years afterwards.

What does the information on child outcomes and impact mean?

Effects as they were originally measured in the evaluation tell us something useful about the nature of the improvement that a programme has generated: a 20% reduction in smoking is easily understood, and we can learn what a 5-point improvement on the Problem Behaviour Scale means in practical terms, even if we are unfamiliar with the scale starting out.

However, there are limitations to this information. In particular, effects described as they are originally measured will not always be directly comparable. For example, child behaviour problems might be measured in one evaluation on a scale of 1–5 using the Problem Behaviour Scale, and in another evaluation on a scale of 1–12 using the Externalising Problems Inventory. A three-point change on one of these scales may mean something very different from a three-point change on the other scale, and it may be unclear, at a glance, which effect is larger or more meaningful.  

Improvement index scores tell us something useful about the relative scale of improvement, compared with improvements measured using other scales. This is because the improvement index score is based on a standardised measure of the size of effects, which allows us to compare the relative size of effects, and to compare effects across programmes that may have evaluated improvements on a given outcome using different scales. It also means that a larger improvement index value always indicates a relatively larger effect.

The improvement index score can be interpreted as an estimate of how much we’d expect outcomes for the average participant in the control group to improve if they had received the intervention, relative to other members of the control group. That is, if you ordered all the participants in the control group from lowest to highest – worst to best – on a certain outcome, how much would the average person – the person in the middle of the range – improve if they had received the intervention? Would they move from the middle into the top 25%, into the top 10%, or to the very top 1%?

For example:

  • An improvement index score of 25 means we would expect the average participant in the comparison group who did not receive the intervention (for whom 50% of their peers have better outcomes and 50% have worse outcomes) to improve to the point where they would have better outcomes than 75% of their peers, and worse outcomes than 25% of their peers, if they had received the intervention.
  • An improvement index score of 50 means we would expect the average participant in the comparison group who did not receive the intervention to improve to the point where they would have better outcomes than 100% of their peers, and worse outcomes than 0% of their peers, if they had received the intervention. In other words, they would have the very best outcomes relative to their peers.
  • An improvement index score of 0 means that there is no improvement. The average participant in the comparison group who did not receive the intervention would maintain this ranking if they had received the intervention.

In more technical terms, the improvement index is the difference between the percentile rank corresponding to the mean value of the outcome for the intervention group, and the percentile rank corresponding to the mean value of the outcome for the comparison group distribution.
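As a worked illustration of this definition (our own sketch, not EIF’s published code), the improvement index can be computed from a standardised effect size such as Cohen’s d or Hedges’ g, by converting it to a percentile shift under a normal distribution:

```python
import math

def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def improvement_index(effect_size: float) -> float:
    """Convert a standardised effect size into an improvement index:
    the percentile rank the average comparison-group participant would
    reach if they received the intervention, minus their starting
    percentile rank of 50."""
    return normal_cdf(effect_size) * 100.0 - 50.0

# An effect size of 0 leaves the average participant at the 50th
# percentile: an improvement index of 0.
print(improvement_index(0.0))           # 0.0
# An effect size of about 0.67 moves the average participant to
# roughly the 75th percentile: an improvement index of about 25.
print(round(improvement_index(0.674)))  # 25
```

This mirrors the ‘percentile growth’ construction used by the What Works Clearinghouse, which the document cites as using the same approach.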

It’s worth noting that negative improvement index scores are possible, in cases where evaluations find that an intervention was harmful. However, these are not currently listed here, as the Guidebook only includes interventions with some evidence of having a positive impact on child outcomes.

How does EIF calculate impacts?

Our seven child outcomes

Information on child outcomes & impact is arranged underneath the following seven broad categories:

  • Supporting mental health & wellbeing – includes for example:
    • Improving outcomes for children diagnosed with ADHD
    • Providing children with strategies for coping with depression and/or anxiety disorders
    • Preventing teen suicide and self-harming behaviour
    • Improving children’s self-esteem, self-confidence and self-efficacy.
  • Preventing child maltreatment – includes for example:
    • Increasing children’s awareness of maltreating behaviours and methods for reporting them
    • Targeting specific risk and protective factors known to contribute to child maltreatment
    • Targeted interventions for children at the edge of care
    • Preventing children from entering the care system or reducing the time spent in out-of-home care.
  • Enhancing school achievement & future employment – includes for example improving:
    • School achievement, including scores on standardised exams
    • Behaviour in school (including self-regulatory and prosocial behaviour)
    • Teaching skills and the classroom environment
    • Communication between parents, teachers and school staff
    • Rates of school exclusion and drop-out
    • Completion of secondary school and entry into higher education or training
    • Young people’s success in finding a job or gaining vocational skills.
  • Preventing crime, violence and antisocial behaviour – includes for example:
    • Improving children's behaviour at home or at school
    • Treating clinically diagnosed conduct or behavioural disorders
    • Preventing children from offending or re-offending.
  • Preventing substance abuse – includes for example:
    • Educating children about the risks associated with drinking and illegal drug use
    • Providing specific therapies for children with a drug or alcohol addiction.
  • Preventing risky sexual behaviour & teen pregnancy – includes for example:
    • Discouraging general risk-taking behaviours (such as binge drinking, antisocial behaviour, physically risky activities)
    • Providing specific information about contraception and safe sex
    • Targeting young women who are at risk of becoming pregnant and carrying their child to term before the age of 18.
  • Preventing obesity and promoting healthy physical development – includes for example:
    • Targeting children identified as being overweight
    • Preventing children from becoming overweight in the first place.

Key programme characteristics

Key programme characteristics describe the delivery of a programme according to its best available evidence.

In all cases, this evidence-based description may differ from how a programme is provided in a particular place, or how it is offered in the market.

Age group: Who is it for?

The age of the programme’s target population according to its best available evidence.

The age ranges given for each group are intended only as a general description or loose definition of our terminology: programmes for an age group may not have evidence of their outcomes for all children of the ages given in this list.

  • Antenatal: before a child is born
  • Perinatal: the period around birth – beginning before birth and continuing afterwards
  • Infants: 0–12 months
  • Toddlers: 12 months to 3 years
  • Preschool: 3 to 5 years
  • Primary school: 5 to 11 years
  • Preadolescents: 11 to 13 years
  • Adolescents: 13 to 18 years

Delivery model: How is it delivered?

The method or approach of a programme, according to its best available evidence.

  • Promotion-plus: Programmes that are promotional activities and/or of short duration, lasting five weeks or less. These may be delivered through health visiting, schools, libraries, children’s centres or on a one-to-one basis.
  • Individual: Programmes that are provided to families on a one-to-one basis, through psychotherapy, speech and language therapy or other specialist one-to-one support.
  • Group: Programmes that are delivered to groups of parents from multiple families.
  • Home visiting: Programmes that were developed specifically to be delivered in the home to address a range of child and parent outcomes. Home visiting programmes are typically offered on a Targeted Selected or Targeted Indicated basis (see Classification below) and take place over a relatively long period of time.

Settings: Where is it delivered?

The venues in which a programme has been delivered corresponding to its best-evidenced implementation (its main settings), or other venues in which the programme provider has indicated a programme can be delivered (other settings).

  • Home
  • Children’s centre or early-years setting
  • Primary school
  • Secondary school
  • Sixth-form or FE college
  • Community centre
  • In-patient health setting
  • Out-patient health setting

Classification: How is it targeted?

The level of need of the target population based on a programme’s primary focus, according to its best available evidence.

  • Universal: Programmes that are available to all families. These activities may take place alongside or as part of other universal services, including health visiting, schools or children’s centres.
  • Targeted Selected: Programmes that target or ‘select’ groups of families on the basis of an increased incidence or risk of broad personal or social factors. For example, families could be selected because they are experiencing economic hardship, are single or young parents, or ethnic minorities.
  • Targeted Indicated: Programmes that target a smaller group of families or children on the basis of a specific, pre-identified issue or diagnosed problem requiring more intensive support.

About the programme

This section provides a summary description of the programme and how it is delivered. The description has been compiled by EIF on the basis of the information submitted by each provider, as well as the information contained within the evaluation studies.

It is not intended to provide comprehensive information about each programme.

How does it work? (Theory of change)

The theory of change describes the steps that lead from a problem to a solution, via a programme. This ‘tells the story’ of how a programme works, according to scientific theory and evidence: it is based on a programme’s design or rationale, rather than any particular evaluation or assessment.

These steps include assumptions about cause and effect based on developmental and social science, and practical information about the actions and changes that occur through the programme. This includes:

  • how the mediators (or risk and protective factors) are related to the ultimate or primary outcomes
  • how the programme affects the mediators
  • the programme’s short-term outcomes
  • the programme’s long-term or ultimate outcomes.

This section also notes the outcomes that a programme is intended to provide. These intended outcomes may differ from the ‘actual outcomes’ listed on the main programme page, which are based on the evaluation findings from a programme’s best available evidence.

About the evidence

This section provides a summary description of and details about a programme’s best available evidence: the most rigorous study or studies evaluating the programme’s impacts and outcomes.

This set of key studies – a programme’s best available evidence – provides much of the content of an EIF Guidebook programme report, including a programme’s evidence rating, child outcomes and key programme characteristics.

A programme receives the same evidence rating as its best available evidence, based on its single most rigorous evaluation or several qualifying evaluations (that is, multiple studies at level 2 or better). For more information about what these ratings mean, see: What is the evidence rating?

A programme report is not based on all the studies or evaluations relating to that programme. On the Other programmes page, we list the studies that were identified as part of the evaluation assessment process but which did not count towards the rating, because they did not qualify as the most rigorous evaluation or as part of the set of most rigorous evaluations.


Published July 2024