How do we know that an intervention works?


The pros and cons of RCTs


Randomised controlled trials (RCTs) have earned the reputation as the ‘gold standard’ for measuring effectiveness because they provide the best method for attributing causality to an intervention. This is done through random assignment and the use of one or more comparison groups. 

Random assignment uses a chance-based procedure to allocate eligible participants to the ‘treatment’ (i.e. the intervention) and to one or more comparison groups (e.g. an alternative treatment or no treatment) that do not receive the intervention. Participants in the treatment and comparison groups then complete the same set of standardised measures before the start of the intervention and again after it is completed. Random assignment helps ensure that participant attributes that could affect the treatment outcome are evenly distributed between the treatment and the comparison group(s). The comparison group(s) then allow the evaluator to test statistically whether any changes observed during the course of the intervention can be attributed to it.
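As a sketch, the allocation step can be as simple as shuffling the list of eligible participants and dealing them out to groups in turn. The `randomise` helper below is hypothetical (real trials typically use dedicated randomisation software, often with stratification), but it illustrates the principle of chance-based assignment:

```python
import random

def randomise(participants, n_groups=2, seed=None):
    """Randomly assign eligible participants to groups.

    Hypothetical helper: shuffles the participant list, then deals
    participants out round-robin so group sizes differ by at most one.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = [[] for _ in range(n_groups)]
    for i, person in enumerate(shuffled):
        groups[i % n_groups].append(person)
    return groups

# 100 eligible participants split into treatment and comparison groups
treatment, comparison = randomise(range(100), n_groups=2, seed=42)
print(len(treatment), len(comparison))  # → 50 50
```

Because the allocation depends only on chance, any participant attribute (age, motivation, baseline severity) is, on average, distributed evenly across the groups.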

Pre- and post-intervention change is then measured and compared between the treatment and comparison group(s). If the treatment group demonstrates a statistically significant (i.e. greater than chance) improvement that is also significantly greater than any change observed in the comparison group(s), it can be inferred that the intervention had a positive effect that did not take place in the other groups. The difference between the intervention and comparison groups is often reported as an ‘effect size’, which is an index of the strength of the difference between the outcomes of the treatment and comparison groups.
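One widely used effect-size index is Cohen's d: the difference between the two groups' mean change scores, divided by their pooled standard deviation. The sketch below uses made-up change scores purely for illustration; the numbers do not come from any real trial:

```python
from statistics import mean, stdev

def cohens_d(treatment_change, comparison_change):
    """Standardised mean difference between two groups' pre/post change scores."""
    n1, n2 = len(treatment_change), len(comparison_change)
    s1, s2 = stdev(treatment_change), stdev(comparison_change)
    # Pooled standard deviation across the two groups
    pooled = (((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)) ** 0.5
    return (mean(treatment_change) - mean(comparison_change)) / pooled

# Illustrative (invented) pre-to-post change scores for each group
treatment_change = [5, 7, 6, 8, 5, 7]
comparison_change = [2, 3, 1, 4, 2, 3]
print(round(cohens_d(treatment_change, comparison_change), 2))  # → 3.38
```

By convention, d around 0.2 is described as a small effect, 0.5 as medium and 0.8 as large, although these thresholds should always be interpreted in context.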

Positive findings from a single RCT are generally considered to be a promising indication that an intervention is effective. However, they are rarely conclusive proof that an intervention ‘works’. This is because the first RCT is usually conducted under ideal circumstances that substantially increase the likelihood of a positive effect. These ideal circumstances often include delivery by the original developers, high levels of practitioner skill and systems that make sure that the right participants have been selected and are receiving the right dose. From this perspective, a single RCT does not fully differentiate between outcomes related to the intervention model and outcomes related to the intervention’s delivery.

Prevention scientists therefore recommend that interventions undergo multiple RCTs before conclusions are drawn about whether the intervention ‘works’. And even when multiple RCTs have taken place, these conclusions are often tentative, at best. This is because RCT evidence is never proof that an intervention will ‘work’ at a specific place or time.

Intervention effectiveness should therefore also be considered through robust monitoring systems that capture information about an intervention as it is being implemented. At the very least, this information should include data about whether the intervention is being implemented as it should be (i.e. fidelity monitoring), who it is working for and whether it is achieving its intended short- and long-term outcomes. Some of the most highly developed interventions come with monitoring systems that capture this information on an ongoing basis.

It should also be recognised that many good ideas have not yet been tested through an RCT and, in some cases, an RCT may not be practical or feasible. Alternatives to RCTs include studies that make use of a carefully matched comparison group when random assignment is not possible, statistically controlled cross-sectional designs and longitudinal studies that statistically control for important confounders. Good initial evidence can also be obtained from pre/post studies that do not include a comparison group, but instead measure change through objective measures completed by participants before the start of the intervention and again afterwards.
