Randomised controlled clinical trials (RCTs)
The randomised controlled trial involves obtaining the consent of a large number of patients to participate in the trial. They are then randomised into two groups. Ideally one group, the study group or treatment group, receives a single therapy. The other group, the control group, receives no treatment or, for drug trials, a placebo. This is the only reliable method of demonstrating that a treatment is effective, since it removes most of the methodological problems described below under “Criteria for relying on results from randomised controlled trials (RCTs)”.
An alternative is to give the study group a new therapy and the control group an old accepted therapy. If the new therapy produces improved survival, this does not necessarily mean it is effective. It might simply be less harmful than the earlier therapy.
A third approach is to give the study group two therapies, such as surgery followed by radiotherapy and the control group just surgery. This evaluates the effect of adding radiotherapy.
The main point is that if the study group as a whole lives longer, the improvement is probably due to the therapy used (when compared with no treatment), the new therapy (when compared with the old therapy) or the added therapy (when compared with the single therapy).
The conclusions from such trials are only valid if particular criteria are satisfied. See “Criteria for relying on results from randomised controlled trials (RCTs)” below.
Significance of results of trials
A result is conventionally called significant at the 95% confidence level. This means that if the treatment had no real effect, a difference at least as large as the one observed would be expected by chance in only about one trial in 20, or 5%. So if 20 similar randomised trials with fairly large numbers were carried out to evaluate a therapy, one of them could be expected to produce a misleading result due to chance alone, either an apparent effect where none exists or, less predictably, no effect or an opposite effect where the therapy does work. At least two similar trials are therefore usually required to show that the first trial was not the 1 in 20 that produced the invalid (chance) result. Hence the requirement for replication of trials.
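The one-in-20 chance result can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not a method taken from the trials discussed here: it simulates trials in which the treatment has no effect at all (both arms share a hypothetical 20% death rate), applies a standard two-proportion z-test, and counts how often a trial still appears significant at the 5% level. The function names, the arm size of 300 and the death rate are all hypothetical.

```python
import math
import random


def two_sided_p_value(deaths_a, deaths_b, n_per_arm):
    """Two-proportion z-test p-value for equal-sized arms (normal approximation)."""
    pooled = (deaths_a + deaths_b) / (2 * n_per_arm)
    if pooled in (0.0, 1.0):
        return 1.0  # no variation at all, so no evidence of a difference
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    z = (deaths_a - deaths_b) / n_per_arm / std_err
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_trials=1000, n_per_arm=300, death_rate=0.2, seed=1):
    """Fraction of simulated no-effect trials that reach p < 0.05 by chance."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_trials):
        # Both arms are drawn from the same (assumed) death rate: no true effect.
        deaths_a = sum(rng.random() < death_rate for _ in range(n_per_arm))
        deaths_b = sum(rng.random() < death_rate for _ in range(n_per_arm))
        if two_sided_p_value(deaths_a, deaths_b, n_per_arm) < 0.05:
            significant += 1
    return significant / n_trials


rate = false_positive_rate()
print(f"Share of no-effect trials that look significant: {rate:.3f}")
```

Under these assumed settings the share should come out close to 0.05, ie about one trial in 20, which is why a single significant trial is not enough on its own.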
Because every trial result carries some uncertainty, it is reported with a confidence interval, ie the range of values within which the true effect of treatment is likely to lie. For example, if a treatment reduces the number of deaths by 20%, ie the relative risk of death falls from 1 to 0.8, and the 95% confidence interval ranges from 0.7 to 0.9, the result is significant.
Significant in this context therefore means that the range of possible results must not include no benefit, ie a relative risk of 1. However, if the interval ranges from 0.55 to 1.05 the result is not significant, because the interval includes 1 and the apparent benefit could be due to chance.
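The rule that a significant interval must exclude 1 can be made concrete with a short calculation. The following is a minimal sketch, assuming the common log-based approximation for the confidence interval of a relative risk; the death counts are invented for illustration and the function names are hypothetical.

```python
import math


def relative_risk_ci(deaths_treated, n_treated, deaths_control, n_control, z=1.96):
    """Relative risk with an approximate 95% confidence interval (log method)."""
    rr = (deaths_treated / n_treated) / (deaths_control / n_control)
    se_log_rr = math.sqrt(
        1 / deaths_treated - 1 / n_treated + 1 / deaths_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def is_significant(lower, upper):
    """Significant only if the interval excludes 1 (no difference between arms)."""
    return upper < 1 or lower > 1


# Hypothetical counts: the same 20% reduction in deaths at two trial sizes.
small_trial = relative_risk_ci(80, 1000, 100, 1000)
large_trial = relative_risk_ci(800, 10000, 1000, 10000)

for label, (rr, lower, upper) in (("small", small_trial), ("large", large_trial)):
    print(f"{label}: RR={rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, "
          f"significant={is_significant(lower, upper)}")
```

With these invented numbers the relative risk is 0.8 in both cases, but the interval for the smaller trial spans 1 (roughly 0.60 to 1.06) while the interval for the larger trial does not (roughly 0.73 to 0.87). The same observed benefit is therefore significant only in the larger trial.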
Criteria for relying on results from randomised controlled trials (RCTs)
- It is important that the survival or mortality of the complete study group be compared with that of the complete control group, irrespective of the type of trial. If the survival of a sub-group of the treated group is compared with that of the same sub-group of the control group, eg those who die of cancer or those who survive 5 years without symptoms, the conclusion might be invalid. This is because a treatment that causes harm might transfer patients who would have died of cancer into another sub-group, such as patients who die from heart or respiratory failure. The number of cancer deaths in the study group might then fall while the deaths from other causes rise. Only if the number of cancer deaths falls, compared with the number in the control group, and the total deaths from all other sub-groups do not rise, can the treatment be considered effective. (This was the flaw identified in the randomised controlled trials of breast cancer screening that led to an invalid assumption about the benefits of mammography screening. The increase in use of radiotherapy in the study group led to an increase in deaths from heart failure, and therefore fewer deaths from breast cancer [1]. The total number of deaths in the two groups was not affected. These flaws in methodology are called confounding factors.)
Similarly, patients who have radiotherapy have fewer recurrences than those who don’t have it, yet they do not necessarily show improved survival. So comparing the percentage 5-year “recurrence-free survival” in the study and control groups is invalid as a measure of efficacy. This invalid conclusion arises from a paradigm that assumes the tumour is the disease, so that the absence of a recurrence is equated with the absence of the disease and is expected to be accompanied by increased survival.
If more than one factor is varied, it is usually not possible to conclude which factor produced the observed result. If the study group receives several treatments as part of the protocol, this can introduce confounding factors that render the trial results invalid. This is particularly relevant if some of the other treatments can cause harm. Trials evaluating breast and prostate cancer screening contained many such confounding factors [1,2].
- Ideally there should be at least several hundred participants in each arm of such a trial. This is required to ensure that the two groups being compared are very similar before the trial starts. This is referred to as a baseline comparison. It is also to avoid a situation where the effect of chance can produce an artefact as big as the difference observed. One trick that can be used for smaller trials is to match patients before randomising them. In this way it is possible to produce two closely matched groups, a good baseline comparison, before starting a trial with fewer participants.
Trials with smaller numbers might become significant if the difference in survival or mortality between the arms is very large, say >20-30%. Differences of more than 10% are not expected in cancer trials.
- The trial should be blinded, preferably double-blinded. This means that neither the patient nor the clinician determining the cause of death knows which patient has received the treatment and which the placebo. Double blinding is often difficult or impossible because the effects of treatment or its side effects are often obvious (eg surgery, radiotherapy or chemotherapy). It is however essential for the determination of the cause of death to be blinded. Because cancer treatments and cancer itself often affect many bodily systems, organ failure can often result in deaths from cancer being wrongly attributed to other causes (such as respiratory failure). To overcome this ambiguity, the rules for RCTs require that deaths occurring during cancer treatment be attributed to the cancer being treated. This rule is often ignored, thus confounding the results.
- The randomisation should be done at the individual level rather than by geographical area or other factor. This is because there is no guarantee that patients attending one treatment centre would exactly match a similar group attending another centre in age, gender, diet and lifestyle. Also, treatment might differ slightly from one centre to another, thus introducing another confounding factor. The Edinburgh mammography screening trial, which had randomised participants by treatment centre, was ruled invalid when there was a 20% difference in the baseline comparison. This was probably due to a difference in those enrolled at different centres. This could have been further confounded by differences in treatment received.
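The matching approach mentioned above for smaller trials, combined with randomisation at the individual level, can be sketched as follows. This is a simplified illustration that assumes patients are matched on a single baseline factor (age); a real trial would normally match on several factors. The function name, the patient records and the cohort are all hypothetical.

```python
import random
from statistics import mean


def matched_pair_randomise(patients, key, seed=0):
    """Pair patients with similar baseline values, then randomise within each pair."""
    rng = random.Random(seed)
    ordered = sorted(patients, key=key)
    study, control = [], []
    # Walk through the sorted list two at a time; neighbours form a matched pair.
    # With an odd number of patients the last one is left unassigned.
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # chance alone decides who receives the treatment
        study.append(pair[0])
        control.append(pair[1])
    return study, control


# Hypothetical cohort of 40 patients with ages between 40 and 80.
cohort_rng = random.Random(42)
cohort = [{"id": i, "age": cohort_rng.randint(40, 80)} for i in range(40)]

study_arm, control_arm = matched_pair_randomise(cohort, key=lambda p: p["age"])
age_gap = abs(mean(p["age"] for p in study_arm) - mean(p["age"] for p in control_arm))
print(f"Arms: {len(study_arm)} vs {len(control_arm)}, mean age gap {age_gap:.2f} years")
```

Because each pair is split between the arms, the two groups start with very similar age profiles even though the trial is small, while the allocation within each pair is still left to chance.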
Therefore proving efficacy requires:
- using the results only from blinded randomised controlled trials, properly conducted with adequate numbers to remove the likelihood of a chance result.
- evidence of significantly increased survival or reduced mortality, not simply the ability of the treatment to remove or reduce a particular condition, such as shrinking a tumour. Comparison of survival or deaths of sub-groups must not be considered.
- consideration of deaths from all causes, in case a treatment has caused harm.
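The need to consider deaths from all causes, rather than one sub-group, can be shown with simple arithmetic. The sketch below uses invented counts (not data from any real trial) in which the study arm has fewer cancer deaths but more deaths from heart failure; the function name and cause labels are hypothetical.

```python
def mortality_summary(deaths_by_cause, n_patients):
    """Return cause-specific and all-cause death rates for one trial arm."""
    rates = {cause: count / n_patients for cause, count in deaths_by_cause.items()}
    rates["all causes"] = sum(deaths_by_cause.values()) / n_patients
    return rates


# Hypothetical counts per 1,000 patients in each arm (not data from any real trial).
study = mortality_summary({"cancer": 40, "heart failure": 35, "other": 25}, 1000)
control = mortality_summary({"cancer": 50, "heart failure": 25, "other": 25}, 1000)

for cause in study:
    change = study[cause] - control[cause]
    print(f"{cause:>13}: study {study[cause]:.3f}, control {control[cause]:.3f}, "
          f"difference {change:+.3f}")
```

In this invented example, looking at cancer deaths alone suggests a benefit, yet the all-cause rate is identical in both arms because the fall in cancer deaths is offset by the rise in deaths from heart failure.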
Ideally efficacy should be confirmed over time by showing a significant effect on mortality curves for the particular type of disease after intervention. Claims that mammography screening had not been shown to affect breast cancer mortality because of flaws in the trial methodology were later confirmed by epidemiological statistics showing that mammography screening had had no discernible impact on breast cancer mortality in any of the countries where it had been introduced on a widespread basis.
If results from trials showing clear efficacy are not available, it becomes critical to recognise that most interventions can do harm. So when there is no proven benefit, the intervention with the least harm becomes the preferred choice.
REFERENCES
1. Benjamin DJ. The efficacy of surgical treatment of breast cancer. Med Hypotheses 1996; 47 (5): 389–397.
2. Benjamin DJ. The efficacy of surgical treatment of cancer – 20 years later. Med Hypotheses 2014; 82 (4): 412–420.