Stroke aetiological classification reliability and effect on trial sample size: systematic review, meta-analysis and statistical modelling
© The Author(s). 2019
Published: 8 February 2019
Inter-observer variability in stroke aetiological classification may reduce trial power and bias estimates of treatment effect. We modelled the effect of misclassification on the required sample size in a hypothetical cardioembolic (CE) stroke trial.
We performed a systematic review to quantify the reliability (inter-observer variability) of various stroke aetiological classification systems. We then modelled the effect of this misclassification in a hypothetical trial of anticoagulation in CE stroke contaminated by patients with non-cardioembolic (non-CE) stroke aetiology. Rates of misclassification were based on the summary reliability estimates from our systematic review. We randomly sampled data from previous acute trials in CE and non-CE participants, using the Virtual International Stroke Trials Archive. We used bootstrapping to model the effect of varying misclassification rates on the sample size required to detect a between-group treatment effect across 5000 permutations. We described outcomes in terms of survival and stroke recurrence, censored at 90 days.
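The dilution mechanism described above can be illustrated with a minimal sketch. It is not the bootstrap procedure used in the study: it swaps in the standard closed-form sample size formula for comparing two proportions, and every event rate in it is a hypothetical placeholder rather than a value from the Virtual International Stroke Trials Archive. It also assumes that misclassified non-CE participants have the same event rate in both arms (no treatment effect), which is a simplification.

```python
import math
from statistics import NormalDist

def n_per_arm(p_control, p_treat, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided comparison of two proportions
    (normal approximation, equal allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treat) ** 2)

def contaminated_rate(p_target, p_contaminant, misclass):
    """Event rate in an arm where a fraction `misclass` of participants
    actually belong to the contaminating (non-CE) population."""
    return (1 - misclass) * p_target + misclass * p_contaminant

# HYPOTHETICAL 90-day event rates, chosen only for illustration.
P_CE_CONTROL = 0.30   # CE stroke, comparator arm (assumed)
P_CE_TREAT = 0.20     # CE stroke, anticoagulant arm (assumed)
P_NON_CE = 0.15       # non-CE stroke, same in both arms (assumed: no effect)

def required_n(misclass):
    """Per-arm sample size at a given misclassification rate."""
    p_control = contaminated_rate(P_CE_CONTROL, P_NON_CE, misclass)
    p_treat = contaminated_rate(P_CE_TREAT, P_NON_CE, misclass)
    return n_per_arm(p_control, p_treat)

if __name__ == "__main__":
    for rate in (0.0, 0.05, 0.20):
        print(f"misclassification {rate:.0%}: n per arm = {required_n(rate)}")
```

Under these assumptions, contamination shrinks the observable difference between arms, so the required sample size grows with the misclassification rate. The absolute numbers printed by the sketch depend entirely on the placeholder rates and should not be compared with the study's results.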
From 4655 titles, we found 14 articles describing three stroke classification systems. The inter-observer reliability of the classification systems varied from ‘fair’ to ‘very good’ and suggested misclassification rates of 5% and 20% for our modelling. The hypothetical trial, with 80% power and alpha 0.05, could detect a difference in survival between anticoagulant and antiplatelet in CE stroke with a sample size of 198 in each trial arm. Contaminating both arms with 5% misclassified participants inflated the required sample size to 237, and 20% misclassification inflated it to 352, for equivalent trial power. For an outcome of stroke recurrence using the same data, the base-case estimated sample size for 80% power and alpha 0.05 was n = 502 in each arm, increasing to 605 at 5% contamination and 973 at 20% contamination.
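To put the reported sample sizes on a common scale, the short sketch below computes the relative inflation over the base case for each outcome. The figures are taken directly from the results above; the dictionary layout and function name are illustrative only.

```python
# Sample sizes as reported in the results above.
REPORTED = {
    "survival": {"base": 198, "5%": 237, "20%": 352},
    "stroke recurrence": {"base": 502, "5%": 605, "20%": 973},
}

def percent_increase(base, inflated):
    """Percentage increase in required sample size relative to the base case."""
    return 100 * (inflated - base) / base

if __name__ == "__main__":
    for outcome, sizes in REPORTED.items():
        for level in ("5%", "20%"):
            pct = percent_increase(sizes["base"], sizes[level])
            print(f"{outcome}, {level} misclassification: +{pct:.1f}%")
```

By this arithmetic, 5% misclassification raises the required sample size by roughly a fifth for both outcomes, while 20% misclassification raises it by about 78% for survival and about 94% for stroke recurrence.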
Stroke aetiological classification systems suffer from inter-observer variability, and the resulting misclassification may limit trial power.
Protocol available at reviewregistry540.