Early warning signals have been proposed to forecast the possibility of a critical transition, such as the eutrophication of a lake, the collapse of a coral reef or the end of a glacial period. Because such transitions often unfold on temporal and spatial scales that can be difficult to approach by experimental manipulation, research has often relied on historical observations as a source of natural experiments. Here, we examine a critical difference between systems selected for study because a critical transition has already been observed and systems for which we wish to forecast the approach of a transition. This difference arises from conditionally selecting systems known to have experienced a transition of some sort and failing to account for the bias this selection introduces, a statistical error often known as the prosecutor's fallacy. By analysing simulated systems that have experienced transitions purely by chance, we reveal an elevated rate of false positives in commonly used warning signal statistics. We further demonstrate a model-based approach that is less subject to this bias than the more commonly used summary statistics. We note that experimental studies with replicates avoid this pitfall entirely.
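
To make the conditioning bias concrete, the sketch below (not taken from the original study; the AR(1) model, threshold, window size and Kendall's tau cutoff are all illustrative assumptions) simulates stationary time series in which no transition is approaching, keeps the runs whose excursions happen to cross a threshold, and compares how often a common warning signal, an increasing trend in rolling variance, fires in the conditioned subset versus in unconditioned segments of the same length.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)

def simulate_ar1(n=1000, phi=0.7, sigma=1.0):
    """Stationary AR(1) noise: no critical transition is approaching."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(scale=sigma)
    return x

def variance_trend_warning(x, window=50, tau_cutoff=0.5):
    """Flag a 'warning' when rolling variance shows a strong increasing trend."""
    v = np.array([x[i:i + window].var() for i in range(len(x) - window)])
    tau, _ = kendalltau(np.arange(len(v)), v)
    return tau > tau_cutoff

segment = 300        # length of record analysed before a putative transition
threshold = 3.0      # crossing this level is treated as an observed 'transition'
unconditioned, conditioned = [], []

for _ in range(500):
    x = simulate_ar1()
    # Baseline: an arbitrary segment, with no selection on whether a crossing occurs.
    unconditioned.append(variance_trend_warning(x[:segment]))
    # Conditioned: keep only runs that crossed the threshold by chance, and
    # analyse the data immediately preceding that crossing, as a retrospective
    # case study would.
    crossings = np.flatnonzero(np.abs(x) > threshold)
    if crossings.size and crossings[0] >= segment:
        t0 = crossings[0]
        conditioned.append(variance_trend_warning(x[t0 - segment:t0]))

print("false-positive rate, unconditioned segments:", np.mean(unconditioned))
print("false-positive rate, segments preceding a chance crossing:",
      np.mean(conditioned) if conditioned else float("nan"))
```

Because every simulated series is stationary, any warning is by construction a false positive; under these illustrative settings the rate is expected to be higher in the conditioned subset, since trajectories that happen to wander toward the threshold tend to show rising variance on the way there.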