Oxford's oldest student newspaper

Independent since 1920

The Prosecutor’s Fallacy: How flawed statistical evidence has been used to jail innocent people

Joshua Mitchell examines the use of expert statistical evidence in the criminal justice system.

CW: Discussion of murder and infanticide, mentions of rape and alcoholism. 

On the 24th October 2003, Kathleen Folbigg was sentenced to 40 years in prison for the murder and manslaughter of her 4 young children. Branded ‘Australia’s worst female serial killer’, she has spent 18 years incarcerated. The prosecution’s theory was that Folbigg had smothered all 4 children, despite the lack of any medical evidence to support this. The case was made that all 4 children dying of natural causes was so statistically improbable as to be effectively impossible. Unfortunately, this relied on a line of logic known as ‘Meadow’s Law’, which has cost several innocent women their freedom, and which is part of a wider story about the misuse of statistics, and of science generally, in the courtroom.

Meadow’s Law, coined by paediatrician Sir Roy Meadow, states that ‘one sudden infant death is a tragedy, two is suspicious and three is murder, until proved otherwise’. This now discredited “law” was extremely influential and was used by child protection agencies in the UK to assess child abuse. However, its statistical reasoning is fundamentally flawed and has been rebuked by the Royal Statistical Society, though not before it had ruined several lives.

Meadow’s Law was used most infamously in the trial of Sally Clark, an English woman who was convicted in 1999 of murdering both her infant sons. The defence argued that both children had died of cot death, a claim that wasn’t vindicated until years later. Sir Roy Meadow was called as an expert witness and asserted that the probability of two children in the same family both dying of cot death was 1 in 73 million, implying that a conviction for double murder beyond reasonable doubt could be obtained. This figure was obtained by assuming that the two deaths were statistically independent (i.e. that one death isn’t connected to or influenced by the other). That neglects the glaring problem that cot deaths within the same family aren’t independent events: genetic and environmental factors both contribute. There are several other problems with this figure as well, which we will come back to. The statistical evidence was challenged at her second appeal and Clark was eventually released from prison, having served more than 3 years behind bars. She never recovered from her wrongful conviction and the loss of her two sons, and died of acute alcohol poisoning in 2007.
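To see what the independence assumption is doing, here is a minimal Python sketch. It works backwards from the 1 in 73 million figure to the single-death probability that figure implies under independence, then shows how the joint figure shifts if a second death becomes more likely after a first. The multiplier of 10 is purely hypothetical, chosen for illustration, and is not an estimate taken from the trial or from any study.

```python
import math

# Joint figure quoted at trial, as reported in this article.
JOINT_ODDS_QUOTED = 73_000_000

# Under independence, P(both deaths) = P(one death) ** 2, so the quoted
# figure implies a single-death probability of 1 / sqrt(73 million).
p_single = 1 / math.sqrt(JOINT_ODDS_QUOTED)


def joint_probability(p_first: float, p_second_given_first: float) -> float:
    """P(first and second) = P(first) * P(second | first)."""
    return p_first * p_second_given_first


# Independence: the first death tells us nothing about the second.
independent = joint_probability(p_single, p_single)

# HYPOTHETICAL dependence: shared genetic or environmental factors make a
# second death 10 times more likely once a first has occurred.
RISK_MULTIPLIER = 10  # illustrative assumption, not a measured value
dependent = joint_probability(p_single, RISK_MULTIPLIER * p_single)

print(f"Implied single-death probability: 1 in {1 / p_single:,.0f}")
print(f"Joint probability if independent: 1 in {1 / independent:,.0f}")
print(f"Joint probability if dependent:   1 in {1 / dependent:,.0f}")
```

Whatever the true multiplier, squaring a single probability is only valid when the two events are unrelated, and any dependence between them makes the double event correspondingly less rare.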

Sadly, the story doesn’t end there for Roy Meadow. In 2002 he was again used as an expert witness, this time at the trial of Angela Cannings. Cannings had tragically lost 3 of her 4 children to cot death a few years prior, and was tried and convicted of smothering 2 of them, with Meadow’s Law once again the main driver of the conviction. She spent a year in prison before being released following an appeal in 2003. More women still were prosecuted for murder after cot deaths on the strength of Meadow’s testimony, among them Donna Anthony, who spent more than 6 years in prison over the deaths of her 2 children before her release, and Trupti Patel, who was thankfully acquitted at her murder trial.

It isn’t just mothers who are victims of the misuse of statistics in the courtroom, though. More generally, Meadow’s Law is an example of the ‘Prosecutor’s fallacy’. This assumes that the probability that a defendant is innocent given the evidence we observe is equal to the probability of seeing that same evidence given that the defendant is innocent. Perhaps this sounds minor, but it can have profound effects, and is best illustrated with an example. Say a man is accused of a robbery he didn’t commit. Blood found at the scene matches the defendant’s, and this particular blood type is present in only 1% of the population. The prosecution argues that the chance of this blood matching the defendant’s, given that the man is innocent, is just 1%, and therefore that he is very likely to be guilty. While this seems to make sense, the analysis is incorrect. What is relevant here is the probability that the man is innocent, given that his blood matches the blood at the scene. This switch in order seems subtle, but it can have drastic effects on the outcome. If there are 1000 people in the city who could all have committed the robbery, about 10 of them will have a blood type matching the crime scene, and only one of those is the robber. On the blood evidence alone, the probability that this specific man is innocent is therefore not 1% but 90%! This is obviously an idealised example, but very real cases have been litigated with this logic.
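The robbery example can be checked with Bayes’ rule. The short Python sketch below does this under two simplifying assumptions that are mine rather than part of any real case: every one of the 1000 people is equally likely to be the robber before the blood evidence is considered, and the true robber’s blood is certain to match. The function name is made up for illustration.

```python
def p_innocent_given_match(pool_size: int, match_rate: float) -> float:
    """P(innocent | blood matches), assuming a uniform prior over the pool."""
    # Before any evidence, each person is equally likely to be the robber.
    prior_guilty = 1 / pool_size
    prior_innocent = 1 - prior_guilty

    # The robber's blood matches for certain (simplifying assumption);
    # an innocent person's blood matches only by coincidence.
    p_match_given_guilty = 1.0
    p_match_given_innocent = match_rate

    # Total probability of a match, then Bayes' rule.
    p_match = (p_match_given_guilty * prior_guilty
               + p_match_given_innocent * prior_innocent)
    return p_match_given_innocent * prior_innocent / p_match


# The figure the prosecution quotes is P(match | innocent).
fallacious = 0.01
# The figure the jury actually needs is P(innocent | match).
correct = p_innocent_given_match(pool_size=1000, match_rate=0.01)

print(f"P(match | innocent) = {fallacious:.0%}")
print(f"P(innocent | match) = {correct:.0%}")
```

The exact answer comes out at roughly 91% rather than 90%, because on average about 11 people match (the robber plus 1% of the other 999), but the conclusion is the same: the two probabilities are nowhere near each other.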

In 2010, a man named Troy Brown was convicted in the US for the rape of a young girl. The compelling factor for the jury was the claim that only 1 in 3 million random people would have the same DNA profile as the rapist, so there was only a 0.00003% chance that he was innocent. This was a classic prosecutor’s fallacy, which was thankfully overruled on appeal, as it once again assumes that the probability of a DNA match given that Brown was innocent equals the probability of innocence given a DNA match. Given that the other evidence was circumstantial and not particularly strong, and depending on how many other possible suspects there were, the chance of his innocence could be as high as 25%, which is certainly not grounds for a criminal conviction.
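To illustrate how much the answer depends on the number of other possible suspects, here is a rough Python sketch in odds form: the prior odds of guilt are multiplied by the likelihood ratio of the DNA match. Only the 1 in 3 million figure comes from the case as described above. The pool sizes are hypothetical, every suspect is assumed equally likely beforehand, and this is not the calculation that was made on appeal.

```python
# Random match probability quoted at trial, as reported in this article.
RANDOM_MATCH_PROBABILITY = 1 / 3_000_000


def p_innocent_given_dna_match(other_suspects: int) -> float:
    """P(innocent | DNA match) with a uniform prior over all suspects."""
    # Prior odds of guilt: one defendant against `other_suspects` others.
    prior_odds_guilty = 1 / other_suspects
    # A match is certain if guilty, and a coincidence if innocent.
    likelihood_ratio = 1 / RANDOM_MATCH_PROBABILITY
    posterior_odds_guilty = prior_odds_guilty * likelihood_ratio
    # Convert odds of guilt into a probability of innocence.
    return 1 / (1 + posterior_odds_guilty)


# HYPOTHETICAL pool sizes, chosen only to show the trend.
for other_suspects in (10, 10_000, 1_000_000):
    p = p_innocent_given_dna_match(other_suspects)
    print(f"{other_suspects:>9,} other possible suspects: "
          f"P(innocent | match) = {p:.4%}")
```

Under these assumptions, a pool of a million other possible suspects is enough to push the probability of innocence to about 25%, even though the match probability itself never changes.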

Another tragic miscarriage of justice occurred in the Netherlands in 2003, this time involving a nurse named Lucia De Berk, who worked at a children’s hospital. After a series of unexplained deaths at the hospital that occurred while she was on shift, De Berk was charged and sentenced to life in prison. Once again, a problematic but shocking statistic was front and centre in the case. An expert witness, a law psychologist, found that the chance of De Berk being present at so many unexplained deaths was 1 in 342 million, hence De Berk must have played some sinister role in these fatalities. Like the other probabilities mentioned above, this figure was erroneously calculated and rested on sweeping, unrealistic assumptions, which is what produced such an extreme number. For context, subsequent analysis by prominent statisticians, factoring in all the possible biases, concluded that the probability of this sequence of events happening to a nurse at any hospital was approximately 1 in 9. After subsequent appeals, she was eventually found not guilty in 2010, following a review of the statistical evidence as well as of other flawed medical evidence used to convict her. She spent 4 years in prison in total.

This exact scenario is scarily common, and several medical professionals have been charged after a string of unexplained deaths occurred in a hospital setting. Tragically some, like English nurse Ben Geen, are still in prison. Geen was arrested in 2004 over the deaths of several patients over the course of a year. At trial, the prosecution argued that an “unusual pattern” had emerged, and Geen was branded “the nurse who killed for kicks”. Once it seemed that this “unusual pattern” of deaths was occurring under Geen’s care, many more incidents started to be attributed to him, with little exploration of natural causes. This is known as diagnostic suspicion bias. He was charged with 2 murders and with intentional grievous bodily harm against 15 patients. However, the prosecution had failed to consider the likelihood of such a string of incidents occurring compared with the background rate. They let their bias at seeing a potentially suspicious cluster of cases override the actual data, disregarding both the natural explanations for some of the incidents and how common or uncommon this type of cluster actually was. After the trial, a number of prominent statisticians analysed the data with all of this taken into account, and found that the “unusual pattern” simply wasn’t there and that Geen had been prosecuted on baseless grounds. In 2020 a further wave of statisticians came out in support of Geen, but he unfortunately remains in prison to this day.

All the examples presented here are tragic, and most occurred in the late 90s and early 2000s, yet few good remedies have been put in place in the legal system to address this fundamental problem of the abuse of statistical and scientific evidence. Generally, expert witnesses, who are allowed to give scientific evidence and their own “expert” opinion in court, are admitted at the discretion of the judge. A huge problem with this approach, however, is how a judge is expected to know whether the credentials of an academic or medical professional are credible. And how is a judge supposed to know whether a particular witness is actually an expert in the field for which they’re being used? For example, in the De Berk case involving the Dutch nurse, the erroneous figure of 1 in 342 million was put before the court by Henk Elffers, who was a law psychologist and not an expert in statistics. Likewise Roy Meadow, who gave evidence in the Sally Clark case and many others, was an esteemed paediatrician but not an expert in statistics, nor was his ‘Meadow’s Law’ an established truth in the medical community. So perhaps the best solution is that, for an expert scientific witness to be permitted to give evidence in court, they should have to be sponsored by several of their peers in the scientific community, or at least that several academics should advise the judge on which members of a particular field are credible witnesses. Otherwise, the judge is essentially guessing at who’s actually qualified.

Another reason why so many miscarriages of justice have occurred thanks to the misuse of statistics and medical evidence is that science isn’t really meant to be practised in a courtroom. In a trial, quick, definitive evidence is desirable, as is the ability of a single witness to sum up all the data and draw a conclusion. Unfortunately, science doesn’t work that way. To publish a scientific paper, other academics have to check the work in a process called peer review before it enters the scientific literature. This can take months, not to mention the fact that the body of knowledge is always evolving as new evidence emerges. Furthermore, if two scientists give opposing opinions, one for each side, whom does the jury believe? The jury simply won’t know who has the better facts and is giving the more honest assessment of the situation, so they’re essentially going to favour whichever expert laid out their argument in the most convincing manner, regardless of whether that argument is factual or not. A resolution here would be for a report, compiled by leading experts in the relevant field, to be prepared ahead of the trial. For example, if a group of respected statisticians had produced a report summarising the statistical evidence ahead of any of the previously mentioned cases, and had it checked by the wider statistical community, I think it’s fair to say none of the false imprisonments would’ve occurred.

So is ‘Australia’s worst female serial killer’ Kathleen Folbigg really guilty of filicide? An inquiry into her case commenced in 2018, with the judge finding no doubt of her guilt, an assertion that was based on Meadow’s Law and on an interpretation of Folbigg’s diary entries, in the absence of any evidence of smothering. However, in March 2021, a letter to the governor of New South Wales, signed by 90 eminent scientists, demanded her immediate release based on new evidence. It was found that 2 of the children had a specific gene mutation, which is known to cause cot death in infants and likely caused cardiac arrhythmia in the 2 girls. Furthermore, world-leading experts in pathology have collectively given medical explanations for the deaths of all 4 children. If she is released, her case will be the biggest miscarriage of justice in Australia’s history.

Artwork by Rachel Jung. 
