A speech about … me

The following speech was given by Aad van der Vaart (boss, colleague, successor, and most importantly, friend) as the first official reply to my farewell lecture.

Mr. Rector, Ladies and Gentlemen, Dear Richard,

It was about 35 years ago that we first met, when I was a Leiden student, or maybe when I applied for a PhD position at CWI, where you were the chief statistician, the head of the Department of Mathematical Statistics, in the middle of the 1980s. I ended up doing my PhD at our beloved Leiden University, but I still had the benefit of your presence, both because CWI at that time was the national meeting place for statistics and because you were a one-day-a-week professor at Leiden. I vividly remember the course you gave, centred on Hadamard differentiability, and the enthusiasm and inspiration that came with it. It gave a window onto scientific facts already known and onto things still to be discovered. You have been an inspiration ever since.

It is my pleasure to speak here today both as a younger colleague and in the more formal role of scientific director of the MI, as your last boss so to say. Let me slip into the second role for a moment and list some basic facts of your career. Your training as a mathematician and statistician was at Cambridge University, but you wrote your PhD thesis in Amsterdam, having followed the love of your life to the Netherlands. Kobus Oosterhoff, my later boss at the Vrije Universiteit, acted as your advisor, and more than once he told me that he had taught you English, while you had taught him mathematics. After completing your PhD, you remained at CWI, but accepted a one-day-a-week professorship in Leiden, where you gave an inaugural lecture under the title “Missed Chances” in 1987. Then in 1988 you were appointed professor of Mathematical Stochastics at Utrecht University. It was only in 2006 that you returned to Leiden University as professor of Mathematical Statistics, thus taking the opportunity to give your valedictory address today in exactly the same room as your inaugural lecture. A lot happened in between. A bit of trivia is that in 2006 you were the first successor of Sara van de Geer, who had left for ETH, while Sara was your first PhD student, graduating in Leiden. During all those years you spent sabbaticals in Denmark (twice; to learn about the real world, as we heard today from NK, a type of world not available locally), Germany, Australia, and Cambridge, and at the Netherlands Institute for Advanced Study in Wassenaar as a Distinguished Lorentz Fellow. Sixteen students wrote a PhD thesis under your guidance, and there I count only those for whom you were the true advisor, not merely the number of theses you signed.

Your scientific work is extraordinarily broad. In the workshop we ran during the past week we had a taste of the very diverse subjects, represented by scientists from equally diverse communities. It was tough at times, and you were probably the only one who could really appreciate all the talks. There is a rough time order in your interests, and we could divide your career into the young Richard, the middle Richard and the late Richard. Since, as I realise, “the late Richard” does not sound so good in English, let me say Richard I, Richard II and Richard III. During the conference week we have seen photographs from these stages, even one photograph showing you doing intense mathematical research while showing off a naked torso, and, alas, time passes, and maybe young and old are also adjectives that could have come to mind.

As Richard I you were concerned with survival analysis and causal inference. This includes your thesis work on the application of stochastic calculus in survival models, which became a well-cited book, and it includes what you yourself have described as the “weight-lifter’s guide to survival analysis”, a beautiful account of the models and methods used to analyse stochastic processes of events in time, widely used to study the effectiveness of medical treatments, and 25 years after its publication still an important reference. A Dutch colleague once described your work to me by confiding: “Richard Gill, every time there is no method available to answer a question, not only does he invent one, but he also builds a completely new mathematical theory for the situation.” Well, I suspected that he had no clue about statistics, but the admiration expressed in this description I found hugely impressive. And it is very justified.

As mathematicians we are usually not so impressed by numbers, I mean counts, of papers, citations, and so on, but rather value quality and depth. There is an ample amount of both in your work, but in my role as administrator let me mention a bit of trivia again. In a recent report in which Leiden University investigated the factors that determine the ranking of the university, an important issue these days, I saw Richard’s name come up on the fairly short list of researchers who make a difference, as an ISI highly cited author. A little footnote was added to his name, though, as in that particular ranking this counted towards the fame of Utrecht University, not Leiden University. Shame on Richard for being a Leiden professor before and after, but not during.

The interests of Richard II came about, I think, as a conscious choice to do something totally different, and of course through genuine interest in the wonders of quantum probability and statistics. Richard was quite ahead of his time. This is something I, and many of my colleagues, have come to understand too late, even after sitting through many introductions to Bell inequalities and other strange facts, which often seemed far removed from ordinary statistics. It is not easy to change a successful career midway, but you managed. One achievement that materialised recently was the development of the statistical methods used to analyse the entanglement experiment in Delft. Quantum information is everywhere now, and I wish I had paid more, or at least better, attention.

The interests of Richard III arrived certainly not as a choice, but quickly became a moral obligation. It started with the case of the nurse, who was not a killing nurse, of course, but a nurse in need of help. It was an intense time, full of emotion. Getting involved was a duty attached to being a scientist, a principle that you did not apply only to yourself, but also pointed out to your colleagues, often. The involvement in this particular case developed into an interest in statistics and the law in general, and in scientific integrity, with concrete cases ranging from deaths in a clinical trial on probiotics to DNA evidence in murder cases, scientific fraud, and questionable research practices. One particularly big case was connected to the UN tribunal for Lebanon, all work done in secrecy, but I do remember the time an impressive London barrister made a powerful speech in our Snellius building, pointing out the importance of the newly developed statistical techniques for international law. Now the big crooks would be caught. Richard himself came under attack too. He may be the only mathematician in Leiden to have been advised by university lawyers on how to avoid huge claims, or to have appeared before the integrity committee to fend off accusations of having damaged the integrity of a very well connected colleague. It is admirable how you coped under all the pressure, with difficulties, but still.

We do not know Richard as being extremely well organised. In presentations, enthusiasm and inspiration may win out over finishing on time. Inspiration, brilliance and creativity are qualities that come to mind sooner than total preparedness. Nevertheless, a look at Richard’s cv reveals an impressive number of organisational achievements: numerous programme committee memberships and chairmanships of really big conferences, president of the Netherlands Statistical Society, scientific secretary of the Bernoulli Society, numerous editorships, council member of the IMS, twice, liaison officer, chair of the examination committee, and many, many more. It is really a very long list. I would also mention that the Statistical Science master’s programme, which we have run for some six years now, was conceived in his office, albeit as a joint effort.

And not to forget, while at some time Richard was the only statistician in the MI, look at the big group that he is leaving behind.

Will you leave it behind? I hope not. The renaming of “the late Richard” to Richard III leaves many possibilities for the future. Some of the speakers already expressed it this week: Richard, we continue to count on your presence and your brilliant mind.

My farewell lecture

September 15, 2017


From killer nurses to quantum entanglement

Being a statistician has been, for me, a tremendous privilege. It has been an opportunity to enjoy doing mathematics in the most varied fields of application you can imagine, combining the joy of discovering mathematical beauty with the satisfaction of contributing to real-world problem solving, as well as the excitement of learning about new fields, perhaps far from mathematics. Over the years I’ve worked both on fundamental problems within mathematical statistics and on rather specific problems from varied applied fields. Today I want to tell you some stories about my most recent experiences in two outside areas: criminal law and quantum physics.

“Forensic statistics” has become an established field, with its own research community, its own journals and conferences. An apparently strong degree of agreement among practitioners has emerged as to what a statistician is supposed to do, when asked by a court to quantify the weight of evidence. Courts are becoming accustomed to having to evaluate statistical or probabilistic evidence, as epitomized by the probability of a chance DNA match.

My introduction to this field was the case of the Dutch hospital nurse Lucia de Berk, sentenced to life imprisonment for the alleged murders and attempted murders of her patients. The case was triggered by the supposedly unexpected death of a six-month-old baby at the Juliana Children’s Hospital in The Hague, a few days before 9/11. Lucia’s initial conviction was largely based on a statistical analysis of the distribution of incidents on her ward between shifts when she was on duty and shifts when she was not. This generated data which could be summarized in a 2×2 contingency table. Later, data from two other wards at a hospital where she had earlier worked were added. The standard analysis (Fisher’s exact test, based on the hypergeometric distribution) suggested that many, many more incidents occurred in Lucia’s shifts than should be expected by chance. Anyone could understand the argument. It was enough for the judges at the lower court in The Hague.
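The hypergeometric calculation behind such a Fisher exact test is easy to reproduce. Here is a minimal Python sketch; the shift and incident counts below are purely illustrative, not the actual case data:

```python
from math import comb

def fisher_one_sided(shifts_with, incidents_with, shifts_without, incidents_without):
    """One-sided Fisher exact p-value: the probability, under the null
    hypothesis that incidents fall on shifts at random, of seeing at least
    the observed number of incidents on the nurse's shifts."""
    shifts = shifts_with + shifts_without
    incidents = incidents_with + incidents_without
    tail = 0.0
    for k in range(incidents_with, min(incidents, shifts_with) + 1):
        tail += (comb(shifts_with, k)
                 * comb(shifts - shifts_with, incidents - k)) / comb(shifts, incidents)
    return tail

# Illustrative numbers only: all 8 incidents fall on the 140 (of 1000) shifts
# on which the nurse was present; the tail probability is tiny
p = fisher_one_sided(140, 8, 860, 0)
```

The point of the rest of the story, of course, is that a tiny p-value here says nothing if the incident list itself was compiled with the nurse in mind.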

On appeal, the verdict was the same but the argument adopted by the court completely changed. Instead of statistical calculations there was now new, apparently hard medical evidence. Baby Amber had died of an overdose of digoxin; Lucia had opportunity and motive to administer it. This made her a murderer and a liar. Successive cases required successively less evidence to convince the court that here too, Lucia was murdering her patients: the famous “chain argument”.

At some point brother and sister Metta de Noo (a medical doctor) and Ton Derksen (emeritus professor of philosophy of science) started to reach the media with claims that the case had been completely bungled. A book by Derksen exploded the reasoning of the court and cast doubt on the medical “facts” of the case. A popular movement calling for a retrial began to grow, and a number of statisticians were among the first to loudly voice their support. One of the important things to do was to neutralise the effect of the “one in 342 million” chance that had been part of the initial conviction. It had had a profound effect on everyone’s thinking about the case, yet it turned out to have been a fantasy. Aside from a major technical error concerning the proper way to combine the results of three 2×2 tables (the product of three p-values is not a p-value), doubts arose as to the integrity of the data used in the analysis. The data basically consisted of two lists: at which shifts was Lucia present, and at which shifts was there an incident. How were those lists compiled? What is an incident?
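The remark about the product of three p-values can be made concrete. If the three tests are independent and their p-values are uniform under the null, the probability that their product falls below an observed value has a closed form (essentially Fisher’s combination method); a sketch with illustrative numbers:

```python
from math import log, factorial

def p_of_product(c, k):
    """P(U1 * ... * Uk <= c) for k independent Uniform(0,1) p-values:
    the tail probability that the naive product c should be replaced by.
    Closed form: c * sum_{j<k} (-log c)^j / j!"""
    L = -log(c)
    return c * sum(L ** j / factorial(j) for j in range(k))

naive = 0.05 ** 3                 # simply multiplying three p-values of 0.05
correct = p_of_product(naive, 3)  # the genuine combined p-value
```

Treating the raw product as if it were itself a p-value overstates the evidence by a large factor, here about fifty.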

A formal definition was never made. Most deaths and some but not all reanimations were included. It is hard to escape the conclusion that the main ingredients were (a) some element of medical surprise, and (b) Lucia’s presence. Since no-one has ever gone back to the original hospital records, taken an objective criterion, and checked every single shift, we will never get to see objective data. What we do know now is that the list was compiled by medical doctors at the hospital who were highly suspicious of Lucia, who had already been the subject of malicious gossip for half a year at the time of Amber’s death.

After a long struggle the case was reopened, the court focussed on the medical evidence only, and Lucia was completely exonerated.

The president of the court in The Hague has several times called for some kind of post-mortem into the Lucia case. I believe that the court actually behaved quite wisely and carefully, but that it was fooled by what was effectively a subconscious conspiracy of hospital staff (doctors and administrators) who fed one another half-truths concerning a number of cases where real medical errors had been made, and who found it easy to connect bad treatment outcomes, unexpected at the time, to a notable nurse. As a consequence of the social mechanisms operating inside the hospital, the data (in a broad sense) passing from the hospital to police investigators, and later to a judicial investigation, were always strongly biased. How could an outsider have known?

It was this background that got me involved in two more “serial killer nurse” cases: Ben Geen in England, Daniela Poggiali in Italy. I was shocked to the core by the similarities I found in those cases with the Lucia case; but shocked even more that it seemed impossible to have any impact on the outcome. Which raises the question: did we actually have any impact on the outcome of Lucia’s case? How did it come about that her conviction was overturned? More constructively: what should the next statistician do who gets involved in a case like this?

I’ll talk today about the British case only. This case played out during the middle years of the Lucia events, but as far as I know, nobody had ever made a link between the two. In 2004 Ben Geen was charged with causing grievous bodily harm, resulting in two deaths, to 18 patients at Horton General Hospital, Banbury, during a few winter months (December 2003 – February 2004). He was ultimately convicted in 17 of the 18 cases. In 7 of these cases, respiratory arrest was claimed to have been caused by the administration of a muscle relaxant. There was essentially no direct evidence of any wrongdoing by anyone, let alone by Geen.

The case seemed to turn on the question whether or not “respiratory arrest” is an unusual occurrence in a hospital’s accident and emergency unit. Lawyers for Geen, attempting to get the case reopened by the UK’s CCRC (Criminal Cases Review Commission), used FOI requests to obtain data from a large number of hospitals giving the monthly numbers of events of various kinds over a period of nearly 10 years. I made an analysis of these data and concluded that, though respiratory arrest is three times less common than cardiac arrest, it certainly cannot be called “rare”. However, this whole exercise turned out to be a wild goose chase. On the one hand, it was already known that the number of respiratory, cardiac and hypoglycaemic arrests causing sudden transfer from A&E to intensive care was only one larger than in the previous year (7 versus 6; actually, the number of admissions was 10% larger as well). This was offered as an excuse for why the hospital had not been able to stop Geen earlier: the number of such events was close to what one would expect in the winter months. Cardiac arrests were strikingly down, respiratory arrests strikingly up. “Publicity bias” (aka “awareness bias”)? Once one case has been so diagnosed, we observe many more; and the distinction between the two diagnoses is often not clear cut. The main reason the CCRC used to turn down the application to reopen the case was that the key point was not “respiratory arrest” per se, but “unexplained respiratory arrest”. Indeed, a key prosecution witness, a professor of anaesthesiology, had stated that unexplained respiratory arrest is very rare: in fact, he had never experienced such a case.

So where did the 18 cases of the charge come from? They were selected by hospital doctors alarmed by a “trigger case”, going on a trawling expedition to find out “which other patients did Ben harm”. We are back with Lucia: medical doctors who are already suspicious of a nurse, themselves deciding case by case whether what they see can be medically explained or not … knowing, case by case, whether or not a particular nurse, already suspected of murder, was present. In fact, they were only investigating the patients of the nurse already under suspicion! Experts called by the defence did not find any of the 18 events suspicious. Some were difficult to understand; even the prosecution experts could not decide what Ben Geen was supposed to have done to the patient!

In the Netherlands, a board of wise judges weighs all the evidence. In England, a jury is directed by a judge. In the Ben Geen case, the judge gave the jury pretty clear directions as to what they should think. In particular, the evidence from a medical statistician called by the defence, which addressed the multitude of ways that bias can enter medical diagnosis, was discarded by the judge on the grounds that “it was barely more than common sense”.

Ben Geen, like Lucia, stood out in the crowd. He had been associated by his own colleagues with unexpected events: they gave him the nickname “Bev Allitt”, after an earlier UK case similar to Lucia’s. His career aim was to join the army and he was keen to get tough experience. And (supposing he is innocent) he made a terrible mistake. He arrived at the hospital with a half-full syringe of muscle relaxant in his pocket at exactly the moment when the police were waiting there to interrogate him. His claim: he had accidentally taken it home in his nurse’s scrubs after a particularly chaotic day in Emergency. His girlfriend, also a nurse, had found it and told him to return it to the hospital to be disposed of properly.

I have noticed that people’s judgement of the Ben Geen case depends almost entirely on whether or not they find his claim believable (I do). And indeed, this is the hardest piece of evidence in the whole case: the medical evidence on all those 18 patients is truly wafer thin.

So, what did I learn from these experiences? What is a forensic statistician to do, the next time around?

The current dogma in forensic statistics is “compute the likelihood ratio”: the ratio of the probability of getting the data (for instance, a DNA match) in the case that the suspect is guilty, to the probability of getting the data in the case that the suspect is innocent. There are many challenging issues with this paradigm. Is “the probability” as well defined as the words suggest? That depends on what we mean by probability, and this is where a major conflict between concepts comes to the fore. In short: does probability mean “personal degree of belief”, or is it an objective property of a physical system? In either case, anyway, do we know it? If statisticians are going to report probabilities in court, they are going to have to explain to the court what they mean by probability. Unfortunately there are several meanings available. As a mathematician, one can be neutral: the *rules* of probability are the same. However, as soon as one wants to apply this mathematics in practice, one has to make a choice. Presently, the community seems to be moving to an uneasy compromise position: Bayesian (subjective) probabilities are fine, but the prior probabilities should be chosen objectively. This allows both data-dependent priors and priors based on principles of “non-informativeness”. In both cases, however, the “pure” Bayesian position has to be abandoned. The court is not asking the statistician for their actual personal belief, but for the beliefs of a mythical independent, capable, well-informed scientist. These beliefs must be scientifically reproducible: from the same assumptions as to prior knowledge, anyone gets the same statistical conclusions, and everyone agrees that the assumptions are justified.
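The arithmetic of the likelihood-ratio paradigm is just Bayes’ rule on the odds scale; what is contested is where the ingredients come from. A minimal sketch with purely hypothetical numbers (a random-match probability of one in a million, and prior odds of 1 to 10,000):

```python
def posterior_probability(prior_odds, likelihood_ratio):
    """Bayes on the odds scale: posterior odds = prior odds * likelihood ratio,
    then converted back to a probability."""
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothetical DNA match: random-match probability 1e-6, so LR = 1e6;
# with prior odds of 1 to 10,000 the posterior odds become 100 to 1
lr = 1 / 1e-6
prob = posterior_probability(1 / 10_000, lr)
```

The mathematics is trivial; the difficulty the dogma runs into is that both the likelihood ratio and the prior odds have to mean something reproducible to the court.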

For serial killer nurse cases, we are far, far away from being able to write down well-justified and mathematically tractable statistical models for nursing rosters and patient admission and discharge (or death) data. Even if we could resolve the issues of Bayesian versus frequentist approaches, there is no way to follow the likelihood ratio dogma. I think that each case needs to be studied on its own merits.

What is really important is to understand the generation of the data. Data generated by a hospital in the course of its own investigations into a suspected killer nurse is suspect data. It is data prepared by a biased witness (and possibly even by a culprit). The statistician has to convince judges, defense lawyers, journalists to be suspicious. Do not take data at face value. Hospitals are not independent forensic research institutes. Hospital doctors are not forensic scientists.

Also in the Italian case I mentioned, that of Daniela Poggiali, we discovered extraordinary anomalies in the data. It became absolutely clear that the time of death written in hospital records is the result of a complex process influenced as much by administrative procedures as by medical truth. One cannot take hospital records at face value.

I’m convinced that Ben Geen is innocent. His case seems hopeless. There is no hard medical evidence. The statistical evidence is in his favour. But it has all been heard in court and successive juries have found him guilty. His only chance seems to me that a new medical analysis of some of the cases of his alleged victims turns up convincing medical evidence that Ben had nothing to do with the alleged medical incident. For this it will be necessary to get powerful medical supporters on Ben’s side. I don’t see it happening. The syringe is firmly in the way.

Did the efforts of so many outsiders, laypersons and scientists, save Lucia? I’m now inclined to believe that the success of the Lucia story was actually to a large extent a matter of extremely good luck, just as the initial case against her was based on what for her was largely bad luck. The surplus of deaths on her shifts was real, but it was just bad luck. It was part of a chain of events which can be understood as a social phenomenon: a witch hunt led to a witch trial. Everybody involved played their part and the wheels of justice turned and did their job. How did Lucia get freed again? Mainly through the extraordinary good luck that a whistle-blower from medicine, Metta de Noo, became involved. Metta had inside knowledge through a personal connection at the hospital, as well as strong and broad medical expertise. I’m afraid that Ben Geen will remain lost as long as no influential, authoritative medical expert stands up on his behalf.

There is a book to be written about these and other cases. Not just about statistics but also about social psychology and the modern medical world.

Let me turn to a more cheerful topic. Over the years I have kept coming back to the famous Bell inequalities and Bell’s theorem: a theoretical analysis made by the physicist John S. Bell in 1964, which shows that quantum physics is dramatically different from classical physics. It predicts phenomena which could only occur in a classical physical world with the help of “action at a distance”, which roughly means that changes you make to a physical system in one place can be felt (can be noticed) at other, distant locations, instantaneously.

Just as shocking would be the deduction that information is being transmitted faster than the speed of light. But what does “changes can be felt” mean, exactly? The correct deduction is that *if* the quantum predictions have a classical physical explanation, *then* action at a distance takes place in a hidden world “behind the scenes” which is only partially visible to us. Action at a distance at a hidden level is needed to explain the observed facts in a mechanistic way. But there are no observable instantaneous changes at one place due to changes at some distant location. There is information flow in the hidden world behind the scenes which allows coordination of events at distant locations, but without information flow occurring in the outside world.

Bell’s 1964 analysis was theoretical. He described a thought experiment and contrasted the predictions of quantum mechanics with the restrictions which a classical picture of the experiment would entail. But can we actually do the experiment? It took more than 50 years before physicists were able to successfully perform a rigorous experiment. In recent years they were on the brink, and several groups around the world were racing to be first. At last, in 2015, Delft won the race; Vienna, NIST and Munich were close behind. The experiment needs a statistical analysis, and one of the statistical ideas in that analysis had been contributed by me a number of years earlier. I will try to explain it to you.

The actual experiment involves lasers, photons, mirrors and crystals; I’ll replace these with some persons playing a game called the Bell game.

Many times (rounds), the following is repeated. Alice and Bob, in separate rooms, each receive a key. Each key may be silver or it may be gold. Which keys they get is completely random. They don’t know the other’s key, only their own. Now they may each use their key just once, turning it clockwise or anticlockwise. Their aim is to open a box and share the prize inside. The box only opens in the following circumstances: if one or both keys are silver, then both keys have to be turned in the same direction; but if both keys are gold, then they have to be turned in opposite directions. The trouble is, Alice does not know which key Bob has, and Bob does not know which key Alice has. In order to maximise the chance of getting the box open, it would seem wise to coordinate their actions in advance, telling each other what they’ll do under either eventuality. It is not difficult to see that any such plan results in failure at least one time out of four.

There are just 16 different plans, because each plan is an assignment of clockwise or anticlockwise to each of the four cases: Alice silver, Alice gold, Bob silver, Bob gold. We could just list them and check, for each plan, in which of the four eventualities it succeeds and in which it fails. That’s tedious but easily doable. But I’ll try to give a shorter analysis.

One of the 16 plans is: Alice and Bob both turn clockwise, anyway. It fails for gold, gold, but wins otherwise.

To compensate for the gold-gold failure when both Alice and Bob always turn their key the same way, whatever it is, it’s natural to modify the plan by saying: the same (always clockwise), except that when Bob gets gold, he turns his key anti-clockwise. This plan now fails for the combination silver-gold, wins otherwise.

Trying out a few more combinations, one quickly makes the following discovery. Every time we change just one assignment, we reverse the status (win/lose) in exactly two of the four silver/gold combinations of Alice’s and Bob’s keys. So every change preserves the parity of the number of winning combinations, and since the all-clockwise plan wins three of the four, every plan wins an odd number: either three out of four times or one out of four times, nothing else. In particular, always winning is impossible, and the best winning chance is three out of four. There are eight ways to achieve this; the other eight of the sixteen plans win only once in four times.
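The tedious check of all 16 plans is also a few lines of code. In the sketch below, keys and turns are coded as 0/1 (key: 0 = silver, 1 = gold; turn: 0 = clockwise, 1 = anticlockwise), so the box opens exactly when the turns differ if and only if both keys are gold:

```python
from itertools import product

def wins(plan_a, plan_b):
    """Number of the four equally likely key combinations won by a pair of
    plans; plan[key] gives the turn chosen for that key."""
    return sum((plan_a[a] ^ plan_b[b]) == (a & b)
               for a in (0, 1) for b in (0, 1))

scores = [wins(pa, pb)
          for pa in product((0, 1), repeat=2)
          for pb in product((0, 1), repeat=2)]
# Exactly eight of the 16 plans win 3 of the 4 combinations; the other eight win only 1
```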

Alice and Bob might use those different plans with different probabilities. The average success rate will be something between a quarter and three quarters. It can’t be more.

Is there anything else which Alice and Bob could do? What if we replace Alice and Bob by computers running some computer programs? What if we replace them by arbitrary physical systems?

The argument that it’s not possible to better a 75% success rate goes through as long as we can argue that Alice and Bob might just as well decide, in advance, what each is going to do in each eventuality. For instance, a random choice of direction, depending on which key Alice is given, might just as well be implemented by making a random choice for each possible key, in advance. So using computers or physical systems to implement random choices (even with probabilities depending on the key) is just the same as choosing one of the 16 fixed plans at random (perhaps according to some very elaborate probability distribution).
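One can also check this empirically. In the Monte Carlo sketch below, a couple of hundred randomly drawn local strategies (each side turning anticlockwise with some probability depending only on its own key) are each played for a few thousand rounds; up to sampling noise, none beats the 75% barrier:

```python
import random

random.seed(1)  # reproducible runs

def play_round(p_a, p_b):
    """One round: keys 0 (silver) / 1 (gold) are drawn at random; p_a[key] and
    p_b[key] are each side's probability of turning anticlockwise given its key.
    Returns True if the box opens (turns differ exactly when both keys are gold)."""
    a_key, b_key = random.randint(0, 1), random.randint(0, 1)
    a_turn = random.random() < p_a[a_key]
    b_turn = random.random() < p_b[b_key]
    return (a_turn != b_turn) == (a_key == 1 and b_key == 1)

best = 0.0
for _ in range(200):  # 200 random local randomized strategies
    p_a = (random.random(), random.random())
    p_b = (random.random(), random.random())
    rate = sum(play_round(p_a, p_b) for _ in range(2000)) / 2000
    best = max(best, rate)
# best stays at or below about 0.75 (plus sampling noise)
```

Since any such randomized strategy is a mixture of the 16 deterministic plans, its expected success rate is a weighted average of values that are all at most 3/4.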

At least, this is true if we use physical systems whose physics satisfies a property called realism; some people prefer the term “counterfactual definiteness”. It means: if a physical system can be measured in two different ways, then it is possible to imagine the outcomes of both measurements, even if only one can actually be performed. The potential outcome of the measurement not performed still exists (in a mathematical sense), even if it is never revealed. The choice of measurement merely selects which of two pre-existing values gets to be seen.

This would certainly be true of computers set up to simulate our experiment. And it is an obvious property of all pre-quantum physical theories. The final state of a physical system depends deterministically on its initial state. We may not be able to do the computation, or we may not know the initial state, and tiny variations in initial state might have enormous impact on the outcome, but even if the outcomes are effectively random we can still understand their statistics from deterministic reasoning. The physics is essentially deterministic.

Quantum mechanics was a revolutionary step in physics since it never even attempts to explain why what happens does happen. It merely tells us what are the probabilities of what can happen. And since the birth of quantum mechanics, physicists have dreamt of eventually coming up with theories which explained how those probabilities arose, as the reflection of uncontrolled and unknowable initial conditions of a richer and essentially deterministic system. At least, that was the dream, one might say, till Bell came along.

Bell not only showed that classical physics did not allow a bigger success rate than 75%, but also showed that using randomisation devices involving a phenomenon from quantum mechanics called entanglement it was theoretically possible to break the 75% success rate barrier. This means that if the randomness in this experiment is merely the reflection of random initial conditions in hidden layers of a richer, deterministic, description of what is going on, then that deterministic description exhibits action at a distance, in short “non locality”.

Bell described a schematic experiment which since 1964 has actually been performed many times, confirming quantum mechanics and achieving a higher than 75% success rate; however, till 2015, all the experiments suffered from major defects. A higher success rate could mean that Alice’s key is somehow known to Bob, and vice versa. To rule out any mundane explanation it is necessary to choose Alice’s key (silver or gold) at random, at Alice’s location, just before Alice turns it clockwise or anticlockwise, and within such a short time interval that there is no way that what is happening at Bob’s place can be known before Alice is done. So Alice and Bob should be far apart in space, while the time interval between creating inputs and obtaining outputs must be small.

Satisfying these constraints makes the experiment "loophole free". And performing a successful, loophole-free experiment had been a holy grail in quantum physics for many decades … till 2015 and the Delft experiment.

In Delft, at each of the two locations there is a diamond with a nitrogen-vacancy defect, home to an electron spin which can be manipulated and interrogated with lasers. Before each round, the two spins, in laboratories at opposite ends of the campus, are brought into their entangled state by a process called entanglement swapping, which involves tricks with lasers, mirrors and crystals at a central location as well as at the two laboratories. Call this phase "preparation"; it is the stage at which a strategy for Alice and Bob can be thought of as being established. Then the inputs are fed into the spins and the decisions are read out, all within a time interval so tiny that it prevents any kind of communication by any subluminal means.

In Delft, the game was played 245 times and the success rate was 80%. That is just statistically significantly bigger than 75%. The result has been confirmed by other research groups, so I think we can be quite sure that the Delft result is not a mere chance fluctuation. By the way, according to quantum mechanics the highest possible success rate is about 85% (more precisely: one half plus one quarter of the square root of two). There has been significant progress in recent years in showing that any physical theory which allowed still higher success rates would have counter-intuitive features even weirder than quantum mechanics.
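The claimed significance can be checked with an exact one-sided binomial test. A sketch, assuming 245 independent rounds with null success probability 0.75, and rounding the reported 80% rate to 196 wins (the experiment's exact count may differ):

```python
from math import comb

# Null hypothesis: each round is an independent win with probability 0.75,
# the classical (local hidden variables) bound.
n, p0 = 245, 0.75
k = round(0.80 * n)  # 196 wins, my rounding of the reported 80% rate

# One-sided exact binomial tail: P(X >= k) under the null.
p_value = sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))
print(f"P(X >= {k}) = {p_value:.4f}")  # about 0.04: just significant at the 5% level
```

This matches the "just statistically significant" description: the p-value sits below, but not far below, the conventional 5% threshold.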

Since Delft achieved an 80% success rate, statistically significantly larger than the theoretical long-run optimum of 75%, must we conclude that those two quantum spins actually communicate with one another, at superluminal speed? (That is what the newspapers would have you believe.) Following Bell, I have shown you that any classical explanation of the process whereby query is converted into response would need superluminal communication. That is because we can imagine that after the joint system is prepared, each component is cloned, and each copy gets a different one of the two inputs; afterwards we simply select whichever copy is needed in the specific case at hand. This thought experiment shows that the whole system acts as if it had just chosen one of those sixteen decision rules and then followed the rule blindly. But then it would only have had a 75% success rate. The only way to do better, classically, is to have prior information as to what the inputs will be, and this requires communication between the two locations.

But there is an alternative, which is to suppose that classical explanations of the underlying process do not exist. Instead, the quantum mechanical description is the bottom line. The measurements of those quantum spins are truly random. Quantum systems cannot be cloned. The two responses to the two possible inputs are not predetermined; instead, the one which is asked for is created afresh. It is irreducibly random: unlike the outcome of the toss of a coin or a die, which is a deterministic function of the initial conditions of the throw.

Quantum randomness is actually a creative phenomenon: in a world run according to the laws of quantum mechanics, it makes possible things which would be impossible in a classical physical world. Indeed, the up-to-85% success rate possible in the Bell game can be harnessed to give applications, in cryptography for instance, which are impossible in a classical world.

Quantum physics is a far more radical departure from classical physics than most physicists imagine. Randomness has become a "ground-level" part of the description of reality; it is not an emergent feature. Irreducible randomness is necessary in order to reconcile the statistics of the Bell game with no action at a distance. Quantum randomness is real and needs to be included in the axiomatic ground floor of physics, not added as an afterthought. Irreducible randomness creates barriers in some cases (there are a number of famous "impossibility" theorems in quantum mechanics, such as the no-cloning theorem), but it also creates opportunities, new possibilities.

I still need to explain the contribution which I made to this history. Quite a few years ago I was engaged in discussions with a well-known mathematician who believed that the Bell experiment could be explained in classical physical terms. In fact, he believed that a successful experiment could even be simulated on a network of ordinary computers. I was sure that he was wrong, and came up with the idea of making a bet. The idea was to play the Bell game with a network of computers. My opponent was to have complete freedom in programming the three computers. I insisted on the inputs being supplied completely at random, by a trusted third party. And I proposed a number of rounds and a win/lose criterion: for instance, at least an 80% success rate over 10 thousand rounds. I wanted to be pretty certain of winning, and therefore I had some worries. As the game progressed, my opponent might learn from the earlier rounds and adapt his strategy. There is simply no guarantee that the rounds are independent and the probability of winning each round constant; everything might change as time goes on. Moreover, my opponent is just running programs on classical computers: the results are deterministic. How can I do any probability calculations to find out whether my bet is safe or not?

The answer was to use the probabilities which are under my control: the repeated, random choices of inputs. I noticed how to express the win/lose criterion of a traditional statistical analysis in terms of a simple count of successes and failures over the rounds, and how to use martingale theory to get probability bounds on the final total number of successes. The point here is that the argument that the success probability is at most 75% also applies to each round separately, given all information gathered in preceding rounds; and the probability comes from the deliberate randomisation, not from the physics. To my delight physicists have taken up (and both simplified and refined) this idea, so that finally the statistical conclusions of the Delft experiment, as well as of the others, cannot be fought on the grounds that time dependence and time variation invalidate the statistical analysis, as long as one is prepared to trust the randomness of the inputs in the experiment: the keys.
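The martingale idea can be sketched with the Azuma–Hoeffding inequality: if the conditional probability of winning each round, given the whole past, is at most 0.75, then the centred success count is a supermartingale with bounded increments, and the chance of an adaptive classical opponent reaching my proposed win criterion is tiny. This is a simplified sketch, not the refined bounds the physicists later developed:

```python
from math import exp

# Azuma-Hoeffding: if each round's conditional success probability (given all
# earlier rounds) is at most p0, then S_n - p0*n is a supermartingale with
# increments bounded in an interval of length 1, and
#     P(S_n >= p0*n + t) <= exp(-2 * t**2 / n).
def hoeffding_bound(n, p0, observed_rate):
    t = (observed_rate - p0) * n  # excess successes over the null mean
    return exp(-2 * t**2 / n)

# The bet: 10,000 rounds, win criterion at least 80% successes, null bound 75%.
bound = hoeffding_bound(10_000, 0.75, 0.80)
print(bound)  # exp(-50), about 2e-22: the bet is safe even against an adaptive opponent
```

Crucially, the bound needs no independence or stationarity between rounds: only the round-by-round conditional bound of 0.75, which comes from the trusted randomisation of the inputs.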

Coming to the end of my lecture I would like to briefly connect serial-killer nurses and quantum entanglement. There is an argument that nurse Lucia de Berk was actually saved by quantum statistics, and here it is. At some point I set up an internet petition asking for a retrial in the case of Lucia de B. I canvassed for support and was able to use my quantum connection to Gerard 't Hooft, Nobel Prize winner, to have a conversation with him at which I asked him if he too would sign my petition. He said he would think about it. And after a weekend in which, I believe, he had consulted his family, some of whom are medical doctors, he signed the petition, leaving a remark which went to the heart of the problem. This was noticed in the media and, I suspect, at the top of the Netherlands' legal system. It meant that in the subsequent legal proceedings some of the best people were involved and the case was handled with care and respect.

I want to conclude my lecture with some words of thanks. Leiden University has been the perfect environment in which I could follow my scientific instinct wherever it led. The Faculty of Science, and especially the Mathematical Institute, has been a warm home where teaching and research stimulate one another. Working together with students and PhD students is motivating and rewarding. Official retirement will not put an end to it, I am sure.

In particular, during the last years the creation of the master programme "Statistical Science" has been a dream come true, the driving force behind it being not me but Jacqueline Meulman. I think statistics has a brilliant future in Leiden, cutting across traditional department and faculty divisions.

My family (especially wife and children) and friends have had to put up with my obsessions whether with nurses or the quantum or whatever. I hope that in the future I’ll be a little more attentive to you.

I have just been treated to a wonderful symposium, organised by Aad van der Vaart, Peter Grünwald and Giulia Cereda, at which there were beautiful contributions from many colleagues around the world, both former students and former teachers, with whom I have so enjoyed working over the years.


Paul Snively (http://psnively.github.io/blog/2015/01/22/Fallacy/#comment-1881981921) asks:


“I think Christian’s response #4 is no response to the critique on his work, because it responds to a non-issue. So it may have some interesting content but it is besides the point: whether or not Bell was wrong? I think you don’t understand Bell’s theorem well enough hence you think that Christian came up with a smart answer to his critics.”

You don’t have to “understand” Bell’s theorem, as such, at all to demonstrate how it fails. I’m willing to break it down, step by step, if you are. Disqus hasn’t uttered a peep about our extravagant consumption of their resources—yet. 🙂 I will ask one yes-or-no question per comment. You may answer with either “yes” or “no.” If you wish to elaborate, please do so on your own blog or other freely publicly accessible forum of your choice and provide a link along with your “yes” or “no.” Let’s begin.

Do you understand that Dr. Christian’s work consists of two arguments:

1. Bell’s theorem fails to be a no-go theorem.
2. Given 1), here is a locally realistic model that predicts what we find in quantum mechanical experiments.

Yes or no?

My answer was no and I gave three links: http://arxiv.org/abs/1207.5103, http://arxiv.org/abs/1203.1504, http://arxiv.org/abs/1412.2677

However, three links were too many, so now I give only one, to this blog.

Actually, first of all I answered yes, but with a proviso: I do understand, of course, that this is the intended content of Christian's work, but on the other hand both arguments are actually wrong. Paul's question was "ill-formed", a bit like "when did you stop beating your wife?" (you must reply by giving a date, otherwise I will delete your answer from my blog).

(1) is wrong: see J.O. Weatherall (2013). The Scope and Generality of Bell’s Theorem.
Found. Phys. 43, 1153–1169.

(2) is wrong: Christian's "model" is logically flawed. Simulate the model in the one-page paper and it certainly does not reproduce the singlet correlations. But anyway, who cares: Bell's theorem proves it is impossible. It is impossible to simulate the singlet correlations with a local hidden variables model under the rigorous constraints of a decent Bell-CHSH type experiment. Bell's theorem can be seen as a theorem about distributed (classical) computing: what can be computed, and what cannot. It tells us, in view of the fact that nature apparently can violate Bell inequalities (i.e. according to quantum mechanics), that nature cannot be understood, even approximately, as a discrete stochastic classical automaton.

The difficulty with answering yes/no questions is that both answers can often be wrong. Either answer, without proviso or explanation, can be misleading. “Ask a stupid question, get a stupid answer”. This is why when you ask a Zen master a question he answers yes but nods his head for no (or vice versa).

Next time I’ll answer “yes and no”.

Paul Snively shows by his questions, so far, that he does not know what he is talking about. Which is in itself interesting, of course. And nothing to be ashamed of, either.

The UK Lucia

Today saw the publication of a paper entitled “In Search of the ‘Angels of Death’: Conceptualising the Contemporary Nurse Healthcare Serial Killer” by Elisabeth Yardley and David Wilson (Birmingham City University, Centre for Applied Criminology). It appears (2014) in Wiley’s “Journal of Investigative Psychology and Offender Profiling” (J. Investig. Psych. Offender Profil.) http://onlinelibrary.wiley.com/doi/10.1002/jip.1434/abstract

I have many concerns about this paper, but my first is the following: it is based on 16 past cases, at least 2 of which seem almost certain to have been rather serious miscarriages of justice. In fact, an earlier version of the methodology proposed by the authors was used against Lucia de Berk: the prosecution brought in an FBI expert who had contributed to one of the first publications on HCSKs (Health Care Serial Killers) … and in which he made use of his inside knowledge of the Lucia case.

So looking at this sample, N = 16 with at least 2 false positives (Colin Norris, Ben Geen), what can be concluded? (Probably at least 3 false positives: the Finnish case of Aino Nykopp-Koski is highly controversial.) The checklist of "red flags" certainly identifies those who are going to be successfully prosecuted as an HCSK. In fact, the reason they are being prosecuted is precisely because of some of the red flags. Others are subjective evaluations by fellow nurses and doctors, contributed to the prosecution case *after* it has started. The most pernicious point is that an investigation into a suspected HCSK starts internally in a hospital and is carried out by doctors who themselves, formally, should be seen as suspects (their own patients are involved…). How objective will they be? What we know from the cases of Lucia, Colin Norris and Ben Geen is that in this situation doctors trawl for recent "suspicious" cases, rewriting or at least re-explaining medical events which till then had not been considered suspicious at all. This results in a compelling dossier which goes to the police … media, police and courts do the rest.

The appearance of this landmark paper was reported in the Guardian newspaper. Ben Geen got some free publicity, the newspaper even included his photograph, taken during trial or appeal. Don’t expect a photograph of a young man in the middle of an appalling nightmare to present a “picture of innocence”. http://www.theguardian.com/uk-news/2014/nov/22/study-identified-key-traits-serial-killer-nurses

Here is the checklist. Lucia had an enormously high score according to the evidence presented to the court. But not if the evidence is "corrected" to correspond with the truth.

1. Moves from one hospital to another

2. Secretive/difficult personal relationships

3. History of mental instability/depression

4. Predicts when someone will die

5. Makes odd comments/claims to be ‘jinxed’

6. Likes to talk about death/odd behaviours when someone dies

7. Higher incidences of death on his/her shift

8. Seems inordinately enthused about his/her skills

9. Makes inconsistent statements when challenged about deaths

10. Prefers nightshifts—fewer colleagues about

11. Associated with incidents at other hospitals

12. Been involved with other criminal activities

13. Makes colleagues anxious/suspicious

14. Craves attention

15. Tries to prevent others checking on his/her patients

16. Hangs around during investigations of deaths

17. In possession of drugs at home/in locker

18. Lied about personal information

19. In possession of books about poison/serial murder

20. Has had disciplinary problems

21. Appears to have a personality disorder

22. Has a substance abuse problem

I emailed Prof. David Wilson in order to start an academic discussion with him about his research methodology. He said that I could not criticise the paper because it had been published in a peer-reviewed journal. And now he has blocked my emails. I have had the same experience when attempting to start a conversation with earlier researchers in this field. It seems that criminology is a field of science with impeccable research methodology and therefore above any criticism from other scientists.

Inside the Mind of HCSK Professionals

De Ene Zijn Dood is de Ander Zijn Brood, or One Man’s Death is Another Man’s Living. The public has a great fascination with murder, carried out by people whom we trust in situations in which we are helpless. Here are two books which capitalize on this: Inside the Minds of Health-Care Serial Killers by Katherine Ramsland; and Engelen des doods (Angels of Death), by Paula Lampe.

Justice in the Netherlands: Guilty until proven Innocent

Kevin Sweeney is serving a life sentence for the murder of his wife by arson.

Here is a link to his own site http://www.justiceforkevinsweeney.com. You can read the conclusion of the court under LJN number AB0493 at www.rechtspraak.nl. Investigative reporter Peter R. de Vries reports on the case in his dossier Suzanne Davis. Curiously, Peter de Vries admits that there is absolutely no proof of arson and murder yet he appears to have a lot of sympathy for the victim of the fire, and none for the person who is accused of starting it. He has no interest now in supporting Sweeney’s claim that he is innocent and the fire was an accident (probably started by smoking in bed).

Negative publicity about Sweeney, demonstrably untrue and much of it spread by certain close relatives of the victim, has poisoned a lot of the news items on his case; see for instance the story on Sweeney at www.expatica.com, a web-site for Brits in the Netherlands. However, independent lawyers and scientists who have become involved in his case are convinced that the fire was an accident; see for instance Fair Trials International's dossier on the case http://www.fairtrials.net/index.php/cases/spotlight/kevin_sweeney.

On studying the scientific evidence which secured the conviction (the results of experiments by TNO on reproducing the fire damage, and the pathology evidence concerning the cause and time of death), I can only conclude that the prosecution's story is totally in contradiction with all the known facts, while that of the defence is totally in agreement with them.

It seems that Sweeney's charming personality, high intelligence (his IQ of 144 is one of the legally established facts supporting his conviction) and spirited and elaborate defence of himself convinced the Dutch judges that he was evil and manipulative. The fact that the fire started at about the same time Sweeney arrived in Brussels, a hundred kilometres away, only confirmed these judgements. The TNO experiments showed that the fire could NOT have been caused by a naked flame applied to 8 litres of fuel, and did nothing to disprove that it was caused by a burning cigarette fallen on bed-linen or whatever. Police investigators stated that the idea that a fire could be caused by smoking in bed "belongs in the realm of fables" (a turn of phrase much admired by the judge, who quotes it in the summing-up), whereas this is one of the most common types of fire, and the most common cause of fire deaths, the world over. Statistical evidence that such fires also occur in the Netherlands was not admitted by the judge, who preferred to believe the word of a policeman out to secure a conviction. The TNO experiments were so complex and expensive that they had to incriminate the suspect, whatever the outcome, despite being spectacularly ill-designed and inconclusive.

I notice the following similarities with the case of Lucia de Berk: the suspect is intelligent, a strong personality, a sympathetic but not run-of-the-mill person, with an unusual (complex) background and personal history involving much time spent abroad; the suspect never stops asserting their innocence; the crimes are so perfect that they are actually impossible; the case involves a huge amount of complex multidisciplinary scientific evidence. Scientists from any particular field knew that the evidence of their colleagues in their own field was worthless, but did not find it necessary to protest, since everyone knew that the suspect was a bad person who probably did kill his wife. The police suppressed evidence supporting the defence case, manipulated forensic evidence, and got witnesses to change their statements and to lie. The prosecution spread slander and gossip about the suspect and painted a beautiful picture of the victim, which was greedily repeated by the media, known to the scientific experts giving evidence in the case, and embellished by the judges in their conclusions.

At the first hearing the case collapsed; the prosecution appealed and spent three years on the TNO fire experiments. Fully documented support for all the statements I have just made was available to the Appeal Court (which found Sweeney guilty), to the Supreme Court, and to the European Court of Human Rights. All these courts ignored it totally. Justice by gossip, with science as a willing accessory.

Here is a report I’m writing for Sweeney’s lawyers.



Little Barber Must Be Hanged

"Barbertje moet hangen" (Little Barber must be hanged). This will require some explanation, and not just for the non-Dutch reader. Yesterday I asked a class if anyone had read Max Havelaar by Multatuli. Seven had not; one had read about half, and had thrown the book away in disgust because it was all just the same as ever. Those who forget their history are condemned to repeat it. It seems the Dutch have forgotten their literature, and hence their history, and are repeating it. I must write more on this elsewhere and add the good links. In the meantime, try Googling some of the unknown words and phrases here.


Lies, damned lies and amateur statistics

Lies, damned lies and amateur statistics, by Piet Groeneboom.
Piet's blog post actually has a different title, but the same theme. Here is my take on the matter.
It has been known for a long time that "careless statistics costs lives", and I am referring specifically to bad statistics in medical research. Fifteen years ago about 90% of the statistics in medical journals was wrong; things have improved, and now it is only 50%, consistently across journals and across sub-fields. The most frequent error is the misunderstanding of p-values, and a common recommendation is to have them banned. All this does literally cost lives: the good treatments are not discovered, and time and resources … and hence lives … are lost following up "spurious correlations" (often discovered during fishing expeditions and/or using inappropriate statistical methods). Sally Clark is another example of a life lost to amateur statistics (the amateur statistics of an arrogant and self-satisfied medical specialist who transferred his "scientific conclusions" into legal brains with ease). For a good laugh (but perhaps the laugh of a farmer with toothache, as we say here in the polder), enjoy Peter Donnelly's TED lecture. One of the many scientific papers carefully analysing the abuse of statistics remarked how strange it is that we insist on getting brain surgery from a professional brain surgeon, but are happy to have our statistics done by an amateur. Well, people who rely on amateur statistics, or worse still, are proud of their own, ought to go and see a brain surgeon (for my very special friends: sshhh, I know this great Polish plumber …).


Sticky Balls

Today (26 May) the new notion of sticky-balls statistics (kleverige knikkers, in Dutch) was born. Well, the notion already existed: overdispersion due to the confounding factor time. But the new name should make it easier to get the idea across to sharp legal minds. The prosecution in the Lucia de Berk case only knew about the statistics of pulling nice new shiny balls out of a vase. The latest research shows that the balls were sticky. (Actually, the nice hospital administrators and policemen had slightly sticky fingers too, but that is another issue.) It was chance all right. Just a little bit of bad luck. Bad statistics. That was very bad luck. Doing Tarot cards was not a good idea either (especially in combination with keeping an odd diary and having overdue Stephen King books from the library on your bookshelf). Illustrations: the Chinese delicacy "sticky balls", eaten at the Spring Festival and aptly symbolizing "reunion": from Chengdu (the white ones) and Suzhou (green).
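A toy simulation (with entirely invented numbers) shows the sticky-balls effect: if the incident rate drifts over time, counts per shift are more variable than the shiny-balls-from-a-vase (Poisson/urn) model predicts, so apparently extreme clusters can still be plain chance:

```python
import random
from math import exp

random.seed(0)

# Knuth's method for sampling a Poisson count, adequate for small rates.
def poisson(lam):
    L, k, p = exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

# Invented rates: "bad" periods of 50 shifts with rate 2.0 alternate with
# quiet periods with rate 0.2 -- incidents are sticky, they cluster in time.
rates = [2.0 if (t // 50) % 2 == 0 else 0.2 for t in range(1000)]
counts = [poisson(r) for r in rates]

mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
print(mean, var)  # variance well above the mean: overdispersion
```

A plain Poisson (urn) model would give variance equal to the mean; here the time-varying rate inflates the variance, so a naive calculation badly understates how likely an innocent nurse is to see a "suspicious" cluster on her shifts.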
