In 1950, the brilliant Alan Turing proposed what is commonly referred to today as the Turing Test. Common descriptions present it as follows: a person, acting as an interrogator, submits questions to both a computer and a person, each secluded in a room away from the interrogator. The answers come back to the interrogator only as typed text. If, after a period of time, the interrogator is unable to correctly identify which answers come from the computer and which from the person, the computer has passed the test and is considered as intelligent as a human.
The actual Turing Test as described in Turing’s original article was a bit different from that. His description centered on what he called the Imitation Game, in which a man and a woman were secluded in two separate rooms and the interrogator’s goal was to determine which was which, judging only from their typed answers to queries. In Turing’s version, the man tried to deceive the interrogator about his sex, while the woman tried to help the interrogator get it right. The Turing Test then consisted of substituting a computer for one of the two and determining whether it was equally capable of fooling the interrogator. Turing predicted that by the turn of the century (fifty years later, in 2000), “an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.”
Whatever version of the Turing Test one uses—and there have been many over the years—it has been criticized from many points of view as an inadequate method for determining when computers reach the level of human intelligence. Who would be the “average” interrogator? Who would be the person being interrogated? Too variable and too imprecise. Which of the many definitions of human intelligence is really being tested? What questions would be asked? If the questions were limited to chess moves, IBM’s Deep Blue would have passed the test long ago, yet no one considers it to have human-level intelligence. An excellent review article on the Turing Test, written fifty years later, described the flaws of its many possible versions. It concluded, “We believe that in about fifty years’ time, someone will be writing a paper titled ‘Turing Test: 100 Years Later.’”
Marsden S. Blois was a physician on the faculty of the University of California, San Francisco School of Medicine until his death in 1988. Outside of narrow professional circles, he is not as well known as Alan Turing. He pioneered the field of medical informatics and was a founding member of the American College of Medical Informatics. In chapter 11 on electronic evolution, I discuss his 1980 article in the New England Journal of Medicine entitled “Clinical Judgment and Computers.” It is considered a classic in his field. Below is a figure from this article.
The funnel is meant to show the decreasing cognitive span of a physician during the process of making a diagnosis. At first, represented by point A in the figure, any diagnosis is possible. The physician, drawing on his or her general knowledge of medicine and of the world, and on the ability to interact with other humans by talking with them (taking a history), examining them, and observing their behavior, must narrow the huge number of possibilities down to a reasonably small set called the differential diagnosis. Selecting from that smaller set of choices is represented by point B in the figure. Getting from point B to the correct diagnosis often requires laboratory tests and other diagnostic procedures, along with very specific, detailed knowledge of the narrow disease spectrum involved. Blois’ contention was that physicians are far superior to computers at point A, whereas computers, if programmed with the right rules, are superior to physicians at point B.
The funnel could represent any domain of knowledge, not just medicine. Somehow, the human brain is born with the capability to get to point A just by living in the world: it is common sense. I propose a new test to replace the Turing Test—let’s call it the Blois Test. As artificial intelligence (AI) comes closer to human intelligence, point B in the figure will move to the left, closer to point A. For example, DeepMind’s AlphaGo software (DeepMind is a subsidiary of Google) is at point B with regard to playing the game of Go, but it totally lacks common sense; it is far from point A. I would postulate that IBM’s Watson computer, as demonstrated by its victory on the quiz show Jeopardy!, is closer to point A, but still far from it. When AI finally reaches what is called artificial general intelligence, i.e., intelligence equal to a human’s, point B will be superimposed on point A. It will then pass the Blois Test.
OK, dear readers, fire away with your critique of this suggestion.