The Turing Test isn’t really intended to identify a computer – Turing’s problem wasn’t that we needed a way to identify computers.
At the time – well, and to some extent today – some people firmly believed that a computer could not actually think, that thinking is something “special” that only humans can do.
It’s intended to support Turing’s argument for a behavioral approach to thinking – that if a computer can behave indistinguishably from a human that we agree thinks, then that should be the bar for what we talk about when talking about thinking.
There have been people since who have aimed to actually build such a chatbot, but for Turing, this was just a hypothetical to support his argument.
https://en.wikipedia.org/wiki/Turing_test
The test was introduced by Turing in his 1950 paper “Computing Machinery and Intelligence” while working at the University of Manchester.[5] It opens with the words: “I propose to consider the question, ‘Can machines think?’” Because “thinking” is difficult to define, Turing chooses to “replace the question by another, which is closely related to it and is expressed in relatively unambiguous words.”[6]
Turing did not intend for his idea to be used to test the intelligence of programs—he wanted to provide a clear and understandable example to aid in the discussion of the philosophy of artificial intelligence.[82] John McCarthy argues that we should not be surprised that a philosophical idea turns out to be useless for practical applications. He observes that the philosophy of AI is “unlikely to have any more effect on the practice of AI research than philosophy of science generally has on the practice of science.”[83][84]
The problem with the Turing test and current AI is that we didn’t teach computers to think, we taught them to talk.
There is, however, still the concept of the Chinese Room thought experiment, and I don’t think AI will topple that one for a while.
For those who don’t know and don’t wish to browse off the site, the thought experiment posits a situation in which a guy who does not understand Chinese is sat in a room and told to respond to sets of Chinese characters that come into the room. He has a little booklet of responses—all completely in Chinese—for him to use to send responses out of the room. The thought experiment asks whether the man himself, or even the system of the Chinese Room as a whole, can be said to understand Chinese.
With the Turing Test getting all of the media spotlight in AI, machine learning, and cognitive science, I think the Chinese Room should enter the conversation as the field of AI looks towards AGI.
The Chinese Room has already been surpassed by LLMs, which have been shown to contain neurons whose activations correlate so strongly with abstract concepts like “formal text” or “positive sentiment” that tweaking them is one of the options some LLM-based chatbots present to the user.
Analyzing the activation space, it’s also been shown that LLMs cluster sequences of text representing similar concepts close to each other, which lets them give reasonably accurate zero-shot responses to prompts that were never in the training set (that “weren’t in the book” for the Chinese Room).
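As a toy illustration of that clustering idea, here’s a sketch of how “similar text ends up close together” can be measured. To be clear, this is not real LLM activations (those are learned representations); it just uses hand-rolled bag-of-words vectors and cosine similarity, and the example sentences are made up:

```python
import math
from collections import Counter

def vectorize(text, vocab):
    """Map a sentence to a word-count vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(a, b):
    """Cosine similarity: 1.0 for parallel vectors, 0.0 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

texts = [
    "the cat sat on the mat",           # two sentences about cats...
    "a cat lay on a mat",
    "stock prices fell sharply today",  # ...and one about something else
]
vocab = sorted({w for t in texts for w in t.lower().split()})
vecs = [vectorize(t, vocab) for t in texts]

# The cat sentences share words, so they land closer to each other
# than either does to the finance sentence.
print(cosine(vecs[0], vecs[1]))  # > 0
print(cosine(vecs[0], vecs[2]))  # 0.0, no shared words
```

An LLM’s activation space does something far richer, since the “closeness” is learned rather than counted, but the geometric picture (similar meaning, nearby vectors) is the same one.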
I don’t understand what you mean by “The Chinese Room has already been surpassed by LLMs”. It’s not a test that can be surpassed. It’s just a thought experiment.
In any case, you do bring up a good point. Perhaps this understanding is in the organization of the information. If you have a Chinese Room where all the query-response pairs are in arbitrary order, then maybe you wouldn’t consider that to be understanding. But if the data is organized so that similar queries and responses sit close to each other, and the person in the room can make a mistake, say, accidentally copying out the response next to the correct one, and still produce something that makes sense, then maybe we can consider that system to have better understanding.
It’s not making the Turing test obsolete. It was obvious from day one that the Turing test is not an intelligence test. You could simply create a sufficiently big dictionary of “if the human says X, respond with Y” and it would fool anyone into thinking it’s talking with a human, with zero intelligence behind it. The Turing test was always about checking how good a program is at chatting. If you want to test something else, you have to come up with another test. If you want to test chatbots, you will still use the Turing test.
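That “big dictionary with zero intelligence behind it” bot is trivially easy to sketch. The canned replies here are invented for illustration:

```python
# A "zero intelligence" chatbot: exact-match lookup plus a deflection
# fallback, the giant-dictionary idea scaled down to three entries.
CANNED = {
    "hello": "Hey! How's it going?",
    "how are you": "Pretty good, you?",
    "what is your name": "I'm Sam. You?",
}

def dictionary_bot(message: str) -> str:
    # Normalize just enough to hit the keys; everything else gets a
    # deflection, which is how early chatbots like ELIZA kept talking.
    key = message.lower().strip(" ?!.")
    return CANNED.get(key, "Interesting, tell me more.")

print(dictionary_bot("Hello!"))              # canned reply
print(dictionary_bot("Why is turtle blue"))  # deflection
```

Scale the dictionary up far enough and it chats plausibly for a while, which is exactly why passing a short chat-based test says little about intelligence.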
Voight-Kampff test maybe?
Imagine someone asked you “If Desk plus Love equals Fruit, why is turtle blue?”
AI will actually TRY to solve it.
Human nature would be to ask if the person asking the question is having a stroke or requires medical attention.

So, I asked this to the three different conversation styles of Bing Chat.
The Precise style actually tried to solve it, came to the conclusion the question might be of philosophical nature, including some potential meanings, and asked for clarification.
The Balanced style told me basically the same as the other reply by admiralteal, that the question makes no sense and I should give more context if I actually want it answered.
The Creative style told me it didn’t understand the first part, but then answered the second part (the turtles being blue) seriously.
Would it be safe to say that all 3 answers would fail the test?
Not sure, I’m not familiar with the test, just figured I’d tell the results from asking the AI.
I think based on what you said about it
AI will actually TRY to solve it.
Human nature would be to ask if the person asking the question is having a stroke or requires medical attention.

That the Balanced style didn’t fail, because while it didn’t ask about strokes or medical attention, it did point out I’m asking a nonsense question and refused to engage with it.
The Precise style did try to find an answer and the Creative style didn’t realize I’m fucking with it, so I do think based on the criteria they’d fail the test.
Though, honestly, I’d fail the test too. When asked such a question, I’d think there has to be an answer and it’s stupid of me not to see it and I’d look for it. I think the Precise style’s answer is very much where I’d end up.
Nope, ChatGPT tells you it is a non sequitur and asks for more context or intention if the question is sincere.
You’re saying the test would work.
In 43+ years on this planet I’ve never HEARD someone seriously use “non sequitur” properly in a sentence.
Asking if the intention is sincere would be another flag given the circumstances (knowing they were being tested).

Toss in a couple real questions like: “What is the 42nd digit of pi?”, “What is the square root of -i?”, and you’d find the AI pretty quick.
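For what it’s worth, both of those “real questions” have checkable answers, so you could score the replies yourself. A sketch in plain Python (the pi digits come from Gibbons’ streaming spigot algorithm; whether “42nd digit” means the 42nd decimal place or counts the leading 3 is up to the asker):

```python
import cmath

def pi_digits(count):
    """First `count` digits of pi as a list: [3, 1, 4, 1, 5, 9, ...].
    Gibbons' unbounded spigot algorithm, exact integer arithmetic."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    out = []
    while len(out) < count:
        if 4 * q + r - t < n * t:
            # The next digit is pinned down; emit it and rescale.
            out.append(n)
            q, r, n = 10 * q, 10 * (r - n * t), 10 * (3 * q + r) // t - 10 * n
        else:
            # Consume another term of the series to narrow the interval.
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)
    return out

digits = pi_digits(43)
print(digits[42])  # 42nd decimal place of pi (digits[0] is the leading 3) -> 9

# Principal square root of -i: e^(-i*pi/4) = (1 - i)/sqrt(2)
print(cmath.sqrt(-1j))  # approx 0.7071 - 0.7071j
```

A human without a reference would have to shrug at the pi question, which is rather the point: instant, confident recall of the 42nd decimal place is itself a tell.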
Cool.
Both the phrases you’re calling out as clearly AI came from me. Not used by ChatGPT, just how I summarized its response. I wonder if this is the first time someone has brazenly accused me of being an AI bot?
LoL, no, I took you at your word, which was my mistake.
“ChatGPT tells you” read to me like you attempted and got that response.
It’s just too scary to acknowledge. Same thing with aliens. They’re both horrifying literally beyond imagination, and both for the same reason, and so it’s more natural to avoid acknowledging it.
Everything we’ve ever known is a house of cards and it’s terrifying to bring that to awareness.
The point of logic is to carry you when your emotions try to stop you from thinking.
Yes, AI is scary. No, that doesn’t mean we get to throw out our definition of AI in order to avoid recognizing its presence.
I’m reminded of the apocryphal Gandhi quote “first they ignore you, then they laugh at you, then they fight you, then you win.” It seems like the general zeitgeist is in between the laugh/fight stages for AI right now.