The rapid pace of advancement in AI has led many to wonder whether we will still be able to tell humans and machines apart in the future. Large language models such as ChatGPT have already begun to adopt human mannerisms, and while it may seem easy today to tell whether the entity on the other end is an AI or a human being, further innovation will only make this harder.
One of OpenAI’s biggest competitors, AI21 Labs, recently ran a social experiment dubbed “Human or Not”, and it revealed that around 32% of participants were unable to distinguish between AI and real human beings. This is arguably the largest Turing Test conducted so far, and its findings underscore the need for clearer ways of marking AI so that people can identify it.
Participants were asked to hold two-minute conversations with either a human or an AI bot powered by LLMs, including GPT-4 as well as AI21 Labs’ own Jurassic-2. When the other party was a human, participants identified them correctly in 73% of instances.
When a bot was on the other end, however, the success rate dropped to 60%, meaning that 40% of respondents failed to realize they were talking to a bot. That does not bode well for a future in which AI can be used to manipulate others.
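These per-condition figures line up with the overall 32% headline number. As a rough check, and assuming conversations were split roughly evenly between human and bot partners (an assumption, since the exact split is not stated here), the overall misidentification rate works out as follows:

```python
# Rough consistency check for the reported figures.
# Assumption: conversations were split roughly 50/50 between human and bot partners.
acc_vs_human = 0.73  # share of correct guesses when the partner was human
acc_vs_bot = 0.60    # share of correct guesses when the partner was a bot

overall_accuracy = 0.5 * acc_vs_human + 0.5 * acc_vs_bot
overall_miss_rate = 1 - overall_accuracy

print(f"Overall accuracy:  {overall_accuracy:.0%}")   # ~66-67%
print(f"Overall miss rate: {overall_miss_rate:.0%}")  # ~33-34%, close to the reported 32%
```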
The researchers behind the experiment also gleaned useful insights into how users try to work out whether they are talking to a bot or a human. Most of these techniques rest on false assumptions, such as the notion that bots never make typos or grammatical errors.
The researchers anticipated these tactics, so they trained their bots to strategically insert typos and other errors of syntax and grammar to make them seem more human. Personal questions were also used fairly frequently, with participants asking the bots about their backgrounds on the assumption that bots would not be able to answer such queries.
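AI21 has not published how its bots produce these errors, but the general idea can be sketched in a few lines. The snippet below is a hypothetical illustration only: it randomly swaps adjacent characters in a small fraction of words so that otherwise polished model output reads more like hurried human typing. The function name and typo rate are assumptions, not details from the study, and the real bots were trained to produce such errors rather than post-processing their output this way.

```python
import random

def add_typos(text, typo_rate=0.05, seed=None):
    """Swap adjacent characters in a small fraction of words to mimic human typos.

    Hypothetical sketch of strategic typo injection; not AI21's actual method.
    """
    rng = random.Random(seed)
    words = text.split()
    for i, word in enumerate(words):
        # Only garble longer words, and only occasionally, so the text stays readable.
        if len(word) > 3 and rng.random() < typo_rate:
            j = rng.randrange(len(word) - 1)  # pick an adjacent character pair
            chars = list(word)
            chars[j], chars[j + 1] = chars[j + 1], chars[j]
            words[i] = "".join(chars)
    return " ".join(words)

print(add_typos("Honestly I have no idea whether you are a bot or not", seed=7))
```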
The bots were also trained on datasets containing a wide range of personal stories, which allowed them to answer such questions in a surprisingly human way. Hence the 32% of participants who were unable to successfully identify the AI during this experiment.
There is a high likelihood that this will factor into the upcoming US elections, since increasingly convincing bots could lead to more misinformation being spread. It will be interesting to see how the US adapts its policy to respond to increasingly human-like AI and other types of bots.
Read next: Risky AI Incidents See 690% Increase