OpenAI Comes Up With New CriticGPT Model That Finds Errors In ChatGPT’s Replies

OpenAI shook up the tech world with the launch of its popular ChatGPT tool, but the rollout also came with a debate over how accurate its responses really were.

While its mistakes never sank to the level of the howlers Google's AI was producing, such as telling people to use glue to stick toppings on their pizza, ChatGPT was never error-free.

Now, we are hearing more about a new AI model the company is working on, dubbed CriticGPT, which finds errors in ChatGPT's replies.

The tool is being used to assist human trainers of the GPT-4 model so they can find errors in answers more quickly. The goal right now, therefore, is to weed out as many inaccuracies as possible.

While no public launch appears imminent, the model is already in use internally at OpenAI, where it supports methods like Reinforcement Learning from Human Feedback (RLHF).

As it is, reports confirm that AI trainers are having a harder time than usual finding errors in GPT-4's outputs because the models keep improving over time. But with tools like this, the company is confident it can rate responses and distinguish accurate answers from inaccurate ones with greater ease and at a faster pace.

Remember, one of the biggest limitations of the RLHF methodology is that models are becoming more knowledgeable than the human trainers tasked with aligning them, which makes providing useful feedback in those circumstances hard.

The tool just might be what the firm needed to save the day, though its critiques won't be 100% right all of the time either. Moreover, the fact that it could be susceptible to issues like hallucination, which are common across AI, is another challenge worth mentioning.

Models can help humans point out errors better than humans working alone. But again, it's AI, so nothing can be trusted 100%, experts explain.

In the experiments carried out so far, OpenAI says it has seen promising results, and the fact that human trainers prefer critiques produced with this new tool combined with their own input says a lot.

One trial that OpenAI discussed in detail had the tool search for errors that humans had deliberately inserted so they would appear to have occurred naturally. Watching the model figure those out was impressive, as was seeing it catch bugs that trainers had flagged in the past.

The CriticGPT model was trained on short ChatGPT replies, so new methods will need to be developed to help trainers make sense of longer tasks. Meanwhile, hallucinations could have serious consequences, since trainers who come across them might make errors during the labeling process.

For now, the tool is said to have a great command of spotting GPT-4's errors in short responses. At the same time, OpenAI knows that mistakes can be spread throughout longer replies, so the model must be trained on those in the future to tackle this shortcoming.
