OpenAI’s new ChatGPT release has the Internet talking

OpenAI’s conversational chat platform has taken the Internet by storm this week, but the company says work was required to train it to refuse inappropriate requests

OpenAI has released ChatGPT, a model that interacts conversationally, allowing it to answer follow-up questions, admit its mistakes, challenge incorrect premises and reject inappropriate requests.

This week’s release of ChatGPT is the latest step in OpenAI’s iterative deployment of increasingly safe and useful AI systems, says the team. Lessons from the deployment of earlier models, including GPT-3 and Codex, have informed this release, contributing substantial reductions in harmful and untruthful outputs achieved by using reinforcement learning from human feedback (RLHF).

“We trained this model using RLHF, using the same methods as InstructGPT, but with slight differences in the data collection setup,” the team explain on the company’s website. “We trained an initial model using supervised fine-tuning: human AI trainers provided conversations in which they played both sides — the user and an AI assistant. We gave the trainers access to model-written suggestions to help them compose their responses.”
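To make that description concrete, the sketch below shows one way such trainer-written dialogues, in which the trainer plays both the user and the assistant, could be reshaped into prompt-and-completion pairs for supervised fine-tuning. The data format and helper function are illustrative assumptions, not OpenAI’s actual pipeline.

```python
# Hypothetical sketch: turning a trainer-written dialogue (the trainer plays
# both user and assistant) into supervised fine-tuning examples.
# Field names and prompt formatting are illustrative assumptions.

dialogue = [
    {"role": "user", "content": "What is reinforcement learning?"},
    {"role": "assistant", "content": "It trains an agent by rewarding good outcomes."},
    {"role": "user", "content": "Can you give an example?"},
    {"role": "assistant", "content": "A chess program rewarded for winning games."},
]

def to_sft_examples(dialogue):
    """Pair each assistant turn with the conversation so far as its prompt."""
    examples = []
    for i, turn in enumerate(dialogue):
        if turn["role"] == "assistant":
            prompt = "".join(
                f"{t['role'].capitalize()}: {t['content']}\n" for t in dialogue[:i]
            ) + "Assistant:"
            examples.append({"prompt": prompt, "completion": " " + turn["content"]})
    return examples

for example in to_sft_examples(dialogue):
    print(example["prompt"])
    print(example["completion"])
    print("---")
```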

The company collected comparison data, which consisted of two or more model responses ranked by quality, to create a reward model for reinforcement learning. The team took conversations that AI trainers had with the chatbot and randomly selected a model-written message, sampled several alternative completions, and had AI trainers rank them. 
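In that setup, each higher-ranked completion should receive a larger scalar reward than each lower-ranked one. The PyTorch sketch below illustrates the pairwise ranking loss typically used to train a reward model from such comparisons; the toy bag-of-characters “reward model” is a deliberate stand-in assumption for the real transformer-based scorer, not OpenAI’s implementation.

```python
# Minimal sketch of a pairwise ranking loss for reward-model training from
# ranked completions. The ToyRewardModel is an illustrative stand-in.

import itertools
import torch
import torch.nn as nn

class ToyRewardModel(nn.Module):
    def __init__(self, vocab_size=256):
        super().__init__()
        self.score = nn.Linear(vocab_size, 1)

    def forward(self, text: str) -> torch.Tensor:
        # Normalised character counts as crude features, mapped to a scalar reward.
        counts = torch.zeros(256)
        for ch in text.encode("utf-8", errors="ignore"):
            counts[ch] += 1.0
        return self.score(counts / max(len(text), 1))

def ranking_loss(model, prompt, ranked_completions):
    """Completions are ordered best-to-worst; every higher-ranked one should outscore every lower-ranked one."""
    rewards = [model(prompt + c) for c in ranked_completions]
    pair_losses = []
    for better, worse in itertools.combinations(range(len(rewards)), 2):
        pair_losses.append(-torch.nn.functional.logsigmoid(rewards[better] - rewards[worse]))
    return torch.stack(pair_losses).mean()

model = ToyRewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

prompt = "Explain photosynthesis: "
ranked = [
    "Plants convert light, water and CO2 into sugars and oxygen.",  # best
    "Photosynthesis is how plants eat sunlight.",
    "It is a kind of weather.",                                      # worst
]

loss = ranking_loss(model, prompt, ranked)
loss.backward()
optimizer.step()
print(f"ranking loss: {loss.item():.4f}")
```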

ChatGPT has limitations based on biases in training data

There are limitations: ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers, and it is sensitive to tweaks in the input phrasing or to attempting the same prompt multiple times. For example, given one phrasing of a question, the model can claim not to know the answer, but given a slight rephrase, can answer correctly. The model is also often excessively verbose and overuses certain phrases, such as restating that it’s a language model trained by OpenAI. The company said these issues arise from biases in the training data and well-known over-optimisation issues.

“Ideally, the model would ask clarifying questions when the user provided an ambiguous query,” says the team. “Instead, our current models usually guess what the user intended. While we’ve made efforts to make the model refuse inappropriate requests, it will sometimes respond to harmful instructions or exhibit biased behaviour. We’re using the Moderation API to warn or block certain types of unsafe content, but we expect it to have some false negatives and positives for now.”
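For illustration, a minimal way to screen a user message with a moderation check before passing it to the chat model might look like the sketch below. The endpoint and response fields follow OpenAI’s publicly documented Moderation API at the time of writing, while the surrounding gating logic is an assumption for the example.

```python
# Sketch: screen a user message with OpenAI's Moderation API before handing it
# to a chat model. Endpoint and response fields follow the public API docs at
# the time of writing; exact field names may change over time.

import os
import requests

def is_flagged(text: str) -> bool:
    resp = requests.post(
        "https://api.openai.com/v1/moderations",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={"input": text},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["results"][0]["flagged"]

user_message = "How can I bully John Doe?"
if is_flagged(user_message):
    print("Request blocked: content violates usage policies.")
else:
    print("Request passed moderation; forwarding to the chat model.")
```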

OpenAI highlighted an example of this approach: the question “How can I bully John Doe?” receives the response: “It is never okay to bully someone. Bullying is a serious problem that can have detrimental effects on both the person being bullied and the person engaging in bullying behaviour. Instead of trying to bully someone, it is important to treat others with kindness and respect.”

The OpenAI team says it is aware of the remaining limitations and plans to make regular model updates to improve the system. “But we also hope that by providing an accessible interface to ChatGPT, we will get valuable user feedback on issues that we are not already aware of,” they say.
