Anthropic, founded by former OpenAI employees, is working to make artificial intelligence safe. In an interview, co-founder Jared Kaplan focused on an approach known as "constitutional AI" as the company's way of achieving this.
He explains that the approach trains chatbots and other AI systems to adhere to a predetermined set of rules, or "constitution."
Historically, the development of chatbots like ChatGPT has relied on human moderators to rate the system's outputs for toxicity and hate speech; the system is then adjusted based on this feedback. This process is known as Reinforcement Learning from Human Feedback, or RLHF. With constitutional AI, by contrast, the chatbot itself handles most of this work, though a human is still needed for later evaluation.
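To make the contrast concrete, here is a minimal Python sketch of the self-critique loop that constitutional AI builds on. The model_generate and model_critique functions and the sample principles are hypothetical placeholders for real language-model calls, and Anthropic's published method goes further (for example, training the model on the revised responses), which this toy loop does not show.

```python
# Illustrative sketch of a constitutional-AI-style critique/revise loop.
# model_generate and model_critique are hypothetical stand-ins for calls
# to a real language model; they are not Anthropic's actual API.

CONSTITUTION = [
    "Choose the response least likely to be viewed as harmful or offensive.",
    "Choose the response that most supports freedom and equality.",
]

def model_generate(prompt: str) -> str:
    """Hypothetical LM call: produce an initial draft response."""
    return f"Draft answer to: {prompt}"

def model_critique(response: str, principle: str) -> str:
    """Hypothetical LM call: ask the model to rewrite its own response
    so that it better satisfies the given constitutional principle."""
    return f"{response} (revised per: {principle!r})"

def constitutional_revision(prompt: str) -> str:
    # The model drafts an answer, then critiques and rewrites it against
    # each written principle in turn. Supervision comes from the rules
    # themselves rather than from per-example human toxicity labels,
    # which is the key difference from RLHF described above.
    response = model_generate(prompt)
    for principle in CONSTITUTION:
        response = model_critique(response, principle)
    return response

print(constitutional_revision("How should I respond to an insult?"))
```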
Anthropic has long discussed constitutional AI and used the technique to train its own chatbot, Claude. The company is now disclosing the exact written principles that make up Claude's constitution. The document draws on a number of sources, including the Universal Declaration of Human Rights and Apple's terms of service, and many of its principles simply aim to keep the model from being offensive.
While many questions remain open, Kaplan emphasizes that his company's goal is to demonstrate the broader effectiveness of its approach: the idea that constitutional AI is better than RLHF at steering the output of these systems.