On October 22nd, 2024, Anthropic announced an enhanced version of Claude 3.5 Sonnet that can deliver better results than its predecessor. It also announced the release of a new model, Claude 3.5 Haiku, that matches the performance of Claude 3 Opus.
In this new Claude 3.5 Sonnet release, Anthropic offers users the capability to direct Claude to interact directly with the computer. It allows the automation of cursor movement, clicking buttons, and typing text.
Although still in the beta version, the computer use model efficiently performs functions like filling online forms by interacting with the spreadsheet on your computer. Anthropic has released this version so that users can interact with the model and provide feedback to improve performance.
Read More: LambdaTest Launches KaneAI for End-to-End Software Testing
The upgraded Sonnet 3.5 model has already shown significant improvements in various tasks and software benchmarks. Responses for SWE-bench Verified improved from 33.4% to 49.0%. On the other hand, TAU-bench results increased from 62.6% to 69.2% in retail and 36.0% to 46% in the advanced airline domain.
Early feedback from developers outlined a significant leap in the performance of AI-powered coding, with a 10% enhancement in logical reasoning.
With these improvements, the new model is being used by multiple renowned names, including Cognition, which uses the model for autonomous AI evaluations. The browse company is using this model to automate web-based AI workflows.
To ensure data safety, Anthropic partnered with the US AI Safety Institute (US AISI) and the UK Safety Institute (UK AISI). The US AISI and UK AISI conducted the pre-deployment testing of the new Claude 3.5 Sonnet model.
In hindsight, Claude 3.5 Haiku marks the release of the fastest model in the Anthropic ecosystem. For the same cost and slightly better speed than Claude 3, the 3.5 Haiku model offers enhanced results in almost every skill set.
With both models already available for use, Anthropic aims to revolutionize generative artificial intelligence and redefine modern processes by introducing automation.