Friday, April 12, 2024

Technology Innovation Institute Open-Sourced Falcon LLMs Falcon-40B and Falcon-7B

On the OpenLLM Leaderboard, these models outperform LLaMA, StableLM, RedPajama, and MPT.

Technology Innovation Institute (TII) created Falcon-40B, a powerful decoder-only model trained on 1,000B tokens drawn from RefinedWeb and curated corpora. The model is publicly available under the TII Falcon LLM Licence. On the OpenLLM Leaderboard, this open-source model outperforms models like LLaMA, StableLM, RedPajama, and MPT.

One of Falcon-40B's noteworthy features is its inference-optimized design. It uses multi-query attention, first described by Shazeer et al. in 2019, and FlashAttention, introduced by Dao et al. in 2022. These architectural improvements make the model particularly fast and efficient at inference.
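To give an intuition for why multi-query attention speeds up inference: all query heads share a single key/value head, so the key/value projections (and the KV cache at generation time) shrink by a factor of the head count. The following is a minimal NumPy toy sketch of the idea, not Falcon's actual implementation; the dimensions and weight names here are made up for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_query_attention(x, Wq, Wk, Wv, n_heads):
    """Toy multi-query attention: n_heads query heads share ONE K/V head."""
    seq, d_model = x.shape
    d_head = d_model // n_heads
    Q = (x @ Wq).reshape(seq, n_heads, d_head)  # per-head queries
    K = x @ Wk                                  # single shared key head: (seq, d_head)
    V = x @ Wv                                  # single shared value head: (seq, d_head)
    outs = []
    for h in range(n_heads):
        scores = (Q[:, h, :] @ K.T) / np.sqrt(d_head)
        outs.append(softmax(scores) @ V)        # every head attends over the same K, V
    return np.concatenate(outs, axis=-1)        # (seq, d_model)

rng = np.random.default_rng(0)
seq, d_model, n_heads = 4, 16, 4
d_head = d_model // n_heads
x  = rng.standard_normal((seq, d_model))
Wq = rng.standard_normal((d_model, d_model))
Wk = rng.standard_normal((d_model, d_head))     # K/V projections are n_heads x smaller
Wv = rng.standard_normal((d_model, d_head))
out = multi_query_attention(x, Wq, Wk, Wv, n_heads)
print(out.shape)  # (4, 16)
```

In standard multi-head attention, `Wk` and `Wv` would each be `(d_model, d_model)`; sharing one K/V head cuts the KV cache that must be kept around during autoregressive decoding, which is where the inference speedup comes from.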

The second model is Falcon-7B, a causal decoder-only model also created by the Technology Innovation Institute (TII). With 7B parameters, it was trained on a sizable dataset of 1,500B tokens obtained from RefinedWeb and further augmented with curated corpora. It, too, is released under the TII Falcon LLM Licence.


One of the main reasons to choose Falcon-7B is its superior performance relative to comparable open-source models such as MPT-7B, StableLM, and RedPajama. As the OpenLLM Leaderboard shows, its stronger capabilities stem from extensive training on the enriched RefinedWeb dataset.

Like Falcon-40B, Falcon-7B features an architecture designed specifically for inference: it incorporates multi-query attention, introduced by Shazeer et al. in 2019, and FlashAttention by Dao et al. These architectural improvements make the model more effective and efficient during inference.


Sahil Pawar
I am a graduate with a bachelor's degree in statistics, mathematics, and physics. I have been working as a content writer for almost 3 years and have written for a plethora of domains. Besides, I have a vested interest in fashion and music.

