Meta says its LLaMA language model is more promising than OpenAI’s GPT-3

After OpenAI launched ChatGPT, a large language model, competitors caught up: Google with Bard, Microsoft with the new Bing. Now Meta has also launched a model of its own, LLaMA. Currently, the model is available only to researchers.

LLaMA, short for Large Language Model Meta AI, is smaller than existing models because it is built for research communities that do not have access to large amounts of infrastructure. LLaMA comes in several sizes, ranging from 7 billion to 65 billion parameters.

Meta said that despite being far smaller, with 162 billion fewer parameters, LLaMA-13B outperformed OpenAI’s GPT-3 “in most benchmarks.”

The largest model, LLaMA-65B, is reportedly comparable to models such as DeepMind’s Chinchilla-70B and Google’s PaLM-540B.

LLaMA is a base model: it is trained on large amounts of unlabeled data, which makes it easier for researchers to fine-tune it for specific tasks. Because the model is smaller, it is also cheaper to retrain for new use cases.

LLaMA is not built using English text alone: Meta trained the model on 20 languages that use either Latin or Cyrillic scripts. However, most of the training data is in English, so the model performs best in that language.

Meta researchers claim that access to current large language models is limited due to the size of the models.

Meta argues that “this restricted access limits researchers’ ability to understand how and why these large language models work, hindering efforts to improve their robustness and address known issues, such as bias, toxicity, and the potential for misinformation.”

In addition to making the model smaller, Meta is trying to make LLaMA more accessible, including by releasing it under a non-commercial license.

Access to the various LLaMA models will be granted only on a case-by-case basis to researchers, such as those affiliated with governments, civil society organizations, and academia.

Like ChatGPT and other language models, LLaMA can generate biased or inaccurate information. Meta’s LLaMA announcement acknowledges this and says that by sharing the model, researchers can “more easily test new methods to limit or eliminate these problems in large language models.”

Meta previously released a large language model for researchers, OPT-175B, in May last year. Late last year it released another model, Galactica, which was taken down within 48 hours after it was found to regularly share biased or inaccurate information.
