Meta To Launch Chameleon Multi-modal LLM

Meta Platforms (META) has unveiled its latest advancement in artificial intelligence: a multi-modal large language model named Chameleon.

According to the company’s research paper, Chameleon is designed to undertake a wide array of tasks that previously required multiple specialized models, and it integrates information across modalities more effectively than its predecessors.

Chameleon employs an 'early-fusion token-based mixed-modal' architecture: every input, whether image, text, or code, is converted into tokens in a single shared vocabulary, allowing the model to learn from and generate sequences that mix all three.

"The unified token space of Chameleon allows it to seamlessly reason over and generate interleaved image and text sequences, eliminating the need for modality-specific components," states the research paper.

The model’s training involves two stages and uses a dataset of roughly 4.4 trillion tokens drawn from text, image-text pairs, and interleaved text and image sequences. Two versions of Chameleon have been trained, one with 7 billion parameters and another with 34 billion, consuming a combined total of more than 5 million GPU hours on Nvidia A100 80GB GPUs.

In the competitive landscape, OpenAI recently launched GPT-4o, while Microsoft (MSFT) was reported a few weeks earlier to be developing its own large model, MAI-1.
