Meta Announces "Llama 3.2," Small and Medium-Sized Large Language Models for Edge and Mobile Devices, at Meta Connect 2024
At its annual developer conference, Meta Connect 2024, Meta announced "Llama 3.2," a family of small and medium-sized large language models for edge and mobile devices.
Although Meta released Llama 3.1 just this past July, Llama 3.2 brings a significant update, including the company's first multimodal models.
Llama 3.2's vision-capable models come in two sizes, 11B (11 billion parameters) and 90B (90 billion parameters), both of which support image recognition.
This new capability enables image reasoning use cases such as understanding tables and graphs, generating image captions, and locating objects in images from natural language descriptions (visual grounding).
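As an illustration, the vision models can be loaded through the Hugging Face transformers library. The sketch below assumes a recent transformers release with Mllama support and the meta-llama/Llama-3.2-11B-Vision-Instruct checkpoint; the image file and prompt are placeholders.

```python
# Sketch: image captioning with the 11B vision model via Hugging Face transformers.
# Assumes a transformers release with Mllama support; "chart.png" is a placeholder file.
import torch
from PIL import Image
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("chart.png")  # placeholder image
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe what this chart shows in one sentence."},
    ],
}]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=60)
print(processor.decode(output[0], skip_special_tokens=True))
```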
Mark Zuckerberg personally showcased the new AI capabilities at the event.
A video of Ray-Ban smart glasses running Llama 3.2, which analyzed on-screen ingredients to suggest recipes and offered opinions on clothes displayed on a store rack, drew applause from the audience.
Multilingual Support and Video Creation
Zuckerberg also discussed Meta's experimental AI features. These include live translation between Spanish and English, an app that automatically translates videos into various languages, and avatars that answer fan questions on behalf of content creators.
A voice mode was introduced to compete with ChatGPT's voice mode, and photo editing offers capabilities beyond ChatGPT's image analysis, allowing users to remove objects, add hats, and change backgrounds. The voice translation feature makes it easy to create videos for international audiences.
The lightweight 1B and 3B models excel at multilingual text generation and tool invocation. Because they run in a closed, on-device environment, developers can build applications optimized for individual devices without data ever leaving them, ensuring strong privacy protection. Running locally, the models respond to prompts almost instantaneously.
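As a rough sketch of what working with the lightweight models looks like, the 1B instruct model can be driven through the Hugging Face transformers pipeline; the model ID and the translation prompt below are illustrative, and on phones the model would typically run through an on-device runtime rather than full PyTorch.

```python
# Sketch: multilingual text generation with the lightweight 1B instruct model.
# Model ID and prompt are illustrative examples.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Translate into Spanish: 'The meeting has been moved to 3 pm.'"},
]
outputs = generator(messages, max_new_tokens=64)
print(outputs[0]["generated_text"][-1]["content"])  # the assistant's reply
```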
Apps will have explicit control over which queries are handled on-device and which are processed by larger models in the cloud.
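Meta has not published the routing logic itself; the following is a purely hypothetical sketch of how an app might split queries between an on-device model and a cloud model, with placeholder functions standing in for the actual calls.

```python
# Hypothetical sketch: per-query routing between an on-device model and a cloud model.
# The heuristic, threshold, and both generate functions are placeholders, not a Meta API.
COMPLEX_HINTS = ("analyze", "compare", "summarize this document")

def local_generate(query: str) -> str:
    # Placeholder for an on-device call, e.g. a 1B/3B model via a mobile runtime.
    return f"[on-device answer to: {query}]"

def cloud_generate(query: str) -> str:
    # Placeholder for a cloud call, e.g. an 11B/90B model behind a hosted endpoint.
    return f"[cloud answer to: {query}]"

def needs_cloud(query: str) -> bool:
    # Simple heuristic: very long or analysis-heavy queries go to the larger cloud model.
    return len(query.split()) > 200 or any(h in query.lower() for h in COMPLEX_HINTS)

def answer(query: str) -> str:
    return cloud_generate(query) if needs_cloud(query) else local_generate(query)

print(answer("What time is sunset today?"))           # stays on-device
print(answer("Compare these two quarterly reports"))  # escalates to the cloud
```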
These models have been optimized for use on Qualcomm and MediaTek hardware.
The Llama Stack API serves as an interface that standardizes toolchain components for customizing Llama models and building agent applications. To make the API easier to adopt, Meta has built reference implementations for inference, tool use, and Retrieval-Augmented Generation (RAG). Meta has also developed Llama Stack distributions as a way of packaging multiple APIs together, giving developers a single endpoint. These distributions currently allow Llama models to be deployed across multiple environments, including on-premises, in the cloud, on single nodes, and on devices.
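As a sketch of what that single endpoint looks like from a developer's side, the snippet below uses the llama-stack-client Python package against a locally running distribution; the port, model identifier, and exact parameter names are assumptions and may vary between versions.

```python
# Sketch: chat completion against a local Llama Stack distribution.
# Port, model identifier, and parameter names are assumptions; check the
# llama-stack-client documentation for the version you install.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed local endpoint

response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-3B-Instruct",  # assumed model identifier
    messages=[{"role": "user", "content": "What does Llama Stack provide?"}],
)
print(response.completion_message.content)
```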
Llama 3.2 is available for download from llama.com and Hugging Face.
Reference: Meta announcement
Image: Shutterstock