What to know
- Mistral AI has launched a new multimodal AI model called Pixtral Large, featuring 124 billion parameters.
- It performs well in benchmarks such as MathVista, DocVQA, and ChartQA, surpassing several leading models.
- Pixtral Large supports multilingual optical character recognition (OCR) and can analyze documents, charts, and images effectively.
- The Le Chat platform has been upgraded with several new features, including web search capabilities with citations, and a Canvas tool for content editing.
Mistral AI has introduced significant updates in the AI field with its latest offerings. The company launched Pixtral Large, a multimodal AI model that includes a 124 billion parameter multimodal decoder and a 1 billion parameter vision encoder, allowing for the processing of both text and images. The model has a context window of 128,000 tokens, enabling it to handle up to 30 high-resolution images or approximately a 300-page document in a single input.
Pixtral Large performs well in various benchmarks, achieving notable scores in MathVista for mathematical reasoning, DocVQA for document question answering, and ChartQA for chart analysis. It surpasses several leading models, including GPT-4o and Gemini-1.5 Pro. The model is capable of understanding and analyzing documents, charts, and natural images. It also supports multilingual optical character recognition (OCR), which enhances its functionality in practical applications.
Pixtral Large can perform tasks such as analyzing receipts, calculating totals, and interpreting complex data visualizations. Its design makes it suitable for environments that require document analysis and image understanding.
The model is available under a custom Mistral AI Research License for academic use and a commercial license for business applications. These features make Pixtral Large a useful tool for organizations looking to utilize AI for data processing tasks.
- Download Pixtral Large from Hugging Face
Mistral also introduced a new version of its flagship text-only model line, Mistral Large. Named Mistral Large 24.11, the updated model offers “significant improvements” in long context understanding, making it ideal for applications such as document analysis and task automation.
Alongside Pixtral Large, Mistral has enhanced its Le Chat platform. This generative AI assistant can now perform web searches, providing users with citations similar to those found in other AI tools.
The new “Canvas” tool allows users to edit and transform content easily, enabling the creation of documents, presentations, and code without needing to regenerate.
Le Chat’s features have expanded further as it can now analyze and summarize complex PDF documents and images. This capability is particularly useful for professionals needing to extract information from extensive materials. Additionally, Le Chat includes advanced image generation through a partnership with Black Forest Labs, allowing users to create visuals directly within the platform.
To improve efficiency, Mistral has introduced “agents” that automate repetitive tasks like expense reporting and invoice processing. These features make Le Chat a strong alternative to existing AI productivity tools, especially for students and professionals looking for effective solutions. All these enhancements are available for free during the beta phase, allowing users to explore Mistral’s offerings while the company continues to refine its services.
Discussion