LLM RAG Mistral

v 18.0 Third Party 40

Availability	Odoo Online Odoo.sh On Premise
Odoo Apps Dependencies	Discuss (mail)
Community Apps Dependencies	LLM Knowledge • LLM Tool • Mistral AI LLM Integration • LLM Integration Base • LLM Vector Store Base • OpenAI LLM Integration • LLM Training Management • Web JSON Editor
Lines of code	8690
Technical Name	`llm_knowledge_mistral`
License	LGPL-3
Website	https://github.com/apexive/odoo-llm
Versions	16.0 18.0

Mistral Vision OCR for Knowledge Base

Turn Images & Handwriting
Into Searchable Knowledge

Use Mistral AI's vision models to extract text from images, receipts, handwritten notes, and scanned documents. Make everything searchable in your knowledge base.

Vision AI

OCR Parsing

Auto-Extract

The Image Problem

Your knowledge base can search text documents, but what about images, scanned receipts, handwritten notes, and photos of documents? They're invisible to search.

Without OCR

Images are just binary blobs
Handwritten notes can't be searched
Scanned documents are dead weight
Receipts and invoices unusable
Knowledge stays locked in images

With Mistral OCR

AI extracts text from any image
Handwriting becomes searchable
Scanned docs fully indexed
Receipt data automatically parsed
Everything is findable

What Can It Parse?

Mistral's vision AI handles virtually any image with text

Handwritten Notes

Meeting notes, to-do lists, sticky notes, journal entries - any handwriting style

Scanned Documents

PDFs from scanners, faxes, photocopies, and document images with any layout

Receipts & Invoices

Extract data from receipts, invoices, bills, and financial documents

Screenshots

UI screenshots, error messages, dashboards, charts - extract all visible text

Product Labels

Packaging text, ingredient lists, warning labels, serial numbers

Forms & Tables

Structured data from forms, spreadsheets, tables, and grid layouts

How It Works

Automatic OCR processing powered by Mistral AI vision models

Upload Image

Add an image to your knowledge collection (PNG, JPG, WEBP, or PDF)

Select Mistral OCR

Choose Mistral OCR parser for the resource

AI Extracts Text

Mistral vision model reads and converts to markdown

Now Searchable

Text is chunked, embedded, and ready for AI search

See It In Action

Actual handwritten grocery list parsed by Mistral OCR

Original Image

Extracted Text

- potatoes
- peas & carrots
- pastina
- garbage bags
- dog treats
- aluminum foil
- almond milk
- creamer - vanilla
- eggs (2)
- crushed tomatoes
- hot sauce
- paper towels?

Result: The handwritten list is now fully searchable in your knowledge base. Ask your AI assistant "What items are on the grocery list?" and it will find this document and list all items.

How to Set It Up

Configure Mistral OCR for your knowledge base in minutes

Set Up Mistral OCR Models

The module comes pre-configured with Mistral's OCR models. View them under LLM → Configuration → Models, filtered by "ocr".

Available OCR Models:

mistral-ocr-latest: Latest OCR model (recommended)
mistral-ocr-2505: May 2025 version
mistral-ocr-2503: March 2025 version

Note: Set up the Mistral provider via the llm_mistral module, then click "Fetch Models" to download the available OCR models from Mistral AI.

Use Mistral OCR Parser

When creating or editing a knowledge resource, select "Mistral OCR Parser" and choose your preferred OCR model. Upload images and the parser will automatically extract text.

Parser Configuration:

Parser: Select "Mistral OCR Parser"
Provider: Mistral AI (auto-selected)
OCR Model: Choose mistral-ocr-latest or specific version
Supported formats: Images (PNG, JPG, WEBP), PDFs, scanned documents

Process & Start Searching!

Click "Process Resources" to extract text from your images. Once processed, all text becomes searchable through your AI assistant. Ask questions about the content and get instant answers with source citations.

Processing Pipeline

When you process an image resource:

Mistral OCR Parser sends image to Mistral AI vision model
Vision model analyzes image and extracts all text content
Extracted text is saved to the resource's content field
Text is chunked using your collection's chunker settings
Chunks are embedded and stored in your vector database
AI can now search and cite this content in responses
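
The chunking step in the list above can be pictured with a small sketch. The module hands the extracted text to whatever chunker the collection is configured with, so the function below is only an illustration: the name `chunk_text`, its `max_chars` and `overlap` parameters, and the fixed-size character windows are all assumptions, not the module's API.

```python
def chunk_text(text: str, max_chars: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    chunks = []
    step = max_chars - overlap  # how far each window advances
    for start in range(0, len(text), step):
        piece = text[start:start + max_chars].strip()
        if piece:
            chunks.append(piece)
        if start + max_chars >= len(text):
            break  # the current window already reached the end of the text
    return chunks
```

The overlap means a sentence cut at a window boundary still appears whole in one of the two neighbouring chunks, which tends to help retrieval on short OCR output such as receipts.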

Real-World Use Cases

How teams use Mistral OCR to unlock knowledge from images

Expense Management

Upload receipt photos, extract vendor, amount, date, and items automatically for searchable expense records

"Find all Starbucks receipts from last month"

AI: Found 8 receipts totaling $127.50 (Sources: receipt_001.jpg through receipt_008.jpg)

Meeting Notes Archive

Scan handwritten meeting notes and make every decision, action item, and idea searchable

"What did we decide about the Q4 budget?"

AI: "Approved $50K increase for marketing" (Source: Meeting Notes Oct 15, 2024)

Legacy Document Digitization

Convert old scanned contracts, faxes, and archived paperwork into searchable digital knowledge

"Find the 2015 lease agreement terms"

AI: "5-year lease at $2500/month" (Source: Scanned Lease Agreement 2015)

Product Catalog

Extract product specs, ingredients, and details from packaging photos to build searchable catalogs

"Which products contain wheat?"

AI: Found 12 products with wheat in ingredients (Sources: Product labels from catalog)

Powerful Features

Everything you need for vision-based knowledge extraction

Multi-Language Support

Extract text in multiple languages including English, French, Spanish, and more

Markdown Output

Extracted text formatted as clean markdown for better chunking and retrieval

Table Extraction

Preserves table structure and relationships when extracting from forms and spreadsheets

Multi-Page Processing

Handles multi-page PDFs and image sets with proper page organization

Image Attachment Handling

Embedded images are saved as Odoo attachments with proper references

Seamless Integration

Works with existing llm_knowledge collections - just select the Mistral OCR parser
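
As a rough illustration of the Markdown Output and Multi-Page Processing features, the sketch below merges per-page OCR results into one markdown document. It assumes each page arrives as a dict with a zero-based `index` and a `markdown` string, which is how Mistral's OCR response appears to be shaped; the function name `join_pages` and the "Page N" headings are hypothetical, and the module's own formatting may differ.

```python
def join_pages(pages: list[dict]) -> str:
    """Merge per-page OCR results into one markdown document, in page order."""
    ordered = sorted(pages, key=lambda p: p["index"])
    parts = []
    for page in ordered:
        body = page["markdown"].strip()
        if not body:
            continue  # skip pages where no text was found
        # Assumed convention: one heading per page, numbered from 1.
        parts.append(f"## Page {page['index'] + 1}\n\n{body}")
    return "\n\n".join(parts)
```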

Quick Setup

Get started in 4 steps

Install Dependencies

Requires llm_knowledge and llm_mistral modules

Install This Module

Search for "LLM Knowledge Mistral" in Apps and click Install

Set Up Mistral Provider

Go to LLM → Configuration → Providers

Configure your Mistral AI provider with API key
Click "Fetch Models" to download available OCR models

Use Mistral OCR Parser

When adding images to collections, select "Mistral OCR Parser" and choose an OCR model

LLM Knowledge Mistral

Turn images into searchable knowledge with Mistral AI's vision models.

Module Type: 🔌 Extension (OCR for Knowledge)

Architecture

┌───────────────────────────────────────────────────────────────┐
│                      Image Sources                            │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐   │
│  │  Receipts   │  │ Handwritten │  │   Scanned Docs      │   │
│  └──────┬──────┘  └──────┬──────┘  └──────────┬──────────┘   │
└─────────┼────────────────┼────────────────────┼──────────────┘
          └────────────────┼────────────────────┘
                           ▼
              ┌───────────────────────────────────────────┐
              │  ★ llm_knowledge_mistral (This Module) ★  │
              │         Mistral OCR Parser                │
              │  👁️ Vision │ 📝 Text Extract │ 🔍 Index   │
              └─────────────────────┬─────────────────────┘
                                    │
                        ┌───────────┴───────────┐
                        ▼                       ▼
    ┌───────────────────────────┐   ┌───────────────────────────┐
    │       llm_knowledge       │   │        llm_mistral        │
    │      (RAG Pipeline)       │   │    (Mistral Provider)     │
    └───────────────────────────┘   └───────────────────────────┘

Installation

What to Install

For image OCR in knowledge base:

odoo-bin -d your_db -i llm_knowledge_mistral

Auto-Installed Dependencies

llm (core infrastructure)
llm_knowledge (RAG infrastructure)
llm_mistral (Mistral AI provider)

Why Use This Module?

Feature	llm_knowledge_mistral
OCR	👁️ Mistral vision models
Handwriting	✍️ Handwritten text support
Multi-format	📄 PDF, PNG, JPG, WEBP
Searchable	🔍 Images become searchable

Common Setups

I want to...	Install
OCR + RAG	`llm_knowledge_mistral` + `llm_pgvector`
Chat + OCR + RAG	`llm_assistant` + `llm_openai` + `llm_knowledge_mistral` + `llm_pgvector`

Extract text from handwritten notes, receipts, scanned documents, screenshots, and product labels. Make every image searchable in your knowledge base with automatic OCR processing.

Overview

This module extends llm_knowledge with Mistral AI's vision capabilities, enabling OCR (Optical Character Recognition) for images and scanned documents. Upload an image, and Mistral's vision models extract all text content, making it fully searchable through your AI assistant.

The Problem

Without OCR:

Images are just binary blobs in your knowledge base
Handwritten notes can't be searched
Scanned documents are dead weight
Receipts and invoices are unusable
Knowledge stays locked in images

The Solution

With Mistral OCR:

AI extracts text from any image
Handwriting becomes searchable
Scanned docs fully indexed
Receipt data automatically parsed
Everything is findable

Features

Mistral Vision OCR

State-of-the-art accuracy: Powered by Mistral's multimodal vision models
Handwriting recognition: Extracts text from handwritten notes and forms
Multi-format support: Images (PNG, JPG, WEBP), PDFs, scanned documents
Automatic extraction: No manual data entry required

OCR Models

Three Mistral OCR models available:

mistral-ocr-latest: Latest OCR model (recommended)
mistral-ocr-2505: May 2025 version
mistral-ocr-2503: March 2025 version

Mistral OCR Parser

Seamless integration with llm_knowledge processing pipeline
Automatic text extraction from image attachments
Preserves original images while extracting text content
Works with existing chunking and embedding systems

Installation

Install dependencies:
- llm_knowledge module (required)
- llm_mistral module (required)

Install this module:

# Via Odoo Apps interface
Apps → Search "LLM Knowledge Mistral" → Install

Set up Mistral provider:
- Go to LLM → Configuration → Providers
- Configure your Mistral AI provider with API key
- Click "Fetch Models" to download available OCR models from Mistral
- This populates the OCR models list automatically

Configuration

Step 1: View OCR Models

The module comes pre-configured with Mistral's OCR models. View them under LLM → Configuration → Models, filtered by "ocr".

Available models:

mistral-ocr-latest - Latest OCR model (recommended)
mistral-ocr-2505 - May 2025 version
mistral-ocr-2503 - March 2025 version

Step 2: Configure Parser

When creating or editing a knowledge resource:

Select "Mistral OCR Parser" from the Parser dropdown
Choose your preferred OCR model (mistral-ocr-latest recommended)
Upload images as attachments
Click "Process Resources"

Parser settings:

Parser: Mistral OCR Parser
Provider: Mistral AI (auto-selected)
OCR Model: mistral-ocr-latest or specific version
Supported formats: PNG, JPG, WEBP, PDF
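
A quick way to check a file against the supported formats before uploading is sketched below. The extension-to-MIME mapping uses the standard MIME types for the four formats listed above; the helper name `mimetype_for` is hypothetical and not part of the module.

```python
from pathlib import Path

# Formats listed in this README, mapped to their standard MIME types.
SUPPORTED_MIMETYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".pdf": "application/pdf",
}


def mimetype_for(filename: str) -> str:
    """Return the MIME type for a supported file, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    try:
        return SUPPORTED_MIMETYPES[suffix]
    except KeyError:
        raise ValueError(f"unsupported format: {suffix or filename!r}") from None
```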

Step 3: Process and Search

Click "Process Resources" to extract text from your images. The extracted text becomes searchable through your AI assistant.

Usage Examples

Handwritten Grocery List

Input: Photo of handwritten grocery list

Output: Extracted text

- potatoes
- peas & carrots
- pastina
- garbage bags
- dog treats
- aluminum foil
- almond milk
- creamer - vanilla
- eggs (2)
- crushed tomatoes
- hot sauce
- paper towels?

Result: Fully searchable in knowledge base. Ask "What items are on the grocery list?" and AI finds and lists all items.

Expense Management

Goal: Track business expenses from receipt photos

Setup:

Upload receipt photos to knowledge collection
Use Mistral OCR Parser
Process resources

Result: Extract vendor, amount, date, and items from receipts. Search "Find all Starbucks receipts from last month" → AI finds all matching receipts and totals.

Meeting Notes Archive

Goal: Make handwritten meeting notes searchable

Setup:

Scan or photograph handwritten meeting notes
Upload to knowledge base
Process with Mistral OCR

Result: Every decision, action item, and idea becomes searchable. Ask "What did we decide about the Q4 budget?" → AI cites exact meeting notes.

Product Label Extraction

Goal: Index product information from label photos

Setup:

Photograph product labels
Add to product knowledge collection
Process with Mistral OCR

Result: Extract ingredients, nutritional info, warnings, and instructions. AI can answer product questions using label data.

How It Works

Processing Pipeline

When you process an image resource with Mistral OCR:

Upload: Attach image to llm.resource
Parse: Mistral OCR Parser sends image to Mistral AI vision model
Extract: Vision model analyzes image and extracts all text
Save: Extracted text saved to resource's content field
Chunk: Text chunked using collection's chunker settings
Embed: Chunks embedded and stored in vector database
Search: AI can now search and cite this content in responses
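
To make the Embed and Search steps concrete, here is a toy version that runs without any model or database. A real setup uses an embedding model and a vector store; the bag-of-words `embed` function, the `cosine` score, and the `search` helper below are stand-ins invented for this sketch, and the sample chunks are illustrative data.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: dict[str, str], top_k: int = 1) -> list[tuple[str, float]]:
    """Rank chunk sources by similarity to the query, best first."""
    q = embed(query)
    scored = [(source, cosine(q, embed(text))) for source, text in chunks.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Illustrative chunks keyed by the file they were extracted from.
chunks = {
    "grocery_list.jpg": "potatoes, peas & carrots, almond milk, eggs (2), hot sauce",
    "meeting_notes.png": "Approved budget increase for marketing in Q4",
}
best_source, score = search("What did we decide about the Q4 budget?", chunks)[0]
```

Keeping the source file name next to each chunk is what lets the assistant cite where an answer came from.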

Supported Image Types

Handwritten text: Notes, forms, letters
Printed text: Documents, books, manuals
Receipts: Business expenses, invoices
Screenshots: Error messages, UI text
Product labels: Ingredients, instructions
Whiteboards: Brainstorming sessions, diagrams
Forms: Filled-out applications, surveys
Scanned documents: PDFs, legacy files

Technical Details

Mistral OCR Models

The available OCR models are fetched from Mistral AI when you configure the provider:

Model	Description	Recommended
`mistral-ocr-latest`	Latest OCR model	✓ Yes
`mistral-ocr-2505`	May 2025 version	-
`mistral-ocr-2503`	March 2025 version	-

Note: You must set up the Mistral provider via the llm_mistral module and click "Fetch Models" to download the available OCR models from Mistral AI.

Parser Registration

The Mistral OCR Parser is registered in models/mistral_resource_parser.py on the llm.resource model:

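from odoo import api, models

class LLMResource(models.Model):  # class name is illustrative
    _inherit = "llm.resource"

    @api.model
    def _get_available_parsers(self):
        # Append the Mistral OCR entry to the parsers defined upstream.
        parsers = super()._get_available_parsers()
        parsers.extend([
            ("mistral_ocr", "Mistral OCR Parser"),
        ])
        return parsers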
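
The registration relies on cooperative `super()` calls: each extension takes the list built so far and appends its own entry. The sketch below shows the same pattern with plain Python classes so it runs without Odoo; `BaseResource`, `MistralResource`, and the "default" parser entry are stand-ins, not names from the module.

```python
class BaseResource:
    def _get_available_parsers(self):
        # Stand-in for the parsers the base module provides (entry is illustrative).
        return [("default", "Default Parser")]


class MistralResource(BaseResource):
    def _get_available_parsers(self):
        # Take the inherited list and add the OCR parser to it.
        parsers = super()._get_available_parsers()
        parsers.extend([("mistral_ocr", "Mistral OCR Parser")])
        return parsers


keys = [key for key, _label in MistralResource()._get_available_parsers()]
```

Because the method extends rather than replaces the list, parsers from other installed modules remain selectable alongside the OCR one.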

The parser method parse_mistral_ocr() processes images:

def parse_mistral_ocr(self, record, field):
    # An OCR model and a provider must both be selected on the resource.
    if not self.llm_model_id or not self.llm_provider_id:
        raise ValueError("Please select a model and provider.")
    mimetype = field["mimetype"]
    value = field["rawcontent"]
    # Send the raw file content to the provider for OCR.
    ocr_response = self.llm_provider_id.process_ocr(
        self.llm_model_id.name, value, mimetype
    )
    # Format the OCR response as text and store it on the resource.
    final_content = self._format_mistral_ocr_text(ocr_response, record.id)
    self.content = final_content
    return True
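
The internals of `process_ocr` are not shown here. As a hedged sketch of what such a call might send, the helper below builds a request body in the shape Mistral's public OCR endpoint appears to accept: a model name plus a `document` object carrying a base64 data URI. The function name `build_ocr_payload` is hypothetical, and the field names should be checked against Mistral's current API reference before relying on them.

```python
import base64


def build_ocr_payload(model_name: str, raw: bytes, mimetype: str) -> dict:
    """Build a JSON-serializable OCR request body (hypothetical helper)."""
    encoded = base64.b64encode(raw).decode("ascii")
    data_uri = f"data:{mimetype};base64,{encoded}"
    if mimetype.startswith("image/"):
        # Assumed shape for images.
        document = {"type": "image_url", "image_url": data_uri}
    else:
        # Assumed shape for PDFs and other documents.
        document = {"type": "document_url", "document_url": data_uri}
    return {"model": model_name, "document": document}
```

The sketch makes no network call, so it can be used to inspect the payload before wiring it to an HTTP client.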

Fields Added to llm.resource

This module extends llm.resource with:

llm_model_id: Many2one to OCR model (domain: model_use = 'ocr')
llm_provider_id: Many2one to Mistral provider (domain: service = 'mistral')
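
The domains on these fields restrict which records can be selected. Odoo expresses a domain as a list of `(field, operator, value)` triples; the sketch below applies that idea to plain dicts so it runs outside Odoo. The `matches` helper and the sample records (including the "chat" value) are illustrative assumptions, and only the `=` operator is handled.

```python
def matches(record: dict, domain: list[tuple[str, str, object]]) -> bool:
    """Evaluate a simple AND-ed domain that uses only the '=' operator."""
    for field_name, operator, value in domain:
        if operator != "=":
            raise NotImplementedError(f"operator {operator!r} not covered by this sketch")
        if record.get(field_name) != value:
            return False
    return True


# Sample records standing in for model rows (illustrative data).
models = [
    {"name": "mistral-ocr-latest", "model_use": "ocr"},
    {"name": "mistral-small-latest", "model_use": "chat"},
]
ocr_models = [m["name"] for m in models if matches(m, [("model_use", "=", "ocr")])]
```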

Troubleshooting

OCR not extracting text

Verify image quality is sufficient (not too blurry)
Check Mistral API credentials are configured
Review system logs for API errors
Try different OCR model version

Handwriting not recognized

Ensure handwriting is legible
Use high-resolution images
Try mistral-ocr-latest (best for handwriting)
Avoid low-light or skewed photos

Wrong text extracted

Check image orientation (rotate if needed)
Verify image is not corrupted
Ensure sufficient contrast between text and background
Try cropping to focus on text area

Best Practices

Image quality: Use high-resolution images (at least 1024px width)
Lighting: Ensure good lighting and contrast
Orientation: Rotate images to correct orientation before upload
File format: Use PNG or JPG for best results
File size: Keep images under 10MB for faster processing
Batch processing: Process multiple images at once for efficiency
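
The size and width guidelines above can be checked before upload. The sketch below is a minimal preflight for PNG files using only the standard library: it reads the width from the PNG header and compares the file size to the 10MB guideline. The names `png_width` and `preflight` are hypothetical, and treating 10MB as 10 * 1024 * 1024 bytes is an assumption.

```python
import struct

MAX_BYTES = 10 * 1024 * 1024  # "under 10MB" guideline (byte count is an assumption)
MIN_WIDTH = 1024              # "at least 1024px width" guideline
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_width(data: bytes) -> int:
    """Read the pixel width from a PNG's IHDR chunk."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width is a big-endian 32-bit integer right after the IHDR tag.
    return struct.unpack(">I", data[16:20])[0]


def preflight(data: bytes) -> list[str]:
    """Return a list of warnings; an empty list means the file looks fine."""
    warnings = []
    if len(data) > MAX_BYTES:
        warnings.append("file is larger than 10MB")
    if png_width(data) < MIN_WIDTH:
        warnings.append("image is narrower than 1024px")
    return warnings
```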

Requirements

Odoo: 18.0+
Python: 3.11+
Dependencies:
- llm_knowledge module
- llm_mistral module
API: Mistral AI API key required

License

LGPL-3

Author

Apexive Solutions LLC

Website: https://github.com/apexive/odoo-llm
Email: info@apexive.com

Contributing

Issues and pull requests welcome at https://github.com/apexive/odoo-llm
