| Availability |
Odoo Online
Odoo.sh
On Premise
|
| Odoo Apps Dependencies |
Discuss (mail)
|
| Community Apps Dependencies | Show |
| Lines of code | 8342 |
| Technical Name |
llm_knowledge_mistral |
| License | LGPL-3 |
| Website | https://github.com/apexive/odoo-llm |
| Versions | 16.0 18.0 |
Turn Images & Handwriting
Into Searchable Knowledge
Use Mistral AI's vision models to extract text from images, receipts, handwritten notes, and scanned documents. Make everything searchable in your knowledge base.
The Image Problem
Your knowledge base can search text documents, but what about images, scanned receipts, handwritten notes, and photos of documents? They're invisible to search.
Without OCR
- Images are just binary blobs
- Handwritten notes can't be searched
- Scanned documents are dead weight
- Receipts and invoices are unusable
- Knowledge stays locked in images
With Mistral OCR
- AI extracts text from any image
- Handwriting becomes searchable
- Scanned docs fully indexed
- Receipt data automatically parsed
- Everything is findable
What Can It Parse?
Mistral's vision AI handles virtually any image with text
Handwritten Notes
Meeting notes, to-do lists, sticky notes, journal entries - any handwriting style
Scanned Documents
PDFs from scanners, faxes, photocopies, and document images with any layout
Receipts & Invoices
Extract data from receipts, invoices, bills, and financial documents
Screenshots
UI screenshots, error messages, dashboards, charts - extract all visible text
Product Labels
Packaging text, ingredient lists, warning labels, serial numbers
Forms & Tables
Structured data from forms, spreadsheets, tables, and grid layouts
How It Works
Automatic OCR processing powered by Mistral AI vision models
Upload Image
Add image to your knowledge collection (JPEG, PNG, PDF, etc.)
Select Mistral OCR
Choose Mistral OCR parser for the resource
AI Extracts Text
Mistral vision model reads and converts to markdown
Now Searchable
Text is chunked, embedded, and ready for AI search
See It In Action
Actual handwritten grocery list parsed by Mistral OCR
Original Image
Extracted Text
Result: The handwritten list is now fully searchable in your knowledge base. Ask your AI assistant "What items are on the grocery list?" and it will find this document and list all items.
How to Set It Up
Configure Mistral OCR for your knowledge base in minutes
Set Up Mistral OCR Models
The module comes pre-configured with Mistral's OCR models. View them under LLM → Configuration → Models, filtered by "ocr".
Available OCR Models:
- mistral-ocr-latest: Latest OCR model (recommended)
- mistral-ocr-2505: May 2025 version
- mistral-ocr-2503: March 2025 version
Use Mistral OCR Parser
When creating or editing a knowledge resource, select "Mistral OCR Parser" and choose your preferred OCR model. Upload images and the parser will automatically extract text.
Parser Configuration:
- Parser: Select "Mistral OCR Parser"
- Provider: Mistral AI (auto-selected)
- OCR Model: Choose mistral-ocr-latest or specific version
- Supported formats: Images (PNG, JPG, WEBP), PDFs, scanned documents
Process & Start Searching!
Click "Process Resources" to extract text from your images. Once processed, all text becomes searchable through your AI assistant. Ask questions about the content and get instant answers with source citations.
Processing Pipeline
When you process an image resource:
- Mistral OCR Parser sends image to Mistral AI vision model
- Vision model analyzes image and extracts all text content
- Extracted text is saved to the resource's content field
- Text is chunked using your collection's chunker settings
- Chunks are embedded and stored in your vector database
- AI can now search and cite this content in responses
Real-World Use Cases
How teams use Mistral OCR to unlock knowledge from images
Expense Management
Upload receipt photos, extract vendor, amount, date, and items automatically for searchable expense records
"Find all Starbucks receipts from last month"
AI: Found 8 receipts totaling $127.50 (Sources: receipt_001.jpg through receipt_008.jpg)
Meeting Notes Archive
Scan handwritten meeting notes and make every decision, action item, and idea searchable
"What did we decide about the Q4 budget?"
AI: "Approved $50K increase for marketing" (Source: Meeting Notes Oct 15, 2024)
Legacy Document Digitization
Convert old scanned contracts, faxes, and archived paperwork into searchable digital knowledge
"Find the 2015 lease agreement terms"
AI: "5-year lease at $2500/month" (Source: Scanned Lease Agreement 2015)
Product Catalog
Extract product specs, ingredients, and details from packaging photos to build searchable catalogs
"Which products contain wheat?"
AI: Found 12 products with wheat in ingredients (Sources: Product labels from catalog)
Powerful Features
Everything you need for vision-based knowledge extraction
Multi-Language Support
Extract text in multiple languages including English, French, Spanish, and more
Markdown Output
Extracted text formatted as clean markdown for better chunking and retrieval
Table Extraction
Preserves table structure and relationships when extracting from forms and spreadsheets
Multi-Page Processing
Handles multi-page PDFs and image sets with proper page organization
Image Attachment Handling
Embedded images are saved as Odoo attachments with proper references
Seamless Integration
Works with existing llm_knowledge collections - just select the Mistral OCR parser
Quick Setup
Get started in 4 steps
Install Dependencies
Requires llm_knowledge and llm_mistral modules
Install This Module
Search for "LLM Knowledge Mistral" in Apps and click Install
Set Up Mistral Provider
Go to LLM → Configuration → Providers
- Configure your Mistral AI provider with API key
- Click "Fetch Models" to download available OCR models
Use Mistral OCR Parser
When adding images to collections, select "Mistral OCR Parser" and choose an OCR model
LLM Knowledge Mistral
Turn images into searchable knowledge with Mistral AI's vision models.
Extract text from handwritten notes, receipts, scanned documents, screenshots, and product labels. Make every image searchable in your knowledge base with automatic OCR processing.
Overview
This module extends llm_knowledge with Mistral AI's vision capabilities, enabling OCR (Optical Character Recognition) for images and scanned documents. Upload an image, and Mistral's vision models extract all text content, making it fully searchable through your AI assistant.
The Problem
Without OCR:
- Images are just binary blobs in your knowledge base
- Handwritten notes can't be searched
- Scanned documents are dead weight
- Receipts and invoices are unusable
- Knowledge stays locked in images
The Solution
With Mistral OCR:
- AI extracts text from any image
- Handwriting becomes searchable
- Scanned docs fully indexed
- Receipt data automatically parsed
- Everything is findable
Features
Mistral Vision OCR
- State-of-the-art accuracy: Powered by Mistral's multimodal vision models
- Handwriting recognition: Extracts text from handwritten notes and forms
- Multi-format support: Images (PNG, JPG, WEBP), PDFs, scanned documents
- Automatic extraction: No manual data entry required
OCR Models
Three Mistral OCR models available:
- mistral-ocr-latest: Latest OCR model (recommended)
- mistral-ocr-2505: May 2025 version
- mistral-ocr-2503: March 2025 version
Mistral OCR Parser
- Seamless integration with llm_knowledge processing pipeline
- Automatic text extraction from image attachments
- Preserves original images while extracting text content
- Works with existing chunking and embedding systems
Installation
Install dependencies:
- llm_knowledge module (required)
- llm_mistral module (required)
Install this module:
# Via the Odoo Apps interface
Apps → Search "LLM Knowledge Mistral" → Install
Set up Mistral provider:
- Go to LLM → Configuration → Providers
- Configure your Mistral AI provider with API key
- Click "Fetch Models" to download available OCR models from Mistral
- This populates the OCR models list automatically
Configuration
Step 1: View OCR Models
The module comes pre-configured with Mistral's OCR models. View them under LLM → Configuration → Models, filtered by "ocr".
Available models:
- mistral-ocr-latest - Latest OCR model (recommended)
- mistral-ocr-2505 - May 2025 version
- mistral-ocr-2503 - March 2025 version
Step 2: Configure Parser
When creating or editing a knowledge resource:
- Select "Mistral OCR Parser" from the Parser dropdown
- Choose your preferred OCR model (mistral-ocr-latest recommended)
- Upload images as attachments
- Click "Process Resources"
Parser settings:
- Parser: Mistral OCR Parser
- Provider: Mistral AI (auto-selected)
- OCR Model: mistral-ocr-latest or specific version
- Supported formats: PNG, JPG, WEBP, PDF
Step 3: Process and Search
Click "Process Resources" to extract text from your images. The extracted text becomes searchable through your AI assistant.
Usage Examples
Handwritten Grocery List
Input: Photo of handwritten grocery list
Output: Extracted text
- potatoes
- peas & carrots
- pastina
- garbage bags
- dog treats
- aluminum foil
- almond milk
- creamer
- vanilla
- eggs (2)
- crushed tomatoes
- hot sauce
- paper towels?
Result: Fully searchable in knowledge base. Ask "What items are on the grocery list?" and AI finds and lists all items.
Expense Management
Goal: Track business expenses from receipt photos
Setup:
- Upload receipt photos to knowledge collection
- Use Mistral OCR Parser
- Process resources
Result: Extract vendor, amount, date, and items from receipts. Search "Find all Starbucks receipts from last month" → AI finds all matching receipts and totals.
Meeting Notes Archive
Goal: Make handwritten meeting notes searchable
Setup:
- Scan or photograph handwritten meeting notes
- Upload to knowledge base
- Process with Mistral OCR
Result: Every decision, action item, and idea becomes searchable. Ask "What did we decide about the Q4 budget?" → AI cites exact meeting notes.
Product Label Extraction
Goal: Index product information from label photos
Setup:
- Photograph product labels
- Add to product knowledge collection
- Process with Mistral OCR
Result: Extract ingredients, nutritional info, warnings, and instructions. AI can answer product questions using label data.
How It Works
Processing Pipeline
When you process an image resource with Mistral OCR:
- Upload: Attach image to llm.resource
- Parse: Mistral OCR Parser sends image to Mistral AI vision model
- Extract: Vision model analyzes image and extracts all text
- Save: Extracted text saved to resource's content field
- Chunk: Text chunked using collection's chunker settings
- Embed: Chunks embedded and stored in vector database
- Search: AI can now search and cite this content in responses
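The chunk step above can be pictured with a minimal sketch. The real chunker is whatever your collection's chunker settings select in llm_knowledge and is not shown in this README; the function name `chunk_text` and the `max_chars` and `overlap` parameters below are hypothetical and only illustrate splitting extracted text into overlapping windows.

```python
def chunk_text(text, max_chars=200, overlap=20):
    """Split text into overlapping character windows (illustrative only)."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    chunks = []
    start = 0
    step = max_chars - overlap  # how far each window advances
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += step
    return chunks


# Example: OCR output for a short note, split into small windows.
extracted = "- potatoes\n- peas & carrots\n- pastina\n- garbage bags"
for chunk in chunk_text(extracted, max_chars=24, overlap=4):
    print(repr(chunk))
```

The overlap keeps a few characters of context on each side of a boundary, so a phrase cut by one window still appears whole in the next.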
Supported Image Types
- Handwritten text: Notes, forms, letters
- Printed text: Documents, books, manuals
- Receipts: Business expenses, invoices
- Screenshots: Error messages, UI text
- Product labels: Ingredients, instructions
- Whiteboards: Brainstorming sessions, diagrams
- Forms: Filled-out applications, surveys
- Scanned documents: PDFs, legacy files
Technical Details
Mistral OCR Models
The available OCR models are fetched from Mistral AI when you configure the provider:
| Model | Description | Recommended |
|---|---|---|
| mistral-ocr-latest | Latest OCR model | ✓ Yes |
| mistral-ocr-2505 | May 2025 version | - |
| mistral-ocr-2503 | March 2025 version | - |
Note: You must set up the Mistral provider via llm_mistral module and click "Fetch Models" to download the available OCR models from Mistral AI.
Parser Registration
The Mistral OCR Parser is registered in models/mistral_resource_parser.py on the llm.resource model:
    @api.model
    def _get_available_parsers(self):
        parsers = super()._get_available_parsers()
        parsers.extend([
            ("mistral_ocr", "Mistral OCR Parser"),
        ])
        return parsers
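To see the extension pattern in isolation, here is a plain-Python analogue that runs without Odoo. It is a sketch under assumptions: `BaseResource`, `MistralResource`, the "default" parser entry, and the `parse()` dispatcher are hypothetical stand-ins, and dispatching by the `parse_<name>` naming convention is an assumption about how llm_knowledge finds parser methods, not something this README confirms.

```python
class BaseResource:
    """Stand-in for the base llm.resource model (not Odoo code)."""

    def _get_available_parsers(self):
        # Hypothetical parser shipped by the base module.
        return [("default", "Default Parser")]

    def parse(self, parser_name, payload):
        # Assumed dispatch: look up a method named parse_<parser_name>.
        method = getattr(self, f"parse_{parser_name}", None)
        if method is None:
            raise ValueError(f"Unknown parser: {parser_name}")
        return method(payload)

    def parse_default(self, payload):
        return payload


class MistralResource(BaseResource):
    """Stand-in for this module's extension of llm.resource."""

    def _get_available_parsers(self):
        # Keep the parent's parsers and append the new one.
        parsers = super()._get_available_parsers()
        parsers.extend([("mistral_ocr", "Mistral OCR Parser")])
        return parsers

    def parse_mistral_ocr(self, payload):
        # Placeholder for the real OCR call.
        return f"[ocr] {payload}"


resource = MistralResource()
print(resource._get_available_parsers())
print(resource.parse("mistral_ocr", "receipt.jpg"))
```

Calling `super()` first means parsers registered by other modules survive, so several parser modules can be installed side by side.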
The parser method parse_mistral_ocr() processes images:
    def parse_mistral_ocr(self, record, field):
        mimetype = field["mimetype"]
        # An OCR model and a provider must both be selected on the resource.
        if not self.llm_model_id or not self.llm_provider_id:
            raise ValueError("Please select a model and provider.")
        value = field["rawcontent"]
        # Delegate the OCR call to the provider.
        ocr_response = self.llm_provider_id.process_ocr(
            self.llm_model_id.name, value, mimetype
        )
        # Format the OCR response and store it in the resource's content field.
        final_content = self._format_mistral_ocr_text(ocr_response, record.id)
        self.content = final_content
        return True
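The internals of `process_ocr()` and `_format_mistral_ocr_text()` are not shown in this README. As a rough sketch of what such helpers might do, the code below builds a request body from raw bytes and a mimetype, then joins per-page markdown from a response. The names `build_ocr_payload` and `join_pages` are hypothetical, and the payload and response shapes follow one reading of Mistral's public OCR API; verify them against the official documentation before relying on them.

```python
import base64


def build_ocr_payload(model_name, raw_bytes, mimetype):
    """Build a JSON-serializable OCR request body (assumed shape)."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    data_uri = f"data:{mimetype};base64,{encoded}"
    if mimetype == "application/pdf":
        # Assumption: PDFs are sent as a document URL.
        document = {"type": "document_url", "document_url": data_uri}
    else:
        # Assumption: images are sent as an image URL.
        document = {"type": "image_url", "image_url": data_uri}
    return {"model": model_name, "document": document}


def join_pages(ocr_response):
    """Concatenate per-page markdown in page order (assumed response shape)."""
    pages = sorted(ocr_response.get("pages", []), key=lambda p: p.get("index", 0))
    return "\n\n".join(p.get("markdown", "") for p in pages)


payload = build_ocr_payload("mistral-ocr-latest", b"abc", "image/png")
print(payload["document"]["type"])
```

No network request is made here; the sketch only shows how the inputs to `parse_mistral_ocr()` (model name, raw content, mimetype) could map onto a request and how a multi-page result could become a single content string.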
Fields Added to llm.resource
This module extends llm.resource with:
- llm_model_id: Many2one to OCR model (domain: model_use = 'ocr')
- llm_provider_id: Many2one to Mistral provider (domain: service = 'mistral')
Troubleshooting
OCR not extracting text
- Verify image quality is sufficient (not too blurry)
- Check Mistral API credentials are configured
- Review system logs for API errors
- Try different OCR model version
Handwriting not recognized
- Ensure handwriting is legible
- Use high-resolution images
- Try mistral-ocr-latest (best for handwriting)
- Avoid low-light or skewed photos
Wrong text extracted
- Check image orientation (rotate if needed)
- Verify image is not corrupted
- Ensure sufficient contrast between text and background
- Try cropping to focus on text area
Best Practices
- Image quality: Use high-resolution images (at least 1024px width)
- Lighting: Ensure good lighting and contrast
- Orientation: Rotate images to correct orientation before upload
- File format: Use PNG or JPG for best results
- File size: Keep images under 10MB for faster processing
- Batch processing: Process multiple images at once for efficiency
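Some of these guidelines can be checked before upload. The sketch below is not part of the module: `png_width` and `check_image` are hypothetical helpers that use only the standard library, read the width from a PNG header, and compare the file against the 1024px and 10MB guidelines above (10MB is taken here as 10 * 1024 * 1024 bytes, which is an assumption). Other formats are only size-checked.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_WIDTH_PX = 1024                 # image-quality guideline above
MAX_SIZE_BYTES = 10 * 1024 * 1024   # file-size guideline above


def png_width(data):
    """Return the pixel width of a PNG, or None if data is not a PNG."""
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        return None
    if data[12:16] != b"IHDR":
        return None
    # IHDR stores width as a big-endian 32-bit integer at bytes 16..19.
    (width,) = struct.unpack(">I", data[16:20])
    return width


def check_image(data):
    """Return a list of warnings for an image about to be uploaded."""
    warnings = []
    if len(data) > MAX_SIZE_BYTES:
        warnings.append("file is larger than 10MB")
    width = png_width(data)
    if width is not None and width < MIN_WIDTH_PX:
        warnings.append(f"width {width}px is below {MIN_WIDTH_PX}px")
    return warnings
```

A typical use would be `check_image(open("note.png", "rb").read())`, where `note.png` is a placeholder file name; an empty list means neither guideline was violated.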
Requirements
- Odoo: 18.0+
- Python: 3.11+
- Dependencies:
- llm_knowledge module
- llm_mistral module
- API: Mistral AI API key required
License
LGPL-3
Contributing
Issues and pull requests welcome at https://github.com/apexive/odoo-llm