LLM usage is optional and must be explicitly configured. You can maintain full control by running local models on your own systems through Ollama, LM Studio, or similar tools. No data is sent to external services unless you configure a cloud provider.
Configuration via settings
Configure AI settings through the web interface:
- Go to Settings → Self-Hosting
- Scroll to the AI Provider section
- Configure:
  - OpenAI Access Token - Your API key
  - OpenAI URI Base - Custom endpoint (leave blank for OpenAI)
  - OpenAI Model - Model name (required for custom endpoints)
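For direct OpenAI access, only the access token is needed: leave the URI Base blank and the default model (gpt-4.1, listed below) is used. For any other provider, fill in all three fields with that provider's endpoint and model name.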
OpenAI-compatible API
Sure supports any OpenAI-compatible API endpoint, giving you the flexibility to use any of the following (see the sketch after this list):
- OpenAI - Direct access to GPT models
- Ollama - Run models locally on your hardware
- LM Studio - Local model hosting with a GUI
- OpenRouter - Access to multiple providers (Anthropic, Google, etc.)
- Other providers - Groq, Together AI, Anyscale, Replicate, and more
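Because each of these providers speaks the same chat-completions API, the same three settings values work everywhere; only the endpoint, token, and model name change. The sketch below is not part of Sure: it uses the openai Python package to confirm a combination works before you save it, and the placeholder values are assumptions you would replace with your provider's details.

```python
# Minimal sketch: verify an OpenAI-compatible endpoint with the same three
# values Sure asks for under Settings → Self-Hosting → AI Provider.
# Requires the `openai` Python package; all values below are placeholders.
from openai import OpenAI

access_token = "YOUR_API_KEY"                  # OpenAI Access Token
base_url = "https://your-provider.example/v1"  # OpenAI URI Base
model = "your-model-name"                      # OpenAI Model

client = OpenAI(api_key=access_token, base_url=base_url)
reply = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Reply with the single word: ok"}],
)
print(reply.choices[0].message.content)  # a reply means the endpoint, key, and model all line up
```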
OpenAI
- gpt-4.1 - Default, best balance of speed and quality
- gpt-5 - Latest model, highest quality
- gpt-4o-mini - Cheaper, good quality
Ollama (local)
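Ollama exposes an OpenAI-compatible endpoint at http://localhost:11434/v1 once the server is running (default settings assumed). Pull a model first, for example with `ollama pull llama3.1`, then configure Sure:
- OpenAI URI Base - http://localhost:11434/v1
- OpenAI Model - the name of the model you pulled (for example, llama3.1)
- OpenAI Access Token - any placeholder value; Ollama does not require a key by default
The verification sketch above works unchanged with these values.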
LM Studio (local)
- Download from lmstudio.ai
- Download a model through the UI
- Start the local server
- Configure Sure:
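  - OpenAI URI Base - http://localhost:1234/v1 (LM Studio's default local server address; adjust if you changed the port)
  - OpenAI Model - the identifier of the model you loaded in LM Studio
  - OpenAI Access Token - any placeholder value; the local server does not require one by default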
OpenRouter
Access multiple providers through a single API:
- google/gemini-2.5-flash - Fast and capable
- anthropic/claude-sonnet-4.5 - Excellent reasoning
- anthropic/claude-haiku-4.5 - Fast and cost-effective
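To use OpenRouter, set the OpenAI URI Base to https://openrouter.ai/api/v1, use your OpenRouter API key as the access token, and pick a provider-prefixed model name such as those above.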
Evaluation system
Test and compare different LLMs for your specific use case. The eval system helps you benchmark models for transaction categorization, merchant detection, and chat assistant functionality. See the evaluation framework documentation for details on:
- Running evaluations
- Comparing models
- Creating custom datasets
- Langfuse integration for tracking experiments