Ollama Profiler
Ollama Profiler is a performance profiling tool for Ollama models. It benchmarks and compares local LLM performance side-by-side, measuring tokens/sec, time-to-first-token, and other runtime metrics across multiple models using the same prompt.
Features
- Side-by-side model comparison — run the same prompt against multiple models and compare performance directly
- Dual interface (TUI + CLI) — interactive terminal UI by default; pass model names as arguments for non-interactive CLI mode
- Fair scheduling strategies — sequential, round-robin, randomized rounds, or Latin-square balanced ordering
- Statistical summary — mean, standard deviation, min/max, relative percentages, and color-coded winners
- Multiple export formats — JSON (with metadata), self-contained HTML reports, and retina PNG charts
- Reproducibility — deterministic seed, configurable token limits, warmup runs, and cooldown periods
- Remote server support — target a remote Ollama instance with --url
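The statistical summary can be sketched in plain Python. The function below is illustrative, not the tool's actual implementation: `samples` holds hypothetical tokens/sec measurements per model, and the relative percentage is computed against the fastest model's mean, mirroring the kind of table the tool prints:

```python
from statistics import mean, stdev

def summarize(samples: dict[str, list[float]]) -> dict[str, dict]:
    """Compute mean, stdev, min/max, and relative % for tokens/sec samples."""
    means = {model: mean(vals) for model, vals in samples.items()}
    best = max(means.values())  # fastest model sets the 100% baseline
    return {
        model: {
            "mean": means[model],
            "stdev": stdev(vals) if len(vals) > 1 else 0.0,
            "min": min(vals),
            "max": max(vals),
            "relative_pct": 100.0 * means[model] / best,
        }
        for model, vals in samples.items()
    }

# Example: three rounds of tokens/sec per model (illustrative numbers)
results = summarize({
    "llama3": [42.0, 40.0, 44.0],
    "mistral": [55.0, 53.0, 57.0],
})
```

Here `mistral` would be reported at 100% and `llama3` at roughly 76% of the winner's throughput.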
Installation
Install from PyPI, install via Go, run it without installing via uvx, or download a pre-compiled binary:
# Run without installing
uvx ollama-profiler
# Install from PyPI
pip install ollama-profiler
# Install via Go
go install github.com/piercecohen1/ollama-profiler@latest
Usage
Interactive TUI
Launch the TUI to select models and configure benchmarks interactively:
ollama-profiler
CLI Mode
Pass model names directly for non-interactive benchmarking:
ollama-profiler llama3 mistral phi3 --rounds 5 --prompt "Explain quantum computing"
Export Results
# Export as JSON
ollama-profiler llama3 mistral --export results.json
# Export as HTML report
ollama-profiler llama3 mistral --export report.html
# Export as PNG chart
ollama-profiler llama3 mistral --export chart.png
Advanced Options
# Use warmup and cooldown for rigorous benchmarking
ollama-profiler llama3 mistral --warmup 2 --cooldown 5
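What warmup and cooldown do can be sketched as a generic benchmark loop (the function and its signature are illustrative, not the tool's internals): warmup runs execute but are discarded so model loading and caches don't skew the first measured round, while a cooldown sleep between rounds lets the system settle.

```python
import time

def benchmark(run_once, rounds: int, warmup: int = 0, cooldown: float = 0.0) -> list[float]:
    """Run `run_once` (returns a metric, e.g. tokens/sec), keeping only measured rounds."""
    for _ in range(warmup):
        run_once()  # executed but not recorded
    measured = []
    for i in range(rounds):
        measured.append(run_once())
        if cooldown and i < rounds - 1:
            time.sleep(cooldown)  # pause between rounds, not after the last
    return measured
```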
# Latin-square scheduling for statistically fair comparisons
ollama-profiler llama3 mistral phi3 --schedule latin-square
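A Latin-square schedule runs each model in each position exactly once across a cycle of rounds, cancelling order effects such as cache warm-up. A minimal sketch using the standard cyclic construction (not necessarily the tool's exact scheme):

```python
def latin_square_schedule(models: list[str]) -> list[list[str]]:
    """Cyclic Latin square: round i rotates the model list by i positions."""
    n = len(models)
    return [[models[(i + j) % n] for j in range(n)] for i in range(n)]

rounds = latin_square_schedule(["llama3", "mistral", "phi3"])
# Each model occupies each run position exactly once across the three rounds.
```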
# Target a remote Ollama server
ollama-profiler llama3 --url http://remote-host:11434
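Whether local or remote, throughput measurements ultimately derive from Ollama's HTTP API: the final chunk of a streaming /api/generate response reports eval_count (tokens generated) and eval_duration (generation time in nanoseconds), from which tokens/sec follows; time-to-first-token is timed client-side. A sketch of that derivation (how this tool computes it internally is an assumption):

```python
def tokens_per_second(final_chunk: dict) -> float:
    """Derive tokens/sec from the final chunk of an Ollama /api/generate stream.

    Per Ollama's API, `eval_count` is the number of generated tokens and
    `eval_duration` is the generation time in nanoseconds.
    """
    return final_chunk["eval_count"] / (final_chunk["eval_duration"] / 1e9)

# Example final chunk (values are illustrative)
chunk = {"eval_count": 120, "eval_duration": 2_000_000_000}  # 2 seconds
rate = tokens_per_second(chunk)  # → 60.0 tokens/sec
```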
License
This project is released under the MIT License.