🚨 Alerting
Get alerts for:
- Hanging LLM api calls
- Failed LLM api calls
- Slow LLM api calls
- Budget Tracking per key/user:
- When a User/Key crosses their Budget
- When a User/Key is 15% away from crossing their Budget
- Spend Reports - Weekly & Monthly spend per Team, Tag
- Failed db read/writes
As a bonus, you can also get "daily reports" posted to your slack channel. These reports contain key metrics like:
- Top 5 deployments with most failed requests
- Top 5 slowest deployments
Quick Start
Set up a slack alert channel to receive alerts from proxy.
Step 1: Add a Slack Webhook URL to env
Get a slack webhook url from https://api.slack.com/messaging/webhooks
Step 2: Update config.yaml
- Set
SLACK_WEBHOOK_URL
in your proxy env to enable Slack alerts. - Just for testing purposes, let's save a bad key to our proxy.
model_list:
model_name: "azure-model"
litellm_params:
model: "azure/gpt-35-turbo"
api_key: "my-bad-key" # 👈 bad key
general_settings:
alerting: ["slack"]
alerting_threshold: 300 # sends alerts if requests hang for 5min+ and responses take 5min+
environment_variables:
SLACK_WEBHOOK_URL: "https://hooks.slack.com/services/<>/<>/<>"
SLACK_DAILY_REPORT_FREQUENCY: "86400" # 24 hours; Optional: defaults to 12 hours
Step 3: Start proxy
$ litellm --config /path/to/config.yaml