Anthropic
llmprof speaks the Anthropic Messages API (/v1/messages) as well as OpenAI. It
attributes the system prompt, message history, tool schemas, tool calls, and
tool results, and reads Anthropic’s usage (including cache reads).
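To make those categories concrete, here is a minimal sketch of how a Messages API request body could be split into the same buckets. It is not llmprof's implementation: the function name attribute_request, the bucket names, and the use of character counts as a rough stand-in for tokens are all assumptions made for illustration. Only the request shape (system, tools, messages, and the text, tool_use, and tool_result block types) comes from the Anthropic API.

```python
import json
from collections import Counter


def attribute_request(body: dict) -> Counter:
    """Bucket a Messages API request body by component, measured in characters.

    Hypothetical helper: characters are a crude proxy for tokens, and the
    bucket names are illustrative, not llmprof's own.
    """
    sizes = Counter()

    # `system` may be a plain string or a list of text blocks.
    system = body.get("system", "")
    if isinstance(system, list):
        system = "".join(block.get("text", "") for block in system)
    sizes["system_prompt"] += len(system)

    # Tool schemas are resent with every request, so they count as input.
    for tool in body.get("tools", []):
        sizes["tool_schemas"] += len(json.dumps(tool))

    for message in body.get("messages", []):
        content = message["content"]
        if isinstance(content, str):
            sizes["message_history"] += len(content)
            continue
        # Otherwise `content` is a list of typed blocks.
        for block in content:
            kind = block.get("type")
            if kind == "tool_use":
                sizes["tool_calls"] += len(json.dumps(block.get("input", {})))
            elif kind == "tool_result":
                sizes["tool_results"] += len(json.dumps(block.get("content", "")))
            elif kind == "text":
                sizes["message_history"] += len(block.get("text", ""))

    return sizes


# Illustrative request body; the tool and its arguments are made up.
example = {
    "system": "You are a helpful assistant.",
    "tools": [
        {
            "name": "lookup",
            "description": "Look up a clause.",
            "input_schema": {"type": "object"},
        }
    ],
    "messages": [
        {"role": "user", "content": "Summarize this contract..."},
        {
            "role": "assistant",
            "content": [
                {"type": "tool_use", "id": "t1", "name": "lookup", "input": {"clause": 4}}
            ],
        },
        {
            "role": "user",
            "content": [
                {"type": "tool_result", "tool_use_id": "t1", "content": "Clause 4: ..."}
            ],
        },
    ],
}

for bucket, size in attribute_request(example).items():
    print(f"{bucket}: {size} chars")
```

Keeping the buckets in a Counter means a component that is absent from a request simply reads as zero instead of raising a KeyError.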
Start the proxy
One instance routes Anthropic and OpenAI traffic to their own upstreams, so no flags are needed:

    llmprof up

(To point at a non-default Anthropic-compatible host, use --anthropic-upstream.)
Point the client’s base URL at the proxy

    import anthropic

    client = anthropic.Anthropic(base_url="http://localhost:4000")  # api key unchanged

    client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a helpful assistant.",
        messages=[{"role": "user", "content": "Summarize this contract..."}],
        tools=[...],
    )

Or set it via environment variable:

    export ANTHROPIC_BASE_URL=http://localhost:4000

- Prompt caching is detected: when Anthropic reports cache_read_input_tokens, the call is marked as cached and the waste detector stops suggesting caching for it.
- Claude model pricing (3.x and 4.x families, including the newer Opus tiers) is built in; see Providers & pricing.
- For the Claude Code CLI specifically, see Claude Code.
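
The cache-detection rule in the first note can be sketched as a check on the usage object that Anthropic returns with each response. This is an illustration under stated assumptions, not llmprof's code: cache_status and its output keys are hypothetical names, and "reports cache_read_input_tokens" is read here as "the field is greater than zero". The usage field names themselves are Anthropic's, and input_tokens is treated as excluding cached tokens, so the total input is the sum of the three input fields.

```python
def cache_status(usage: dict) -> dict:
    """Classify one call from its Anthropic `usage` object.

    Hypothetical helper: the output keys are illustrative only.
    """
    # Fields may be missing or null on calls that did not use caching.
    read = usage.get("cache_read_input_tokens") or 0
    written = usage.get("cache_creation_input_tokens") or 0
    uncached = usage.get("input_tokens") or 0

    total_input = uncached + written + read
    cached = read > 0  # assumption: any cache read marks the call as cached

    return {
        "cached": cached,
        # Share of the input that was served from the cache.
        "cached_fraction": read / total_input if total_input else 0.0,
        # Mirror the note above: no caching suggestion once a call is cached.
        "suggest_caching": not cached,
    }


warm = {"input_tokens": 50, "cache_creation_input_tokens": 0,
        "cache_read_input_tokens": 150, "output_tokens": 20}
cold = {"input_tokens": 200, "output_tokens": 20}

print(cache_status(warm))
print(cache_status(cold))
```

A call that only writes to the cache (cache_creation_input_tokens set, no reads yet) still counts as uncached in this sketch; whether llmprof treats that case differently is not stated here.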