
Anthropic

llmprof speaks the Anthropic Messages API (/v1/messages) as well as the OpenAI API. It attributes the system prompt, message history, tool schemas, tool calls, and tool results, and reads the usage data Anthropic returns (including cache reads).
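To make the five attribution categories concrete, here is a minimal sketch of how a Messages API request body could be split into them. It is an illustration only, not llmprof's actual implementation: the function name `attribute_request`, the bucket names, and the choice to measure serialized characters rather than tokens are all assumptions made for this example.

```python
import json
from collections import Counter


def attribute_request(body: dict) -> Counter:
    """Bucket a Messages API request body by serialized size in characters.

    Hypothetical helper for illustration; a real profiler would count tokens.
    """
    sizes = Counter()

    def size(value) -> int:
        # Plain strings count as-is; structured values count as compact JSON.
        return len(value) if isinstance(value, str) else len(json.dumps(value))

    if body.get("system"):
        sizes["system_prompt"] += size(body["system"])
    for tool in body.get("tools", []):
        sizes["tool_schemas"] += size(tool)
    for message in body.get("messages", []):
        content = message["content"]
        if isinstance(content, str):
            # Shorthand form: the whole message is a single text string.
            sizes["message_history"] += len(content)
            continue
        for block in content:
            kind = block.get("type")
            if kind == "tool_use":
                sizes["tool_calls"] += size(block)
            elif kind == "tool_result":
                sizes["tool_results"] += size(block)
            else:
                # Text blocks count their text; other block types count as JSON.
                sizes["message_history"] += size(block.get("text", block))
    return sizes
```

Calling `attribute_request(body).most_common()` on a captured request body lists the buckets from largest to smallest.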

One instance routes Anthropic and OpenAI traffic to their own upstreams, so no flags are needed:

llmprof up

(To point at a non-default Anthropic-compatible host, use --anthropic-upstream.)

Point the client’s base URL at the proxy

import anthropic

# Only the base URL changes; the API key is passed exactly as before.
client = anthropic.Anthropic(base_url="http://localhost:4000")
client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Summarize this contract..."}],
    tools=[...],
)

Or set it via environment variable:

export ANTHROPIC_BASE_URL=http://localhost:4000
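
A client that does not use the SDK can honor the same variable by hand. The sketch below builds, but does not send, a request to /v1/messages using only the standard library. The helper name `build_messages_request` is hypothetical, and the fallback host and `anthropic-version` value are Anthropic's public defaults as assumed here, not something llmprof requires.

```python
import json
import os
import urllib.request

DEFAULT_BASE_URL = "https://api.anthropic.com"  # assumed fallback: Anthropic's public host


def build_messages_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (without sending) a POST to /v1/messages on the configured base URL."""
    # ANTHROPIC_BASE_URL is the variable exported above; strip a trailing slash
    # so the joined URL does not end up with a double slash.
    base_url = os.environ.get("ANTHROPIC_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return urllib.request.Request(
        url=f"{base_url}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,  # API key unchanged
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )
```

With the export above in effect, `build_messages_request(payload, key).full_url` should point at the proxy rather than at Anthropic directly, and the request can then be sent with `urllib.request.urlopen`.
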
  • Prompt caching is detected: when Anthropic reports cache_read_input_tokens, the call is marked as cached and the waste detector stops suggesting caching for it.
  • Claude model pricing (3.x and 4.x families, including the newer Opus tiers) is built in; see Providers & pricing.
  • For the Claude Code CLI specifically, see Claude Code.
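
The prompt-caching check described in the first bullet above can be pictured roughly as follows. This is a sketch under stated assumptions, not llmprof's code: `cache_summary` and its return keys are hypothetical names, while the `usage` field names are the ones Anthropic returns on a Messages response, where `input_tokens` counts only the uncached portion of the input.

```python
def cache_summary(usage: dict) -> dict:
    """Summarize cache activity from an Anthropic `usage` object (hypothetical helper)."""
    # Missing or null fields are treated as zero.
    read = usage.get("cache_read_input_tokens") or 0
    written = usage.get("cache_creation_input_tokens") or 0
    uncached = usage.get("input_tokens") or 0
    total_input = read + written + uncached
    return {
        # A call counts as cached as soon as any tokens were read from cache.
        "cached": read > 0,
        # Fraction of all input tokens that were served from cache.
        "cache_read_share": read / total_input if total_input else 0.0,
    }


usage = {
    "input_tokens": 50,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 950,
    "output_tokens": 120,
}
print(cache_summary(usage))  # {'cached': True, 'cache_read_share': 0.95}
```
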