usertrust
API Reference

Pricing

Usertoken costs per model, estimation functions, and the fallback rate.

All costs in the usertrust SDK are denominated in usertokens (UT).

1 UT = $0.0001 (one hundredth of a cent, or one basis point of a dollar).

A budget of 100_000 UT equals $10.00. A budget of 1_000_000 UT equals $100.00.
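The conversion works out as below. A minimal sketch with two hypothetical helpers (`usdToUt` and `utToUsd` are illustrations, not part of the SDK, which deals only in UT integers):

```typescript
// 1 UT = $0.0001, so dollars → UT multiplies by 10,000.
const UT_PER_USD = 10_000;

// Hypothetical conversion helpers for illustration.
function usdToUt(usd: number): number {
  return Math.round(usd * UT_PER_USD);
}

function utToUsd(ut: number): number {
  return ut / UT_PER_USD;
}

usdToUt(10);        // → 100_000 UT
utToUsd(1_000_000); // → $100.00
```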

Pricing Table

The SDK includes a 20-model pricing table in ledger/pricing.ts. Rates are in usertokens per 1,000 LLM tokens.

Anthropic

| Model | Input (per 1K tokens) | Output (per 1K tokens) |
|---|---|---|
| claude-sonnet-4-6 | 30 | 150 |
| claude-haiku-4-5 | 10 | 50 |
| claude-opus-4-6 | 50 | 250 |

OpenAI

| Model | Input (per 1K tokens) | Output (per 1K tokens) |
|---|---|---|
| gpt-4o | 25 | 100 |
| gpt-4o-mini | 1.5 | 6 |
| gpt-5.4 | 25 | 150 |
| o3 | 20 | 80 |
| o4-mini | 5.5 | 22 |

Google Gemini

| Model | Input (per 1K tokens) | Output (per 1K tokens) |
|---|---|---|
| gemini-2.5-flash | 3 | 25 |
| gemini-2.5-pro | 12.5 | 100 |
| gemini-3.1-pro | 20 | 120 |

Other Providers

| Model | Provider | Input (per 1K tokens) | Output (per 1K tokens) |
|---|---|---|---|
| mistral-large | Mistral | 5 | 15 |
| deepseek-chat | DeepSeek | 2.8 | 4.2 |
| deepseek-reasoner | DeepSeek | 2.8 | 4.2 |
| grok-3 | xAI | 30 | 150 |
| llama-4-maverick | Meta (Bedrock) | 2.4 | 9.7 |
| command-a | Cohere | 25 | 100 |
| sonar-pro | Perplexity | 30 | 150 |
| qwen-72b | Alibaba | 2.9 | 3.9 |
| nova-pro | Amazon | 8 | 32 |

Rates reflect the SDK's built-in table. Check packages/core/src/ledger/pricing.ts for the canonical source.

Fallback Rate

Models not in the pricing table use sonnet-class pricing as a conservative fallback:

| Direction | Rate (per 1K tokens) |
|---|---|
| Input | 30 UT |
| Output | 150 UT |

This ensures unknown models are never free. The fallback is intentionally high to avoid under-billing.

Model Matching

getModelRates() first attempts an exact match against the pricing table. If no exact match is found, it tries prefix matching (longest key first) to handle versioned model names. If neither matches, the fallback rate is used.

getModelRates("claude-sonnet-4-6")          // exact match  → { inputPer1k: 30, outputPer1k: 150 }
getModelRates("claude-sonnet-4-6-20260301") // prefix match → { inputPer1k: 30, outputPer1k: 150 }
getModelRates("unknown-model")              // fallback     → { inputPer1k: 30, outputPer1k: 150 }
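The lookup order can be sketched as below. This is an illustration of the matching logic, not the SDK's actual implementation; the `PRICING` record here holds only a two-model subset of the real table:

```typescript
interface ModelRates {
  inputPer1k: number;
  outputPer1k: number;
}

// Illustrative subset; see ledger/pricing.ts for the full 20-model table.
const PRICING: Record<string, ModelRates> = {
  "claude-sonnet-4-6": { inputPer1k: 30, outputPer1k: 150 },
  "claude-haiku-4-5":  { inputPer1k: 10, outputPer1k: 50 },
};

// Conservative sonnet-class fallback for unknown models.
const FALLBACK: ModelRates = { inputPer1k: 30, outputPer1k: 150 };

function getModelRates(model: string): ModelRates {
  // 1. Exact match.
  if (PRICING[model]) return PRICING[model];
  // 2. Prefix match, longest key first, so "claude-sonnet-4-6-20260301"
  //    resolves against "claude-sonnet-4-6" rather than a shorter key.
  const keys = Object.keys(PRICING).sort((a, b) => b.length - a.length);
  for (const key of keys) {
    if (model.startsWith(key)) return PRICING[key];
  }
  // 3. Fallback: unknown models are never free.
  return FALLBACK;
}
```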

Functions

getModelRates()

function getModelRates(model: string): ModelRates

Look up the input and output rates for a model. Falls back to prefix matching, then to the fallback rate.

estimateCost()

function estimateCost(model: string, inputTokens: number, outputTokens: number): number

Calculate the cost in usertokens for a model call given actual token counts. The result is always at least 1 UT, a floor that prevents zero-amount transfers in TigerBeetle.

// 1,000 input tokens + 500 output tokens on claude-sonnet-4-6
estimateCost("claude-sonnet-4-6", 1000, 500)
// → Math.max(1, Math.ceil((1000/1000 * 30) + (500/1000 * 150)))
// → Math.max(1, Math.ceil(30 + 75))
// → 105 UT
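The arithmetic above corresponds to a short function. A sketch under assumptions: the real `estimateCost()` takes a model name and looks up rates via `getModelRates()`, whereas this version takes a rates object directly for self-containment:

```typescript
// Sketch of the cost formula; the sonnet-class default stands in for
// a getModelRates() lookup.
function estimateCost(
  inputTokens: number,
  outputTokens: number,
  rates = { inputPer1k: 30, outputPer1k: 150 }, // claude-sonnet-4-6
): number {
  const raw =
    (inputTokens / 1000) * rates.inputPer1k +
    (outputTokens / 1000) * rates.outputPer1k;
  // Round up, then floor at 1 UT so TigerBeetle never sees a zero-amount transfer.
  return Math.max(1, Math.ceil(raw));
}

estimateCost(1000, 500); // → 105 UT
```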

estimateInputTokens()

function estimateInputTokens(messages: unknown[]): number

Estimate the input token count from a messages array before the LLM call. Uses a heuristic of approximately 4 characters per token with a 1.5x safety margin, so the PENDING hold exceeds the actual cost in most cases.

This function handles:

  • String content and multi-part content blocks
  • Nested arrays (tool result payloads)
  • Per-message overhead (role, structure)
  • Tool-call overhead

Returns at least 1.
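A simplified version of the heuristic (chars / 4 with a 1.5x margin). This sketch serializes each message to JSON to cover strings, content blocks, and nested arrays in one pass; the SDK's actual function additionally accounts for per-message and tool-call overhead:

```typescript
// Simplified sketch of the ~4 chars/token heuristic with a 1.5x safety margin.
function estimateInputTokens(messages: unknown[]): number {
  let chars = 0;
  for (const message of messages) {
    // JSON length covers string content, content-block arrays, and nested
    // tool-result payloads; per-message overhead is ignored here.
    chars += JSON.stringify(message).length;
  }
  // Returns at least 1, matching the documented floor.
  return Math.max(1, Math.ceil((chars / 4) * 1.5));
}

estimateInputTokens([{ role: "user", content: "hello" }]);
```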

ModelRates

interface ModelRates {
  inputPer1k: number;   // usertokens per 1,000 input tokens
  outputPer1k: number;  // usertokens per 1,000 output tokens
}

Cost Estimation Flow

During the two-phase spend lifecycle, costs are estimated twice:

  1. Before the call (PENDING): estimateInputTokens() produces a conservative input estimate. Combined with max_tokens for the output estimate, estimateCost() calculates the hold amount. The 1.5x safety margin means the hold usually exceeds actual cost.

  2. After the call (POST): Actual token counts from the provider response (or accumulated from stream chunks) produce the real cost. The difference between the hold and the actual cost is released back to the available budget.
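The two phases can be traced end-to-end with concrete numbers. A sketch under assumptions: the `cost()` helper inlines sonnet-class rates, and the hold/release bookkeeping is shown as plain arithmetic rather than the SDK's actual ledger calls:

```typescript
const RATES = { inputPer1k: 30, outputPer1k: 150 }; // sonnet-class, for illustration

function cost(inputTokens: number, outputTokens: number): number {
  return Math.max(
    1,
    Math.ceil(
      (inputTokens / 1000) * RATES.inputPer1k +
      (outputTokens / 1000) * RATES.outputPer1k,
    ),
  );
}

// 1. PENDING: conservative estimate before the call.
const estimatedInput = 1500; // e.g. from estimateInputTokens(messages)
const maxTokens = 1000;      // output ceiling for the request
const hold = cost(estimatedInput, maxTokens); // 195 UT held

// 2. POST: actual usage from the provider response.
const actual = cost(1000, 500);               // 105 UT spent
const released = hold - actual;               // 90 UT back to the available budget
```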