LLM Echo 0.4 Adds Input and Output Token Tracking

LLM Echo 0.4: Track Input & Output Tokens to Cut AI Costs (2026)

LLM Echo 0.4, developed by Simon Willison, now includes native input_tokens and output_tokens fields in its API responses, giving developers a per-request view of LLM usage. With this update, LLM Echo is no longer only a simple proxy: it can also support cost optimization, prompt efficiency work, and AI accountability.

Why Token Tracking Matters for Cost Control

Token consumption directly drives inference costs on platforms such as OpenAI and Anthropic, which bill per token. Before version 0.4, developers relied on external estimators or manual calculations, which could lead to budget overruns. Now that input_tokens and output_tokens are returned in every response, teams can automate cost tracking, set token budgets, and identify inefficient prompts before they scale.
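A token budget of the kind described above can be sketched in a few lines of Python. This is a minimal illustration, not part of LLM Echo itself: the TokenBudget class and its limit are hypothetical names introduced here, and the counts passed to record() are assumed to come from the input_tokens and output_tokens fields of each response.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Hypothetical helper: running token total checked against a fixed limit."""

    limit: int
    used: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> bool:
        """Add one response's counts; return True while the total is within the limit."""
        self.used += input_tokens + output_tokens
        return self.used <= self.limit

    @property
    def remaining(self) -> int:
        """Tokens left before the limit is reached (never negative)."""
        return max(self.limit - self.used, 0)


budget = TokenBudget(limit=50)
within_budget = budget.record(input_tokens=12, output_tokens=7)
print(within_budget, budget.remaining)
```

A caller could stop sending requests, or switch to a shorter prompt, as soon as record() returns False.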

How to Use input_tokens and output_tokens in Your API Pipeline

Integration requires no changes to authentication or endpoints. Parse the JSON response and read the two new fields:

{
  "response": "The capital of France is Paris.",
  "input_tokens": 12,
  "output_tokens": 7
}

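Use these values to calculate cost per request (for example, at an illustrative rate of $0.000015 per input token; actual prices vary by model and provider), monitor token-to-value ratios, or trigger alerts when usage exceeds a threshold. Any stack that can parse JSON, including Python and Node.js, can read these fields without third-party libraries.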
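The cost calculation can be sketched as follows, using the sample response shown earlier. The per-token rates and the alert threshold are assumptions chosen for illustration only, and request_cost is a hypothetical helper name; substitute the current prices for whichever model you call.

```python
import json

# Assumed, illustrative rates in USD per token; check your provider's current pricing.
INPUT_RATE = 0.000015
OUTPUT_RATE = 0.00003
ALERT_THRESHOLD_USD = 0.01  # hypothetical per-request alert level


def request_cost(raw_response: str) -> float:
    """Parse one JSON response and return its estimated cost in USD."""
    data = json.loads(raw_response)
    return data["input_tokens"] * INPUT_RATE + data["output_tokens"] * OUTPUT_RATE


raw = '{"response": "The capital of France is Paris.", "input_tokens": 12, "output_tokens": 7}'
cost = request_cost(raw)
over_threshold = cost > ALERT_THRESHOLD_USD
print(f"cost=${cost:.6f} alert={over_threshold}")
```

At these assumed rates, 12 input tokens and 7 output tokens come to roughly $0.00039, well under the example threshold.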

Simon Willison’s Vision for Transparent AI

According to Simon Willison’s official blog, this update reflects a broader philosophy: "AI tools should make the invisible visible." By exposing token metrics, LLM Echo 0.4 helps developers understand model behavior, reduce waste, and build audit-ready AI systems, which matters for enterprise adoption and compliance.

Real-World Use Cases: From Education to Enterprise

Education: Instructors use LLM Echo 0.4 to show how prompt wording changes input and output token counts, teaching prompt engineering through data.
Startups: Teams monitor token usage across 50+ microservices to cap monthly LLM spend at $500.
Enterprise: Compliance teams audit token logs for regulatory reporting and sustainability disclosures.
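The startup scenario above, where usage from many services is rolled up against a monthly cap, might look like the sketch below. The record layout (a "service" key alongside the two token fields), the service names, and the per-million-token prices are all assumptions made for illustration; only the $500 cap comes from the example.

```python
from collections import defaultdict

MONTHLY_CAP_USD = 500.0          # cap taken from the startup example
INPUT_USD_PER_MILLION = 15.0     # assumed price, for illustration only
OUTPUT_USD_PER_MILLION = 30.0    # assumed price, for illustration only


def spend_by_service(records):
    """Sum estimated spend in USD per service from logged token counts."""
    totals = defaultdict(float)
    for rec in records:
        cost = (
            rec["input_tokens"] / 1_000_000 * INPUT_USD_PER_MILLION
            + rec["output_tokens"] / 1_000_000 * OUTPUT_USD_PER_MILLION
        )
        totals[rec["service"]] += cost
    return dict(totals)


# Hypothetical log entries, one per batch of requests.
records = [
    {"service": "search", "input_tokens": 1_000_000, "output_tokens": 200_000},
    {"service": "chat", "input_tokens": 4_000_000, "output_tokens": 1_000_000},
    {"service": "search", "input_tokens": 500_000, "output_tokens": 100_000},
]

totals = spend_by_service(records)
total_spend = sum(totals.values())
over_cap = total_spend > MONTHLY_CAP_USD
print(totals, total_spend, over_cap)
```

Grouping by service makes it easier to see which component is responsible when total spend approaches the cap.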

Zero Disruption, Maximum Impact

LLM Echo 0.4 maintains full backward compatibility. Existing applications continue working without changes, because the new fields are purely additive: there are no dependency updates and no breaking changes. Teams of any size, from hobbyists to Fortune 500 companies, can therefore start reading the fields whenever it suits them.
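Because the fields are additive, client code can be written to handle responses with or without them. The sketch below assumes that responses from earlier versions simply omit input_tokens and output_tokens; read_usage is a hypothetical helper name, not part of LLM Echo.

```python
import json
from typing import Optional, Tuple


def read_usage(raw_response: str) -> Optional[Tuple[int, int]]:
    """Return (input_tokens, output_tokens), or None if either field is absent."""
    data = json.loads(raw_response)
    input_tokens = data.get("input_tokens")
    output_tokens = data.get("output_tokens")
    if input_tokens is None or output_tokens is None:
        return None  # assumed shape of a pre-0.4 response: no usage fields
    return input_tokens, output_tokens


old_style = '{"response": "Hi"}'
new_style = '{"response": "Hi", "input_tokens": 3, "output_tokens": 1}'

old_usage = read_usage(old_style)
new_usage = read_usage(new_style)
print(old_usage, new_usage)
```

Returning None rather than zero keeps "no data" distinct from "zero tokens", so cost reports do not silently undercount older traffic.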

As AI becomes mission-critical, transparency is foundational rather than optional. LLM Echo 0.4 delivers the metrics developers need to build efficient, cost-aware, and accountable large language model systems. Download LLM Echo 0.4 on GitHub and explore the official release notes for full examples and schema details.

Sources: Simon Willison Blog • OpenAI Tokenization Guide • LLM Echo 0.3 Release Notes