We're in development! Things may crash or break.

v1.0 · sdk for openai · anthropic · gemini

You have an AI bill. You have no idea what's in it.

Two lines of code. Every LLM request in your app, tracked by feature, user, and model. Finally know which part of your product is burning money.

Wraps OpenAI, Anthropic, Gemini, and any OpenAI-compatible API. No architecture changes. No proxy.

LIVE · prod.us-east-1
spend.today
total estimated spend · today
$41.83
updated 09:04:49 AM
chat · $29.07 · 69.50%
summarize · $9.41 · 22.50%
embed · $3.35 · 8.00%
requests/sec · 14.2 · p95 latency · 412ms
02 · the situation

Does this sound familiar?

case 01 · verbatim
Our OpenAI bill was $1,200 last month. I spent two hours trying to figure out what changed between October and November. Still don't know.
Every team we talked to has had this conversation.
case 02 · verbatim
I added a userId tag to every API call by hand, built a spreadsheet, and checked it every Friday. That's how we tracked AI costs. For six months.
Engineering time is expensive. This shouldn't be a spreadsheet problem.
case 03 · verbatim
We launched a summarization feature. Two weeks later our bill tripled. Turns out we were using GPT-4 for something gpt-4o-mini handles just as well at 1/20th the cost.
The model was right. The routing wasn't. You can't fix what you can't see.
03 · the product

From invisible API spend to operational data — in two lines of code.

server/openai.ts
npm install @trackllm/sdk

import { track } from '@trackllm/sdk'
import OpenAI from 'openai'

const openai = track(new OpenAI(), {
  projectKey: 'proj_xxxx',
  feature: 'chat',
  userId: req.user.id
})

// That's it. Every call you make through this client is now tracked — with model, tokens, cost, latency, user and feature attached.

cost · last 7 days
$519.82
chat · $412.30 · +22%
summarize · $89.40 · −4%
embed · $18.12 · +1%
feature · model · avg/req
chat · gpt-4o · $0.0091
summarize · gpt-4o-mini · $0.0011
embed · text-embedding-3-small · $0.0002
chat ↑ 22% vs last week
04 · how it works

Up and running before your next deploy.

step 01 · setup

Wrap your client

SDK wraps your existing OpenAI or Anthropic client. One import, one function call. No proxy, no architecture change, no new environment to manage.
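The "no proxy" claim rests on in-process interception. A minimal sketch of the idea, assuming a Proxy-based wrapper and a hypothetical `onEvent` callback — this is not the SDK's actual internals, just an illustration of how a wrapper can time calls without sitting in the request path:

```typescript
// Illustrative sketch only - not the @trackllm/sdk implementation.
// Wraps a client object so each method call is timed and reported
// to a callback, while the call itself still goes straight to the provider.
type TrackOptions = {
  projectKey: string
  feature?: string
  onEvent?: (e: { method: string; ms: number }) => void
}

function track<T extends object>(client: T, opts: TrackOptions): T {
  return new Proxy(client, {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver)
      if (typeof value !== 'function') return value
      return function (...args: unknown[]) {
        const start = Date.now()
        const result = value.apply(target, args)
        // Report after the call settles; the request itself is untouched.
        Promise.resolve(result).finally(() =>
          opts.onEvent?.({ method: String(prop), ms: Date.now() - start })
        )
        return result
      }
    },
  })
}
```

A shallow proxy like this only intercepts top-level methods; a real implementation would also wrap nested namespaces such as `chat.completions`.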

step 02 · setup

Tag your calls

Pass a feature name and user ID per call. Optional — but this is what turns raw telemetry into the breakdown that actually tells you something.
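A sketch of what per-call tagging could look like, with default tags set at wrap time and per-call values winning over them. The option names here are assumptions for illustration, not the SDK's documented API:

```typescript
// Illustrative tag-merging sketch: defaults come from track(),
// per-call tags override them; untagged calls still record.
type Tags = { feature?: string; userId?: string }

function mergeTags(defaults: Tags, perCall?: Tags): Tags {
  return { ...defaults, ...perCall }
}
```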

step 03 · result

See where the money goes

Every request logged: model, tokens in, tokens out, estimated cost, latency, user, feature. The dashboard builds the breakdown for you.
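The logged record and the cost estimate can be sketched as follows. The price table holds illustrative per-million-token rates, not live provider pricing, and the field names are assumptions:

```typescript
// Sketch of a logged request record and a token-based cost estimate.
// Prices are illustrative USD-per-1M-token rates, not real quotes.
interface RequestEvent {
  model: string
  tokensIn: number
  tokensOut: number
  estimatedCost: number // USD
  latencyMs: number
  userId?: string
  feature?: string
}

const PRICES: Record<string, { in: number; out: number }> = {
  'gpt-4o': { in: 2.5, out: 10 },
  'gpt-4o-mini': { in: 0.15, out: 0.6 },
}

function estimateCost(model: string, tokensIn: number, tokensOut: number): number {
  const p = PRICES[model]
  if (!p) return 0 // unknown model: no estimate rather than a guess
  return (tokensIn * p.in + tokensOut * p.out) / 1_000_000
}
```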

05 · what you get

Everything you've been building in spreadsheets.

Cost by feature

See which features drive spend. Not the whole bill — the specific feature. Built a new summarizer last week? See exactly what it cost.

chat · $406
summarize · $198
embed · $62
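The per-feature totals above are a straightforward roll-up of per-request costs; a sketch, assuming a simplified event shape:

```typescript
// Sketch: aggregate per-request costs into per-feature totals,
// the shape the dashboard chart is built from.
function costByFeature(events: { feature: string; cost: number }[]): Map<string, number> {
  const totals = new Map<string, number>()
  for (const e of events) {
    totals.set(e.feature, (totals.get(e.feature) ?? 0) + e.cost)
  }
  return totals
}
```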

Cost by model

gpt-4o at $0.0091/request. gpt-4o-mini at $0.0011. The same feature, two models. Now you have the data to make the switch.

model · avg/req · monthly
gpt-4o · $0.0091 · $412.30
gpt-4o-mini · $0.0011 · $48.20
claude-haiku · $0.0008 · $31.10
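The arithmetic behind "make the switch" is a run-rate calculation; a sketch using avg-per-request figures like those in the table above:

```typescript
// Sketch: projected monthly saving from routing a feature to a
// cheaper model, given current monthly spend and avg cost per request.
function monthlySaving(
  monthlySpend: number,
  avgCostCurrent: number,
  avgCostCheaper: number
): number {
  const requests = monthlySpend / avgCostCurrent // implied volume
  return monthlySpend - requests * avgCostCheaper
}
```

With the gpt-4o figures above ($412.30/month at $0.0091/request), moving that volume to a $0.0011/request model saves roughly $362/month.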

Cost by user

Rank users by what they cost to serve. Find outliers, detect abuse, decide if your pricing model actually covers your per-user AI spend.

usr_8a4f… · outlier · $48.12
usr_2c91… · $12.04
usr_5b7d… · $9.88

Spend alerts

Set a daily or monthly threshold. Get an email or Slack message when you cross it — not when you open the invoice.

⚠ alert · daily · 14:22
Spend crossed $50.00 threshold · +38% vs 7d avg.
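An alert like the one above combines an absolute threshold with a jump versus the trailing average; a sketch (the 30% spike default is illustrative, not a documented setting):

```typescript
// Sketch: fire when today's spend crosses an absolute threshold
// OR jumps more than spikePct above the trailing-7-day average.
function shouldAlert(
  todaySpend: number,
  last7Days: number[],
  threshold: number,
  spikePct = 30
): boolean {
  const avg = last7Days.reduce((a, b) => a + b, 0) / last7Days.length
  const pctVsAvg = ((todaySpend - avg) / avg) * 100
  return todaySpend > threshold || pctVsAvg > spikePct
}
```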

Request explorer

Every API call, searchable and filterable. Sort by cost. Filter by user. Click into a request to see tokens, latency, finish reason.

POST /chat · gpt-4o · $0.0192 · 1.4s
POST /chat · gpt-4o · $0.0088 · 0.9s
POST /sum · 4o-mini · $0.0011 · 0.4s
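Sort-by-cost and filter-by-user are plain list operations over the logged records; a sketch with an assumed record shape:

```typescript
// Sketch: the explorer's "sort by cost, filter by user" view.
type Req = { user: string; cost: number; latencyMs: number }

function topByCost(reqs: Req[], user?: string, limit = 10): Req[] {
  return reqs
    .filter(r => user === undefined || r.user === user) // filter copies, so sort below is safe
    .sort((a, b) => b.cost - a.cost) // most expensive first
    .slice(0, limit)
}
```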

Weekly digest

Every Monday: last week's cost, biggest mover, projected month-end spend. The data you used to build manually — delivered automatically.

mon · 09:00
Last week: $519.82
Biggest mover: chat ↑22%
Projected month: $2,140
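The projected month-end figure can be sketched as a naive linear run-rate; real projections may weight recent days more heavily, so treat this as an illustration only:

```typescript
// Sketch: project month-end spend from month-to-date run rate.
function projectMonthEnd(
  spendToDate: number,
  dayOfMonth: number,
  daysInMonth: number
): number {
  return (spendToDate / dayOfMonth) * daysInMonth
}
```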
06 · pricing

Start free. Pay when it's clearly worth it.

Free

no card
$0/mo

For solo builders and early prototypes

  • Up to 20k requests/month
  • 1 project
  • Basic cost breakdown (model-level)
  • Basic threshold alerts
  • 7 days of request history
  • Email notifications only
start for free — no card required
recommended

Growth

$49/mo

For teams running AI features in production

  • Up to 150k requests/month included
  • 3 projects
  • Cost by user + feature breakdown
  • Advanced alerting (per-user and per-feature thresholds)
  • Multi-channel alerts (Email, Telegram)
  • Weekly digest reports
  • Request explorer (limited depth)
  • Revenue tracking per user
  • 60 days of request history
  • Team access (5 seats)
start growth

Scale

$99/mo

For high-usage teams needing deep insights

  • Up to 1M requests/month included
  • Unlimited projects
  • Full alerts system (user, feature, model-level thresholds and cost spikes)
  • Email, Telegram and Slack alerts
  • Full request explorer
  • Session & prompt logging (coming soon)
  • Revenue tracking per user
  • 180 days of request history
  • Unlimited seats
  • CSV export
  • Priority support
start scale

Both paid plans include a weekly digest email; every plan includes the SDK for all major providers and requires no changes to your application architecture.

07 · the why

Built from real pain.

We interviewed 40+ engineering teams about their AI cost management. 38 of them were doing it manually — spreadsheets, console logs, or not at all.

The 2 who weren't had built their own internal tooling. That tool is what this is.

hn · sep 2024
We used GPT-5 for phone number extraction for six months before we noticed.
1,693 upvotes · 412 comments · "this is me"

Two lines of code. Know where your AI money goes.

The first event usually arrives in under a minute. No proxy. No architecture change. No monthly invoice surprise.

terminal
npm install @trackllm/sdk
08 · faq

Real questions, real answers.

Most questions fall into four buckets: latency, privacy, scope, and how this differs from observability tools.

Anything missing? founders@trackllm.dev

Does the SDK sit between my app and the model provider?

No. The SDK wraps your existing OpenAI/Anthropic/Gemini client in-process. Your requests go directly from your server to the model provider; we never sit in your request path. If TrackLLM goes down, your app keeps shipping tokens.