## Install

```shell
npm install @fluxgate/sdk @fluxgate/openai openai
```

`openai` (≥ 6.34.0) is a peer dependency — install it alongside the wrapper.
## One-time setup
Wrap your existing OpenAI client once. The wrapped client is a drop-in replacement for every method you already use.
```ts
// lib/openai.ts
import OpenAI from "openai";
import { FluxGate } from "@fluxgate/sdk";
import { createOpenAICostTracker } from "@fluxgate/openai";

const _client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
});

const fg = new FluxGate({
  apiKey: process.env.FLUXGATE_API_KEY!,
});

export const openai = createOpenAICostTracker(_client, fg);
```
Do not construct a new `OpenAI` or `FluxGate` instance per request. Module-level singletons prevent unnecessary connection overhead.
## Chat completions
```ts
import { openai } from "@/lib/openai";

const completion = await openai
  .withContext({
    feature: "chat-assistant",
    user: { id: session.user.id, email: session.user.email },
    sessionId: session.id,
  })
  .chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: userMessage },
    ],
  });

// All standard OpenAI fields are unchanged
const text = completion.choices[0].message.content;

// FluxGate telemetry is attached to every response
const { cost, trackingId, status } = completion.fluxGateCostTrackingResponse;
```
## Streaming

Streaming is tracked transparently — `fluxGateCostTrackingResponse` is available after the stream is fully consumed.
```ts
const stream = await openai
  .withContext({ feature: "streaming-chat" })
  .chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
    stream: true,
  });

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content ?? "";
  process.stdout.write(delta);
}

// Safe to read after the loop — data is captured once the stream closes
console.log(stream.fluxGateCostTrackingResponse);
```
## Embeddings

```ts
const embedding = await openai
  .withContext({ feature: "semantic-search" })
  .embeddings.create({
    model: "text-embedding-3-small",
    input: document,
  });

const vector = embedding.data[0].embedding;
```
## Responses API

```ts
const response = await openai
  .withContext({ feature: "reasoning-agent" })
  .responses.create({
    model: "gpt-4o",
    input: userQuery,
  });

console.log(response.output_text);
```
## Multiple feature contexts

Instantiate separate context handles for each feature — the underlying client and credentials are shared.

```ts
const chatClient = openai.withContext({ feature: "chat" });
const codeClient = openai.withContext({ feature: "code-review" });
const summaryClient = openai.withContext({ feature: "summarizer" });

// Each call is attributed to the correct feature in FluxGate
await chatClient.chat.completions.create({ ... });
await codeClient.chat.completions.create({ ... });
```
## Error handling

Errors are tracked automatically with `status: "ERROR"`. Do not swallow exceptions — let them propagate normally and FluxGate will record them.
```ts
try {
  const completion = await openai
    .withContext({ feature: "chat" })
    .chat.completions.create({ model: "gpt-4o", messages });
} catch (err) {
  // The failed event is already recorded in FluxGate
  throw err;
}
```
## `FluxGateCostTrackingResponse` shape

Every tracked response extends the provider's native type with one extra property:
```ts
interface FluxGateCostTrackingResponse {
  status:
    | "SUCCESS"
    | "ERROR"
    | "BLOCKED"
    | "MAX_TOKENS"
    | "CONTENT_FILTER"
    | "RECITATION"
    | "MALFORMED_REQUEST";
  cost: number | null; // USD
  trackingId: string | null;
  createdAt: string | null; // ISO 8601
  errorMessage?: string;
}
```
Tracked automatically: input tokens, output tokens, cached tokens, model name, latency (ms), stream duration, and finish reason.
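Since the telemetry object is a plain value, it is easy to fold into your own logging. A sketch of one way to do that — the interface is copied from above, while `summarizeTracking` is a hypothetical helper, not part of the SDK:

```ts
// Shape copied from the FluxGateCostTrackingResponse interface above.
type TrackingStatus =
  | "SUCCESS"
  | "ERROR"
  | "BLOCKED"
  | "MAX_TOKENS"
  | "CONTENT_FILTER"
  | "RECITATION"
  | "MALFORMED_REQUEST";

interface FluxGateCostTrackingResponse {
  status: TrackingStatus;
  cost: number | null; // USD
  trackingId: string | null;
  createdAt: string | null; // ISO 8601
  errorMessage?: string;
}

// Hypothetical logging helper: one line per tracked call.
function summarizeTracking(t: FluxGateCostTrackingResponse): string {
  const cost = t.cost !== null ? `$${t.cost.toFixed(6)}` : "n/a";
  if (t.status === "SUCCESS") return `ok ${cost} (${t.trackingId ?? "no-id"})`;
  return `${t.status} ${cost}${t.errorMessage ? `: ${t.errorMessage}` : ""}`;
}
```

In application code this would consume `completion.fluxGateCostTrackingResponse` directly, e.g. `logger.info(summarizeTracking(completion.fluxGateCostTrackingResponse))`.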
## Supported methods

| Method | Non-streaming | Streaming |
|---|---|---|
| `chat.completions.create` | ✅ | ✅ |
| `completions.create` (legacy) | ✅ | ✅ |
| `responses.create` | ✅ | ✅ |
| `embeddings.create` | ✅ | — |