Billing Related

Usage Query

Query API usage

View Usage

Console View

Log in to api.smai.ai to view the following in the console:

  • Account balance
  • Today's/This month's consumption
  • Usage of each model
  • Consumption details

Usage in API Response

Each API call response includes Token usage:

{
  "id": "chatcmpl-xxx",
  "choices": [...],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 50,
    "total_tokens": 150
  }
}

Usage Statistics

Python Example

from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.smai.ai/v1"
)

# Record usage
total_input_tokens = 0
total_output_tokens = 0

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Accumulate Tokens
total_input_tokens += response.usage.prompt_tokens
total_output_tokens += response.usage.completion_tokens

print(f"This input: {response.usage.prompt_tokens} tokens")
print(f"This output: {response.usage.completion_tokens} tokens")
print(f"Total input: {total_input_tokens} tokens")
print(f"Total output: {total_output_tokens} tokens")

JavaScript Example

import OpenAI from 'openai';

const client = new OpenAI({
    apiKey: 'sk-your-api-key',
    baseURL: 'https://api.smai.ai/v1'
});

let totalInputTokens = 0;
let totalOutputTokens = 0;

const response = await client.chat.completions.create({
    model: 'gpt-4.1',
    messages: [{ role: 'user', content: 'Hello!' }]
});

totalInputTokens += response.usage.prompt_tokens;
totalOutputTokens += response.usage.completion_tokens;

console.log(`This request input: ${response.usage.prompt_tokens} tokens`);
console.log(`This request output: ${response.usage.completion_tokens} tokens`);
console.log(`Total input: ${totalInputTokens} tokens`);
console.log(`Total output: ${totalOutputTokens} tokens`);

Token Estimation

Estimate the number of Tokens before sending a request:

Using tiktoken (Python)

import tiktoken

def count_tokens(text, model="gpt-4"):
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

text = "Hello, how are you?"
tokens = count_tokens(text)
print(f"Estimated Token count: {tokens}")

Empirical Estimation

  • English: about 4 characters ≈ 1 Token
  • Chinese: about 1-2 characters ≈ 1 Token
  • Code: varies greatly; actual testing is recommended (see the rough estimator below)
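
For a quick check without a tokenizer library, these rules of thumb can be folded into a simple estimator. This is a sketch only; real counts depend on the model's tokenizer:

def rough_token_estimate(text):
    """Apply the heuristics above: ~1 Token per CJK character,
    ~1 Token per 4 characters of other text. Rough upper bound."""
    cjk = sum(1 for ch in text if '\u4e00' <= ch <= '\u9fff')
    other = len(text) - cjk
    return cjk + (other + 3) // 4  # round up

print(rough_token_estimate("Hello, how are you?"))  # prints 5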

Usage in Streaming Responses

Note

In streaming responses, usage information is usually returned only in the final chunk, or not at all. For accurate statistics, use non-streaming requests or track Token counts at the application layer.
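
If the gateway supports the OpenAI API's stream_options parameter (assumed here; verify against smai.ai's actual behavior), usage can be requested as an extra final chunk in the stream. A sketch:

stream = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    # Asks for a final usage chunk; support depends on the gateway
    stream_options={"include_usage": True},
)

for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")
    if chunk.usage:  # typically only set on the final chunk, if at all
        print(f"\nTotal: {chunk.usage.total_tokens} tokens")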

Set Usage Alerts

It is recommended to configure the following in the console (a simple application-side guard is sketched after the list):

  • Low balance alerts
  • Daily consumption limit alerts
  • Abnormal consumption alerts
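
Console alerts are passive notifications; an application-side guard can additionally stop requests once a budget is exceeded. A minimal sketch with a hypothetical per-session cap:

SESSION_TOKEN_BUDGET = 100_000  # hypothetical cap, tune to your needs

def check_budget(used_tokens):
    """Refuse to send another request once the session cap is hit."""
    if used_tokens >= SESSION_TOKEN_BUDGET:
        raise RuntimeError("Session token budget exhausted")

check_budget(total_input_tokens + total_output_tokens)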
