# Billing Related

## Usage Query

Query your API usage.

### View Usage

#### Console View

Log in to api.smai.ai to view the following in the console:
- Account balance
- Today's and this month's consumption
- Per-model usage
- Consumption details
#### Usage in API Response

Each API call's response includes token usage:

```json
{
  "id": "chatcmpl-xxx",
  "choices": [...],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 50,
    "total_tokens": 150
  }
}
```

### Usage Statistics
#### Python Example
```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.smai.ai/v1"
)

# Running totals for token accounting
total_input_tokens = 0
total_output_tokens = 0

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Accumulate tokens from this call's usage
total_input_tokens += response.usage.prompt_tokens
total_output_tokens += response.usage.completion_tokens

print(f"This call's input: {response.usage.prompt_tokens} tokens")
print(f"This call's output: {response.usage.completion_tokens} tokens")
print(f"Total input: {total_input_tokens} tokens")
print(f"Total output: {total_output_tokens} tokens")
```

#### JavaScript Example
```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-your-api-key',
  baseURL: 'https://api.smai.ai/v1'
});

// Running totals for token accounting
let totalInputTokens = 0;
let totalOutputTokens = 0;

const response = await client.chat.completions.create({
  model: 'gpt-4.1',
  messages: [{ role: 'user', content: 'Hello!' }]
});

// Accumulate tokens from this call's usage
totalInputTokens += response.usage.prompt_tokens;
totalOutputTokens += response.usage.completion_tokens;

console.log(`This call's input: ${response.usage.prompt_tokens} tokens`);
console.log(`This call's output: ${response.usage.completion_tokens} tokens`);
console.log(`Total input: ${totalInputTokens} tokens`);
console.log(`Total output: ${totalOutputTokens} tokens`);
```

### Token Estimation
Estimate the number of tokens before sending a request:
#### Using tiktoken (Python)

```python
import tiktoken

def count_tokens(text, model="gpt-4"):
    """Count tokens using the tokenizer for the given model."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

text = "Hello, how are you?"
tokens = count_tokens(text)
print(f"Estimated token count: {tokens}")
```

#### Empirical Estimation
- English: roughly 4 characters per token
- Chinese: roughly 1-2 characters per token
- Code: varies widely; measuring with the actual tokenizer is recommended
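These rules of thumb can be turned into a quick character-based estimator. The sketch below assumes the ratios above (about 4 English characters and about 1.5 Chinese characters per token); it is a rough heuristic, not tokenizer output, so prefer tiktoken when accuracy matters:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate from character counts (heuristic, not exact)."""
    # Count CJK Unified Ideographs separately from other characters
    cjk = sum(1 for ch in text if '\u4e00' <= ch <= '\u9fff')
    other = len(text) - cjk
    # ~1.5 Chinese characters per token, ~4 other characters per token
    return round(cjk / 1.5 + other / 4)

print(estimate_tokens("Hello, how are you?"))  # 19 chars / 4 -> 5
```

This is useful for pre-flight budget checks; the error grows for code and mixed-language text.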
### Usage in Streaming Responses

> **Note**: In streaming responses, usage information is usually returned only in the final chunk, or not returned at all. For precise accounting, use non-streaming requests or count tokens at the application layer.
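When the server does send usage for a stream, it typically arrives in a final chunk whose `choices` list is empty. (The OpenAI Chat Completions API exposes this via `stream_options={"include_usage": True}`; whether this gateway forwards that option is an assumption worth testing.) The loop below runs against simulated chunk dicts purely to show the extraction pattern:

```python
# Simulated chunks mirroring the shape of a stream whose final chunk
# carries usage; field names follow the OpenAI chunk format.
chunks = [
    {"choices": [{"delta": {"content": "Hel"}}], "usage": None},
    {"choices": [{"delta": {"content": "lo!"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 9,
                              "completion_tokens": 2,
                              "total_tokens": 11}},
]

text, usage = "", None
for chunk in chunks:
    # Accumulate streamed content deltas
    for choice in chunk["choices"]:
        text += choice["delta"].get("content", "")
    # Usage, if present, appears only once, in the final chunk
    if chunk["usage"] is not None:
        usage = chunk["usage"]

print(text)                   # Hello!
print(usage["total_tokens"])  # 11
```

With the real client, the same loop applies to the objects yielded when `stream=True`; always guard against `usage` staying `None` in case the server omits it.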
### Set Usage Alerts

It is recommended to configure the following in the console:
- Low balance alerts
- Daily consumption limit alerts
- Abnormal consumption alerts
