@tokenring-ai/ai-client
Multi-provider AI integration client for the Token Ring ecosystem. Provides unified access to various AI models through a consistent interface, supporting chat, embeddings, image generation, video generation, reranking, speech synthesis, and transcription capabilities.
Overview
The AI Client package acts as a unified interface to multiple AI providers, abstracting away provider-specific differences while maintaining full access to provider capabilities. It integrates with the Token Ring agent system through seven model registry services that manage model specifications and provide client instances. Models are automatically discovered and registered from each provider, with background processes checking availability and updating status.
Key Features
- 15+ AI Providers: Anthropic, OpenAI, Google, Groq, Cerebras, DeepSeek, ElevenLabs, Fal, xAI, OpenRouter, Perplexity, Azure, Ollama, Llama, and more
- Seven AI Capabilities: Chat, Embeddings, Image Generation, Video Generation, Reranking, Speech, and Transcription
- Seven Model Registries: Dedicated service registries for managing model specifications and capabilities
- Dynamic Model Registration: Register custom models with availability checks
- Model Status Tracking: Monitor model online, cold, and offline status
- Auto-Configuration: Automatic provider setup from environment variables
- JSON-RPC API: Remote procedure call endpoints for programmatic access via plugin registration
- Streaming Support: Real-time streaming responses with delta handling for text and reasoning
- Agent Integration: Seamless integration with Token Ring agent system through services
- Feature System: Rich feature specification system supporting boolean, number, string, enum, and array types with validation
- Cost Tracking: Automatic cost calculation and metrics integration
- Model Requirements: Query models by capabilities (context length, reasoning, intelligence, speed, etc.)
Installation
bun add @tokenring-ai/ai-client
Providers
The package supports the following AI providers through dedicated integrations:
| Provider | SDK/Model Support | Key Features |
|---|---|---|
| Anthropic | Claude models | Reasoning, analysis, web search, context caching, image input, file input |
| OpenAI | GPT models, Whisper, TTS, Image Generation | Reasoning, multimodal, real-time audio, image generation, web search |
| Google | Gemini, Imagen | Thinking, multimodal, image generation, web search, video input, audio input |
| Groq | LLaMA-based models | High-speed inference, Llama, Qwen, Kimi models |
| Cerebras | LLaMA-based models | High performance, Llama, Qwen, GLM models |
| DeepSeek | DeepSeek models | Reasoning capabilities, chat and reasoner |
| ElevenLabs | Speech synthesis and transcription | Multilingual voice generation, speaker diarization |
| Fal | Image generation | Fast image generation, Flux models |
| xAI | xAI models | Reasoning and analysis, image generation, video generation |
| OpenRouter | Aggregated access | Multiple provider access, dynamic model discovery |
| Perplexity | Perplexity models | Web search integration, deep research |
| Azure | Azure OpenAI | Enterprise deployment |
| Ollama | Self-hosted models | Local inference, chat and embedding models |
| Llama | Meta Llama models | Local/remote inference via llama.com |
| OpenAI Compatible | Any OpenAI-compatible API | Flexible provider configuration |
Additional providers can be configured using the openaiCompatible provider for OpenAI-compatible APIs.
Core Components
Model Registries
The package provides seven model registry services, each implementing the TokenRingService interface:
- ChatModelRegistry: Manages chat model specifications
- ImageGenerationModelRegistry: Manages image generation model specifications
- VideoGenerationModelRegistry: Manages video generation model specifications
- EmbeddingModelRegistry: Manages embedding model specifications
- SpeechModelRegistry: Manages speech synthesis model specifications
- TranscriptionModelRegistry: Manages speech-to-text transcription model specifications
- RerankingModelRegistry: Manages document reranking model specifications
Client Classes
- AIChatClient: Chat completion and structured output generation
- AIEmbeddingClient: Text vectorization and embeddings
- AIImageGenerationClient: Image generation from text prompts
- AIVideoGenerationClient: Video generation from text or images
- AISpeechClient: Text-to-speech synthesis
- AITranscriptionClient: Audio-to-text transcription
- AIRerankingClient: Document relevance ranking
Utilities
- modelSettings: Parses and serializes model names with feature settings
- resequenceMessages: Resequences chat messages to maintain proper alternation
Services
The package registers seven service registries for different AI capabilities. Each registry implements the TokenRingService interface and provides methods for managing model specifications and retrieving clients.
ChatModelRegistry
Manages chat model specifications and provides access to chat completion capabilities.
Methods:
- registerAllModelSpecs(specs): Register multiple chat model specifications
- getModelSpecsByRequirements(requirements): Get models matching specific requirements
- getModelsByProvider(): Get all registered models grouped by provider
- getAllModelsWithOnlineStatus(): Get all models with their online status
- getClient(name): Get a client instance matching the model name
- getCheapestModelByRequirements(requirements, estimatedContextLength): Find the cheapest model matching requirements
Model Requirements:
- nameLike: Filter models by name pattern
- contextLength: Maximum context length in tokens
- maxCompletionTokens: Maximum output tokens
- webSearch: Web search capability
- image: Image input capability
- video: Video input capability
- audio: Audio input capability
- file: File input capability
- tools: Tool use capability
- structuredOutput: Structured output capability
Model Specification:
Each model specification includes:
- modelId: Unique identifier for the model
- providerDisplayName: Display name of the provider
- impl: Model implementation interface
- costPerMillionInputTokens: Cost per million input tokens
- costPerMillionOutputTokens: Cost per million output tokens
- costPerMillionCachedInputTokens: Cost per million cached input tokens (optional)
- costPerMillionReasoningTokens: Cost per million reasoning tokens (optional)
- maxContextLength: Maximum context length in tokens
- isAvailable(): Async function to check model availability
- isHot(): Async function to check if model is warmed up
- mangleRequest(): Optional function to modify the request before sending
- settings: Optional feature specifications for query parameters
- inputCapabilities: Input capability specifications
Example:
chatRegistry.registerAllModelSpecs([
{
modelId: "custom-model",
providerDisplayName: "CustomProvider",
impl: customProvider("custom-model"),
costPerMillionInputTokens: 5,
costPerMillionOutputTokens: 15,
maxContextLength: 100000,
async isAvailable() {
return true;
}
}
]);
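To make the requirement matching and cheapest-model lookup more concrete, here is a minimal, self-contained sketch. It does not use the package's real types or implementation: `ChatSpecSketch`, `RequirementsSketch`, `matchesRequirements`, and `cheapestMatch` are hypothetical names, `nameLike` is treated as a plain substring, `contextLength` is assumed to mean "at least this many tokens", and cost is assumed to be ranked by the price of sending `estimatedContextLength` input tokens.

```typescript
// Hypothetical, simplified stand-ins for the registry's types; not the package's real definitions.
interface ChatSpecSketch {
  modelId: string;
  costPerMillionInputTokens: number;
  costPerMillionOutputTokens: number;
  maxContextLength: number;
  webSearch?: boolean;
}

interface RequirementsSketch {
  nameLike?: string;      // treated as a plain substring here; the real matching rules may differ
  contextLength?: number; // assumed to mean "the model must offer at least this many tokens"
  webSearch?: boolean;
}

// Returns true when the spec satisfies every requirement that was actually provided.
function matchesRequirements(spec: ChatSpecSketch, req: RequirementsSketch): boolean {
  if (req.nameLike !== undefined && !spec.modelId.includes(req.nameLike)) return false;
  if (req.contextLength !== undefined && spec.maxContextLength < req.contextLength) return false;
  if (req.webSearch === true && spec.webSearch !== true) return false;
  return true;
}

// Assumed heuristic: rank candidates by the cost of sending `estimatedContextLength` input tokens.
function cheapestMatch(
  specs: ChatSpecSketch[],
  req: RequirementsSketch,
  estimatedContextLength: number,
): ChatSpecSketch | undefined {
  let best: ChatSpecSketch | undefined;
  let bestCost = Infinity;
  for (const spec of specs) {
    if (!matchesRequirements(spec, req)) continue;
    if (spec.maxContextLength < estimatedContextLength) continue;
    const cost = (estimatedContextLength * spec.costPerMillionInputTokens) / 1_000_000;
    if (cost < bestCost) {
      best = spec;
      bestCost = cost;
    }
  }
  return best;
}

// Made-up model entries, used only to exercise the sketch.
const sampleSpecs: ChatSpecSketch[] = [
  { modelId: "small-fast", costPerMillionInputTokens: 1, costPerMillionOutputTokens: 2, maxContextLength: 8000 },
  { modelId: "large-search", costPerMillionInputTokens: 5, costPerMillionOutputTokens: 15, maxContextLength: 100000, webSearch: true },
  { modelId: "large-plain", costPerMillionInputTokens: 3, costPerMillionOutputTokens: 9, maxContextLength: 100000 },
];

const cheapestLarge = cheapestMatch(sampleSpecs, { contextLength: 50000 }, 20000);
console.log(cheapestLarge?.modelId); // "large-plain"
```

The real registry may weigh output tokens, cached tokens, or availability as well; treat this only as an illustration of the filter-then-rank idea.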
ImageGenerationModelRegistry
Manages image generation model specifications.
Methods:
- registerAllModelSpecs(specs): Register image generation model specifications
- getModelSpecsByRequirements(requirements): Get models matching specific requirements
- getModelsByProvider(): Get all registered models grouped by provider
- getAllModelsWithOnlineStatus(): Get all models with their online status
- getClient(name): Get a client instance matching the model name
Model Specification:
- modelId: Unique identifier for the model
- providerDisplayName: Display name of the provider
- impl: Image model implementation
- calculateImageCost(request, result): Function to calculate image generation cost
- providerOptions: Provider-specific options
- isAvailable(): Async function to check model availability
VideoGenerationModelRegistry
Manages video generation model specifications.
Methods:
- registerAllModelSpecs(specs): Register video generation model specifications
- getModelSpecsByRequirements(requirements): Get models matching specific requirements
- getModelsByProvider(): Get all registered models grouped by provider
- getAllModelsWithOnlineStatus(): Get all models with their online status
- getClient(name): Get a client instance matching the model name
Model Specification:
- modelId: Unique identifier for the model
- providerDisplayName: Display name of the provider
- impl: Video model implementation
- calculateVideoCost(request, result): Function to calculate video generation cost
- providerOptions: Provider-specific options
- inputCapabilities: Input capability specifications
EmbeddingModelRegistry
Manages embedding model specifications for text vectorization.
Methods:
- registerAllModelSpecs(specs): Register embedding model specifications
- getModelSpecsByRequirements(requirements): Get models matching specific requirements
- getModelsByProvider(): Get all registered models grouped by provider
- getAllModelsWithOnlineStatus(): Get all models with their online status
- getClient(name): Get a client instance matching the model name
Model Specification:
- modelId: Unique identifier for the model
- providerDisplayName: Display name of the provider
- impl: Embedding model implementation
- contextLength: Maximum context length
- costPerMillionInputTokens: Cost per million input tokens
- isAvailable(): Async function to check model availability
SpeechModelRegistry
Manages speech synthesis model specifications.
Methods:
- registerAllModelSpecs(specs): Register speech model specifications
- getModelSpecsByRequirements(requirements): Get models matching specific requirements
- getModelsByProvider(): Get all registered models grouped by provider
- getAllModelsWithOnlineStatus(): Get all models with their online status
- getClient(name): Get a client instance matching the model name
Model Specification:
- modelId: Unique identifier for the model
- providerDisplayName: Display name of the provider
- impl: Speech model implementation
- costPerMillionCharacters: Cost per million characters
- providerOptions: Provider-specific options
- settings: Feature specifications
TranscriptionModelRegistry
Manages speech-to-text transcription model specifications.
Methods:
- registerAllModelSpecs(specs): Register transcription model specifications
- getModelSpecsByRequirements(requirements): Get models matching specific requirements
- getModelsByProvider(): Get all registered models grouped by provider
- getAllModelsWithOnlineStatus(): Get all models with their online status
- getClient(name): Get a client instance matching the model name
Model Specification:
- modelId: Unique identifier for the model
- providerDisplayName: Display name of the provider
- impl: Transcription model implementation
- costPerMinute: Cost per minute of audio
- providerOptions: Provider-specific options
- settings: Feature specifications
RerankingModelRegistry
Manages document reranking model specifications.
Methods:
- registerAllModelSpecs(specs): Register reranking model specifications
- getModelSpecsByRequirements(requirements): Get models matching specific requirements
- getModelsByProvider(): Get all registered models grouped by provider
- getAllModelsWithOnlineStatus(): Get all models with their online status
- getClient(name): Get a client instance matching the model name
Model Specification:
- modelId: Unique identifier for the model
- providerDisplayName: Display name of the provider
- impl: Reranking model implementation
- costPerMillionInputTokens: Cost per million input tokens (optional)
- isAvailable(): Async function to check model availability
Configuration
The AI Client can be configured through environment variables or explicit provider configuration.
Auto-Configuration
Enable automatic provider configuration using environment variables:
import TokenRingApp from "@tokenring-ai/app";
import aiClientPlugin from "@tokenring-ai/ai-client";
const app = new TokenRingApp();
app.addPlugin(aiClientPlugin, {
ai: {
autoConfigure: true // Auto-detect and configure providers from env vars
}
});
Environment Variables
# Anthropic
ANTHROPIC_API_KEY=sk-ant-...
# OpenAI
OPENAI_API_KEY=sk-...
# Google
GOOGLE_GENERATIVE_AI_API_KEY=AIza...
# Groq
GROQ_API_KEY=gsk_...
# ElevenLabs
ELEVENLABS_API_KEY=...
# xAI
XAI_API_KEY=...
# OpenRouter
OPENROUTER_API_KEY=...
# Perplexity
PERPLEXITY_API_KEY=...
# DeepSeek
DEEPSEEK_API_KEY=...
# Cerebras
CEREBRAS_API_KEY=...
# Qwen (DashScope)
DASHSCOPE_API_KEY=sk-...
# Meta API Service (llama.com)
META_LLAMA_API_KEY=sk-...
# zAI
ZAI_API_KEY=...
# Chutes
CHUTES_API_KEY=...
# NVIDIA NIM
NVIDIA_NIM_API_KEY=...
# llama.cpp
LLAMA_BASE_URL=http://127.0.0.1:11434/v1
LLAMA_API_KEY=...
# Azure
AZURE_API_ENDPOINT=https://...
AZURE_API_KEY=<key>
# Ollama
OLLAMA_BASE_URL=http://127.0.0.1:11434/v1
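As a rough illustration of how auto-configuration could turn the variables above into a provider map, consider the sketch below. It is not the package's `autoConfig()` implementation: `ENV_TO_PROVIDER`, `DetectedProvider`, and `detectProviders` are hypothetical names, only a subset of the variables is covered, and the `"groq"` discriminator value is an assumption.

```typescript
// Subset of the environment variables listed above, mapped to assumed provider discriminators.
const ENV_TO_PROVIDER: Record<string, string> = {
  ANTHROPIC_API_KEY: "anthropic",
  OPENAI_API_KEY: "openai",
  GOOGLE_GENERATIVE_AI_API_KEY: "google",
  GROQ_API_KEY: "groq", // assumed value; check the package's schema for the real discriminator
};

// Hypothetical shape of one detected provider entry.
interface DetectedProvider {
  provider: string;
  apiKey: string;
}

// Builds a provider map from whichever API keys are present and non-empty.
// The environment is passed in explicitly so the function is easy to test.
function detectProviders(
  env: Record<string, string | undefined>,
): Record<string, DetectedProvider> {
  const detected: Record<string, DetectedProvider> = {};
  for (const name of Object.keys(ENV_TO_PROVIDER)) {
    const apiKey = env[name];
    const provider = ENV_TO_PROVIDER[name];
    if (apiKey !== undefined && apiKey !== "" && provider !== undefined) {
      detected[provider] = { provider, apiKey };
    }
  }
  return detected;
}

// Only OPENAI_API_KEY is set to a non-empty value, so only "openai" is detected.
const detectedExample = detectProviders({ OPENAI_API_KEY: "sk-test", GROQ_API_KEY: "" });
console.log(Object.keys(detectedExample)); // [ "openai" ]
```

In a real application you would pass `process.env` as the argument; providers that need a base URL instead of a key (such as Ollama) would need their own handling.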
Manual Configuration
app.addPlugin(aiClientPlugin, {
ai: {
autoConfigure: false,
providers: {
OpenAI: {
provider: "openai",
apiKey: "sk-..."
},
Anthropic: {
provider: "anthropic",
apiKey: "sk-ant-..."
},
Google: {
provider: "google",
apiKey: "AIza..."
}
}
}
});
Configuration Schema
The plugin configuration schema is:
{
ai: {
autoConfigure?: boolean;
providers?: Record<string, AIProviderConfig>;
}
}
Note: The provider field in AIProviderConfig is a discriminator that matches provider names like "anthropic", "openai", "google", and so on (lowercase).
Client Usage
You can create clients directly from the registries or use the JSON-RPC API for programmatic access.
Direct Client Creation
import TokenRingApp from "@tokenring-ai/app";
import aiClientPlugin from "@tokenring-ai/ai-client";
const app = new TokenRingApp();
app.addPlugin(aiClientPlugin, {
ai: {
autoConfigure: true
}
});
// Wait for services to be registered
await app.waitForService('ChatModelRegistry', async chatRegistry => {
// Get models matching requirements
const models = chatRegistry.getModelSpecsByRequirements({
nameLike: "gpt-5"
});
// Get all models with online status
const allModels = await chatRegistry.getAllModelsWithOnlineStatus();
// Get models by provider
const byProvider = await chatRegistry.getModelsByProvider();
// Get a client
const client = await chatRegistry.getClient("openai:gpt-5");
// Use the client
const [text, response] = await client.textChat(
{
messages: [
{ role: "user", content: "Hello" }
]
},
agent
);
console.log(text); // "Hi there!"
});
Using Model Registries
// Embedding models
await app.waitForService('EmbeddingModelRegistry', async embeddingRegistry => {
const client = await embeddingRegistry.getClient("openai:text-embedding-3-small");
const embeddings = await client.getEmbeddings({ input: ["your text here"] });
});
// Image generation
await app.waitForService('ImageGenerationModelRegistry', async imageRegistry => {
const client = await imageRegistry.getClient("openai:gpt-image-1-high");
const [image, result] = await client.generateImage({
prompt: "A beautiful sunset over the ocean",
size: "1024x1024",
n: 1
}, agent);
});
// Video generation
await app.waitForService('VideoGenerationModelRegistry', async videoRegistry => {
const client = await videoRegistry.getClient("video-model");
const [video, result] = await client.generateVideo({
prompt: "A beautiful sunset over the ocean",
aspectRatio: "16:9",
duration: 5
}, agent);
});
// Speech synthesis
await app.waitForService('SpeechModelRegistry', async speechRegistry => {
const client = await speechRegistry.getClient("openai:tts-1");
const [audio, result] = await client.generateSpeech({
text: "Hello, world!",
voice: "alloy",
speed: 1.0
}, agent);
});
// Transcription
await app.waitForService('TranscriptionModelRegistry', async transcriptionRegistry => {
const client = await transcriptionRegistry.getClient("openai:whisper-1");
const [text, result] = await client.transcribe({
audio: audioFile
}, agent);
});
Custom Model Registration
// Get registry instance
const chatRegistry = app.getService('ChatModelRegistry');
chatRegistry.registerAllModelSpecs([
{
modelId: "custom-model",
providerDisplayName: "CustomProvider",
impl: customProvider("custom-model"),
costPerMillionInputTokens: 5,
costPerMillionOutputTokens: 15,
maxContextLength: 100000,
async isAvailable() {
return true;
}
}
]);
Using Feature Queries
// Get model with specific configuration
const client = await chatRegistry.getClient("openai:gpt-5?websearch=1");
// Use the client
const [result, response] = await client.textChat(
{
messages: [
{ role: "user", content: "Search the web for the latest AI news" }
]
},
agent
);
Using Feature System
// Get model with multiple features
const client = await chatRegistry.getClient("openai:gpt-5?websearch=1&reasoningEffort=high");
// Set features on client instance
client.setSettings({
websearch: true,
reasoningEffort: "high"
});
// Get current features
const features = client.getSettings();
Client Methods
AIChatClient
The chat client provides methods for generating text and structured outputs.
Methods:
- textChat(request, agent): Send a chat completion request and return the full text response
- streamChat(request, agent): Stream a chat completion with real-time delta handling
- generateObject(request, agent): Send a chat completion request and return a structured object response
- rerank(request, agent): Rank documents by relevance to a query
- calculateCost(usage): Calculate the cost for a given usage object
- calculateTiming(elapsedMs, usage): Calculate timing information
- setSettings(settings): Set enabled settings on this client instance
- getSettings(): Get a copy of the enabled settings
- getModelId(): Get the model ID
- getModelSpec(): Get the model specification
Example:
const [text, response] = await client.textChat(
{
messages: [
{ role: "user", content: "Hello" }
]
},
agent
);
// Calculate cost
const cost = client.calculateCost({
inputTokens: 100,
outputTokens: 50
});
// Calculate timing
const timing = client.calculateTiming(1500, {
inputTokens: 100,
outputTokens: 50
});
// Rerank documents
const rankings = await client.rerank({
query: "What is machine learning?",
documents: [
"Machine learning is a subset of AI...",
"AI is a broad field...",
"Deep learning is a type of ML..."
],
topN: 3
}, agent);
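The per-million-token prices in a model specification suggest how a cost figure can be derived from a usage object. The following is a minimal sketch of that arithmetic, not the actual `calculateCost()` implementation: `PricingSketch`, `UsageSketch`, and `estimateCost` are hypothetical names, and it assumes cached input tokens are a subset of `inputTokens` and fall back to the regular input rate when no cached rate is given.

```typescript
// Hypothetical pricing fields, mirroring the cost fields of the chat model specification.
interface PricingSketch {
  costPerMillionInputTokens: number;
  costPerMillionOutputTokens: number;
  costPerMillionCachedInputTokens?: number;
}

// Hypothetical usage shape; cachedInputTokens is assumed to be counted inside inputTokens.
interface UsageSketch {
  inputTokens: number;
  outputTokens: number;
  cachedInputTokens?: number;
}

function estimateCost(usage: UsageSketch, pricing: PricingSketch): number {
  // Never treat more tokens as cached than were sent in total.
  const cached = Math.min(usage.cachedInputTokens ?? 0, usage.inputTokens);
  const uncached = usage.inputTokens - cached;
  const cachedRate =
    pricing.costPerMillionCachedInputTokens ?? pricing.costPerMillionInputTokens;
  // Sum in "millionths" first, then divide once to limit floating-point drift.
  const millionths =
    uncached * pricing.costPerMillionInputTokens +
    cached * cachedRate +
    usage.outputTokens * pricing.costPerMillionOutputTokens;
  return millionths / 1_000_000;
}

// 100 * 5 + 50 * 15 = 1250 millionths, i.e. about 0.00125 currency units.
const exampleCost = estimateCost(
  { inputTokens: 100, outputTokens: 50 },
  { costPerMillionInputTokens: 5, costPerMillionOutputTokens: 15 },
);
console.log(exampleCost.toFixed(5)); // "0.00125"
```

Reasoning tokens are left out on purpose, because how they relate to `outputTokens` depends on the provider.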
AIEmbeddingClient
The embedding client generates vector embeddings for text.
Methods:
- getEmbeddings({ input }): Generate embeddings for an array of input strings
- setSettings(settings): Set enabled settings on this client instance
- getSettings(): Get a copy of the enabled settings
- getModelId(): Get the model ID
Example:
const embeddings = await client.getEmbeddings({
input: [
"Hello world",
"Machine learning is great"
]
});
AIImageGenerationClient
The image generation client creates images from text prompts.
Methods:
- generateImage(request, agent): Generate an image based on a prompt
- setSettings(settings): Set enabled settings on this client instance
- getSettings(): Get a copy of the enabled settings
- getModelId(): Get the model ID
Example:
const [image, result] = await client.generateImage({
prompt: "A beautiful sunset over the ocean",
size: "1024x1024",
n: 1
}, agent);
AIVideoGenerationClient
The video generation client creates videos from text prompts or images.
Methods:
- generateVideo(request, agent): Generate a video based on a prompt
- setSettings(settings): Set enabled settings on this client instance
- getSettings(): Get a copy of the enabled settings
- getModelId(): Get the model ID
Example:
const [video, result] = await client.generateVideo({
prompt: "A beautiful sunset over the ocean",
aspectRatio: "16:9",
duration: 5
}, agent);
AISpeechClient
The speech client synthesizes speech from text.
Methods:
- generateSpeech(request, agent): Generate speech from text
- setSettings(settings): Set enabled settings on this client instance
- getSettings(): Get a copy of the enabled settings
- getModelSpec(): Get the model specification
Example:
const [audio, result] = await client.generateSpeech({
text: "Hello, world!",
voice: "alloy",
speed: 1.0
}, agent);
AITranscriptionClient
The transcription client transcribes audio to text.
Methods:
- transcribe(request, agent): Transcribe audio to text
- setSettings(settings): Set enabled settings on this client instance
- getSettings(): Get a copy of the enabled settings
- getModelSpec(): Get the model specification
Example:
const [text, result] = await client.transcribe({
audio: audioFile,
language: "en"
}, agent);
AIRerankingClient
The reranking client ranks documents by relevance to a query.
Methods:
- rerank({ query, documents, topN }): Rank documents by relevance
- setSettings(settings): Set enabled settings on this client instance
- getSettings(): Get a copy of the enabled settings
- getModelId(): Get the model ID
Example:
const result = await client.rerank({
query: "What is machine learning?",
documents: [
"Machine learning is a subset of AI...",
"AI is a broad field...",
"Deep learning is a type of ML..."
],
topN: 3
});
RPC Endpoints
The AI Client exposes JSON-RPC endpoints for programmatic access via the RPC service. The endpoint is registered under the path /rpc/ai-client.
Available Endpoints
| Method | Request Params | Response Params | Purpose |
|---|---|---|---|
| listChatModels | {} | { models: {...} } | Get all available chat models with their status |
| listChatModelsByProvider | {} | { modelsByProvider: {...} } | Get chat models grouped by provider |
| listEmbeddingModels | {} | { models: {...} } | Get all available embedding models |
| listEmbeddingModelsByProvider | {} | { modelsByProvider: {...} } | Get embedding models grouped by provider |
| listImageGenerationModels | {} | { models: {...} } | Get all available image generation models |
| listImageGenerationModelsByProvider | {} | { modelsByProvider: {...} } | Get image generation models grouped by provider |
| listSpeechModels | {} | { models: {...} } | Get all available speech models |
| listSpeechModelsByProvider | {} | { modelsByProvider: {...} } | Get speech models grouped by provider |
| listTranscriptionModels | {} | { models: {...} } | Get all available transcription models |
| listTranscriptionModelsByProvider | {} | { modelsByProvider: {...} } | Get transcription models grouped by provider |
| listRerankingModels | {} | { models: {...} } | Get all available reranking models |
| listRerankingModelsByProvider | {} | { modelsByProvider: {...} } | Get reranking models grouped by provider |
Response Structure:
{
models: {
"provider:model": {
status: "online" | "cold" | "offline",
available: boolean,
hot: boolean,
modelSpec: ModelSpec
}
}
}
ByProvider Response Structure:
{
modelsByProvider: {
"Provider Name": {
"provider:model": {
status: "online" | "cold" | "offline",
available: boolean,
hot: boolean
}
}
}
}
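A caller will often want to reduce these responses to a list of usable model names. The sketch below shows one way to do that against the response shapes above; the type and function names (`ModelStatusEntry`, `onlineModelNames`, `flattenByProvider`) are hypothetical, `modelSpec` is omitted for brevity, and the model names in the sample data are placeholders.

```typescript
type StatusName = "online" | "cold" | "offline";

// Mirrors the per-model entry of the response structures above (modelSpec omitted).
interface ModelStatusEntry {
  status: StatusName;
  available: boolean;
  hot: boolean;
}

interface ModelsResponse {
  models: Record<string, ModelStatusEntry>;
}

interface ModelsByProviderResponse {
  modelsByProvider: Record<string, Record<string, ModelStatusEntry>>;
}

// Names of all models that are ready for use right now, sorted for stable output.
function onlineModelNames(response: ModelsResponse): string[] {
  return Object.keys(response.models)
    .filter((name) => response.models[name]?.status === "online")
    .sort();
}

// Collapses the grouped response into the flat shape so both can share helpers.
function flattenByProvider(response: ModelsByProviderResponse): ModelsResponse {
  const models: Record<string, ModelStatusEntry> = {};
  for (const provider of Object.keys(response.modelsByProvider)) {
    const group = response.modelsByProvider[provider] ?? {};
    for (const name of Object.keys(group)) {
      const entry = group[name];
      if (entry !== undefined) models[name] = entry;
    }
  }
  return { models };
}

// Placeholder data shaped like a listChatModelsByProvider response.
const sampleGrouped: ModelsByProviderResponse = {
  modelsByProvider: {
    "Example Provider": {
      "example:model-b": { status: "online", available: true, hot: true },
      "example:model-a": { status: "cold", available: true, hot: false },
    },
    "Other Provider": {
      "other:model-c": { status: "online", available: true, hot: true },
    },
  },
};

console.log(onlineModelNames(flattenByProvider(sampleGrouped)));
// [ "example:model-b", "other:model-c" ]
```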
RPC Usage Example
import {RpcService} from "@tokenring-ai/rpc";
const rpcService = app.requireService(RpcService);
// List all chat models
const chatModels = await rpcService.call("listChatModels", {
agentId: "some-agent-id"
});
// Get models by provider
const modelsByProvider = await rpcService.call("listChatModelsByProvider", {
agentId: "some-agent-id"
});
Note: Video generation models are currently not exposed via RPC endpoints. They are available through the VideoGenerationModelRegistry service but not through the JSON-RPC interface.
Model Discovery
The package automatically discovers and registers available models from each provider:
- Plugin Installation: install() method runs during plugin installation and registers the seven service registries
- Provider Configuration: start() method runs after services are registered and registers providers based on configuration
- Auto-Configuration: If autoConfigure is true or providers is not set, autoConfig() is called to detect environment variables
- Provider Registration: Each provider's init() method is called with its configuration
- Model Registration: Providers add their available models to the appropriate registries
- Availability Checking: Background process checks isAvailable() to determine model status
Model Status
Models track their online status:
- online: Model is available and ready for use
- cold: Model is available but needs to be warmed up
- offline: Model is not available
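The three states above line up with the `available` and `hot` flags returned in the RPC responses. A plausible mapping, written as a minimal sketch, is shown below; `deriveStatus` is a hypothetical helper and the package may compute the status differently.

```typescript
type ModelStatusName = "online" | "cold" | "offline";

// Assumed mapping: unavailable models are offline; available models are
// online when warmed up (hot) and cold otherwise.
function deriveStatus(available: boolean, hot: boolean): ModelStatusName {
  if (!available) return "offline";
  return hot ? "online" : "cold";
}

console.log(deriveStatus(true, true));   // "online"
console.log(deriveStatus(true, false));  // "cold"
console.log(deriveStatus(false, false)); // "offline"
```

Under this mapping a model that reports `hot` but not `available` is still treated as offline, since availability is checked first.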
Availability Checking
Models check their availability in the background:
// All models are checked for availability shortly after startup
// This automatically fills the online status cache
getAllModelsWithOnlineStatus(): Promise<Record<string, ModelStatus<ChatModelSpec>>>
isAvailable(): Promise<boolean> // Implement in ModelSpec
isHot(): Promise<boolean> // Implement in ModelSpec
Note: The client classes (AIChatClient, AIEmbeddingClient, etc.) are not included in this package's exports. They are internal implementation details; obtain client instances through the model registries' getClient() method.
Feature System
The package supports a rich feature specification system that allows you to configure models dynamically without creating multiple client instances.
Feature Types
Features can be of the following types:
- boolean: Boolean values with optional default
- number: Numeric values with optional min/max constraints
- string: String values with optional default
- enum: Enumerated values with optional default
- array: Array values with optional default
Feature Specification
Each feature has the following properties:
- description: Human-readable description of the feature
- type: The type of the feature
- defaultValue: Default value (optional)
- min: Minimum value (for number types)
- max: Maximum value (for number types)
- values: Allowed values (for enum types)
Example Features
// Boolean feature
{
description: "Enables web search",
defaultValue: false,
type: "boolean"
}
// Number feature with constraints
{
description: "Maximum number of web searches",
defaultValue: 5,
type: "number",
min: 0,
max: 20
}
// Enum feature
{
description: "Reasoning effort level",
defaultValue: "medium",
type: "enum",
values: ["minimal", "low", "medium", "high"]
}
// Array feature
{
description: "Response modalities",
defaultValue: ["TEXT"],
type: "array"
}
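To show how specifications like these could drive validation, here is a minimal sketch that checks a candidate value against a feature spec. `FeatureSpecSketch` and `isValidFeatureValue` are hypothetical; the package's own validation may be stricter or accept other enum value types.

```typescript
// Hypothetical discriminated union covering the five feature types described above.
type FeatureSpecSketch =
  | { type: "boolean"; description: string; defaultValue?: boolean }
  | { type: "number"; description: string; defaultValue?: number; min?: number; max?: number }
  | { type: "string"; description: string; defaultValue?: string }
  | { type: "enum"; description: string; defaultValue?: string; values: string[] }
  | { type: "array"; description: string; defaultValue?: unknown[] };

// Returns true when `value` is acceptable for the given spec.
function isValidFeatureValue(spec: FeatureSpecSketch, value: unknown): boolean {
  switch (spec.type) {
    case "boolean":
      return typeof value === "boolean";
    case "number":
      return (
        typeof value === "number" &&
        !Number.isNaN(value) &&
        (spec.min === undefined || value >= spec.min) &&
        (spec.max === undefined || value <= spec.max)
      );
    case "string":
      return typeof value === "string";
    case "enum":
      return typeof value === "string" && spec.values.includes(value);
    case "array":
      return Array.isArray(value);
  }
}

// Specs copied from the examples above.
const maxSearchesSpec: FeatureSpecSketch = {
  description: "Maximum number of web searches",
  defaultValue: 5,
  type: "number",
  min: 0,
  max: 20,
};
const effortSpec: FeatureSpecSketch = {
  description: "Reasoning effort level",
  defaultValue: "medium",
  type: "enum",
  values: ["minimal", "low", "medium", "high"],
};

console.log(isValidFeatureValue(maxSearchesSpec, 21)); // false (above max)
console.log(isValidFeatureValue(effortSpec, "high"));  // true
```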
Provider-Specific Features
Different providers support different features:
OpenAI:
- websearch: Enable web search tool
- reasoningEffort: Reasoning effort level (minimal, low, medium, high)
- reasoningSummary: Reasoning summary mode (auto, detailed)
- serviceTier: Service tier (auto, flex, priority, default)
- textVerbosity: Text verbosity (low, medium, high)
- strictJsonSchema: Use strict JSON schema validation
- promptCacheRetention: Prompt cache retention policy (in_memory, 24h)
Anthropic:
- caching: Enable context caching
- websearch: Enable web search tool
- maxSearchUses: Maximum number of web searches (0 to disable, max 20)
Google:
- responseModalities: Response modalities (TEXT, IMAGE)
- thinkingBudget: Thinking token budget (for Gemini 2.5)
- thinkingLevel: Thinking depth (for Gemini 3)
- includeThoughts: Include thought summaries
Perplexity:
- websearch: Enable web search (default: true)
- searchContextSize: Search context size (low, medium, high)
xAI:
- websearch: Enable web search
- webImageUnderstanding: Enable image understanding in web search
- XSearch: Enable X search
- XFromDate: From date for X search
- XToDate: To date for X search
- XAllowedHandles: Allowed handles for X search
- XImageUnderstanding: Enable image understanding in X search
- XVideoUnderstanding: Enable video understanding in X search
OpenRouter:
- websearch: Enable web search plugin
- searchEngine: Search engine (native, exa)
- maxResults: Maximum number of search results
- searchContextSize: Search context size for native search
- frequencyPenalty: Frequency penalty
- maxTokens: Max tokens
- minP: Min P sampling
- presencePenalty: Presence penalty
- repetitionPenalty: Repetition penalty
- temperature: Temperature
- topK: Top K sampling
- topP: Top P sampling
- includeReasoning: Include reasoning
- reasoning: Reasoning mode
OpenAI Compatible:
- temperature: Sampling temperature (0-2)
- top_p: Nucleus sampling (0-1)
- frequency_penalty: Frequency penalty (-2 to 2)
- presence_penalty: Presence penalty (-2 to 2)
- seed: Random seed for reproducible output
- top_k: Top K sampling (if supported)
- min_p: Min P sampling (if supported)
- repetition_penalty: Repetition penalty (if supported)
- length_penalty: Length penalty (if supported)
- min_tokens: Minimum tokens to generate (if supported)
- enable_thinking: Enable thinking mode (for VLLM)
Best Practices
- Auto-Configure: Use autoConfigure: true for convenience and automatic environment variable detection
- Check Availability: Always verify models are available using getAllModelsWithOnlineStatus() or getClient()
- Use Feature Queries: Leverage query parameters for flexible model selection without creating multiple clients
- Monitor Status: Check model status before expensive operations to avoid failed requests
- Reuse Clients: Create client instances once and reuse for multiple requests for better performance
- Select Appropriate Models: Choose models based on context length and cost requirements
- Custom Registrations: Add custom models when needed using registerAllModelSpecs()
- Use RPC for Remote Access: For programmatic access across processes, use the JSON-RPC endpoint (note: video generation models are not yet exposed via RPC)
- Set Settings: Use setSettings() on client instances to enable specific features without creating multiple clients
- Calculate Costs: Use calculateCost() to estimate expenses before making requests
- Use Cheapest Model: Use getCheapestModelByRequirements() to find the most cost-effective model for your needs
- Check Model Hot Status: Use isHot() to determine if a model needs to be warmed up
Testing
# Run all tests
bun run test
# Run tests in watch mode
bun run test:watch
# Run tests with coverage
bun run test:coverage
Dependencies
Runtime Dependencies
- @tokenring-ai/app: Base application framework with service and plugin system
- @tokenring-ai/agent: Agent framework for tool execution
- @tokenring-ai/rpc: RPC service for programmatic access
- @tokenring-ai/utility: Shared utilities and registry functionality
- @tokenring-ai/metrics: Metrics service for cost tracking
- ai: Vercel AI SDK for streaming and client functionality
- zod: Runtime schema validation
- axios: HTTP client for API requests
AI SDK Dependencies
- @ai-sdk/anthropic: Anthropic AI SDK for Claude models
- @ai-sdk/azure: Azure OpenAI SDK for Azure hosting
- @ai-sdk/cerebras: Cerebras AI SDK for LLaMA models
- @ai-sdk/deepseek: DeepSeek AI SDK for DeepSeek models
- @ai-sdk/elevenlabs: ElevenLabs SDK for speech synthesis
- @ai-sdk/fal: Fal AI SDK for image generation
- @ai-sdk/google: Google Generative AI SDK for Gemini models
- @ai-sdk/groq: Groq AI SDK for LLaMA inference
- @ai-sdk/openai: OpenAI AI SDK for GPT, Whisper, TTS models
- @ai-sdk/openai-compatible: OpenAI-compatible API SDK
- @ai-sdk/perplexity: Perplexity AI SDK for Perplexity models
- @ai-sdk/xai: xAI SDK for Grok models
- @openrouter/ai-sdk-provider: OpenRouter SDK for provider aggregation
- ollama-ai-provider-v2: Ollama SDK for local model hosting
Development Dependencies
- @vitest/coverage-v8: Code coverage
- typescript: TypeScript compiler
- vitest: Unit testing framework
Development
The package follows the Token Ring plugin pattern:
- Install Phase: Registers the seven model registry services and optionally registers the RPC endpoint
- Start Phase: Initializes providers and registers models through the provider initialization chain
The package exports the following from index.ts:
- Tool: Type from Vercel AI SDK
- UserModelMessage: Type from Vercel AI SDK
- chatTool: Tool creation function from Vercel AI SDK
- stepCountIs: Step counting function from Vercel AI SDK
The actual client classes (AIChatClient, AIEmbeddingClient, etc.) are internal implementation details and are accessed through the model registries.
Utility Functions
The package includes several utility functions in the util/ directory:
parseModelAndSettings
Parses a model name string that may include query parameters for feature settings.
Usage:
import {parseModelAndSettings} from "./util/modelSettings";
// Parse model with settings
const {base, settings} = parseModelAndSettings("openai:gpt-5?websearch=1&reasoningEffort=high");
// base: "openai:gpt-5"
// settings: Map { "websearch" => true, "reasoningEffort" => "high" }
serializeModel
Serializes a model name and settings map back into a model string with query parameters.
Usage:
import {serializeModel} from "./util/modelSettings";
const settings = new Map([["websearch", true], ["reasoningEffort", "high"]]);
const modelString = serializeModel("openai:gpt-5", settings);
// modelString: "openai:gpt-5?websearch=1&reasoningEffort=high"
coerceFeatureValue
Converts string feature values to appropriate types (boolean, number, or string).
Usage:
import {coerceFeatureValue} from "./util/modelSettings";
coerceFeatureValue("1"); // true
coerceFeatureValue("true"); // true
coerceFeatureValue("0"); // false
coerceFeatureValue("false"); // false
coerceFeatureValue("42"); // 42
coerceFeatureValue("medium"); // "medium"
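For readers who want to see how the three helpers fit together, the following self-contained sketch reproduces the documented behavior with plain string handling. It is an illustration, not the code in util/modelSettings: the `Sketch`-suffixed names are hypothetical, and treating a bare key (no `=`) as enabled is an assumption.

```typescript
type FeatureValue = string | number | boolean;

// "1"/"true" become true, "0"/"false" become false, numeric strings become numbers.
function coerceSketch(raw: string): FeatureValue {
  if (raw === "1" || raw === "true") return true;
  if (raw === "0" || raw === "false") return false;
  const n = Number(raw);
  if (raw.trim() !== "" && Number.isFinite(n)) return n;
  return raw;
}

// Splits "provider:model?key=value&..." into the base name and a settings map.
function parseSketch(model: string): { base: string; settings: Map<string, FeatureValue> } {
  const settings = new Map<string, FeatureValue>();
  const q = model.indexOf("?");
  if (q === -1) return { base: model, settings };
  for (const pair of model.slice(q + 1).split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const key = decodeURIComponent(eq === -1 ? pair : pair.slice(0, eq));
    // Assumption: a bare key such as "?websearch" means the feature is enabled.
    const raw = eq === -1 ? "1" : decodeURIComponent(pair.slice(eq + 1));
    settings.set(key, coerceSketch(raw));
  }
  return { base: model.slice(0, q), settings };
}

// Inverse of parseSketch; booleans are written as 1/0 to match the query style above.
function serializeSketch(base: string, settings: Map<string, FeatureValue>): string {
  const parts: string[] = [];
  settings.forEach((value, key) => {
    const raw = typeof value === "boolean" ? (value ? "1" : "0") : String(value);
    parts.push(`${encodeURIComponent(key)}=${encodeURIComponent(raw)}`);
  });
  return parts.length === 0 ? base : `${base}?${parts.join("&")}`;
}

const parsedSketch = parseSketch("openai:gpt-5?websearch=1&reasoningEffort=high");
console.log(parsedSketch.base); // "openai:gpt-5"
console.log(serializeSketch(parsedSketch.base, parsedSketch.settings));
// "openai:gpt-5?websearch=1&reasoningEffort=high"
```

One consequence of this coercion rule is that the numeric values 0 and 1 cannot be distinguished from booleans once they pass through a query string.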
resequenceMessages
Resequences chat messages to maintain a proper alternating user/assistant pattern. This is useful when preparing messages for chat models that require strict alternation.
Usage:
import {resequenceMessages} from "./util/resequenceMessages";
const request = {
messages: [
{ role: "user", content: "Hello" },
{ role: "user", content: "How are you?" }, // Consecutive user messages
{ role: "assistant", content: "I'm good" },
{ role: "user", content: "Thanks" }
],
tools: {}
};
resequenceMessages(request);
// Messages are combined and resequenced to maintain alternation
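One simple way to achieve this kind of alternation is to merge consecutive messages that share a role. The sketch below shows that idea only; `ChatMessageSketch` and `mergeConsecutive` are hypothetical, the real resequenceMessages may use a different strategy (and mutates the request it is given), and joining content with a blank line is an assumption.

```typescript
// Hypothetical, simplified message shape with string-only content.
interface ChatMessageSketch {
  role: "user" | "assistant";
  content: string;
}

// Returns a new array in which adjacent messages with the same role are merged.
function mergeConsecutive(messages: ChatMessageSketch[]): ChatMessageSketch[] {
  const out: ChatMessageSketch[] = [];
  for (const msg of messages) {
    const last = out[out.length - 1];
    if (last !== undefined && last.role === msg.role) {
      // Same speaker twice in a row: append to the previous message.
      last.content = `${last.content}\n\n${msg.content}`;
    } else {
      // Copy so the caller's objects are never modified.
      out.push({ role: msg.role, content: msg.content });
    }
  }
  return out;
}

const mergedMessages = mergeConsecutive([
  { role: "user", content: "Hello" },
  { role: "user", content: "How are you?" },
  { role: "assistant", content: "I'm good" },
  { role: "user", content: "Thanks" },
]);
console.log(mergedMessages.map((m) => m.role)); // [ "user", "assistant", "user" ]
```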
License
MIT License - see LICENSE file for details.