@tokenring-ai/file-index
The @tokenring-ai/file-index package provides file indexing and search capabilities for AI agents, enabling efficient codebase exploration and retrieval through intelligent search algorithms.
Overview
The @tokenring-ai/file-index package provides file indexing and search capabilities for TokenRing AI agents. It
enables agents to index project files and perform efficient searches across codebases using full-text matching with
token overlap scoring (BM25-like) and intelligent result merging.
The package implements a provider architecture that allows for different storage backends, with an in-memory ephemeral provider currently available. It integrates seamlessly with TokenRing agents through tools, chat commands, and state management.
Key Features
- Hybrid Search: Combines full-text matching with token overlap scoring (BM25-like) and intelligent result merging
- Line-Based Chunking: Simple line-based chunking with ~1000 character limit per chunk for efficient processing
- Provider Architecture: Extensible system supporting different storage backends through the FileIndexProvider abstract class
- Agent Integration: Seamless integration with TokenRing AI agents through tools and chat commands
- Result Merging: Advanced search algorithm that merges adjacent/nearby chunks for better context coverage
- Chat Commands: Built-in command interface for managing providers and performing searches
- Tool Integration: Exported tools for hybrid search functionality
- State Management: Agent-specific state persistence for active provider selection
- Lazy Initialization: Non-blocking initialization with background file processing
- File Change Detection: Automatic handling of file additions, modifications, and deletions
- Batch Processing: 10 parallel tasks for file indexing
- Polling Interval: 250ms for file change detection
Core Components/API
FileIndexProvider
Abstract base class defining the interface for all file indexing providers. All providers must implement this interface.
export interface SearchResult {
path: string;
chunk_index: number;
content: string;
relevance?: number;
distance?: number;
}
export default abstract class FileIndexProvider {
// Core search methods
abstract search(query: string, limit?: number): Promise<SearchResult[]>;
abstract fullTextSearch(query: string, limit?: number): Promise<SearchResult[]>;
// Lifecycle methods
abstract waitReady(): Promise<void>;
abstract processFile(filePath: string): Promise<void>;
abstract onFileChanged(type: string, filePath: string): void;
abstract close(): Promise<void>;
// Current file context
abstract setCurrentFile(filePath: string): void;
abstract clearCurrentFile(): void;
abstract getCurrentFile(): string | null;
}
EphemeralFileIndexProvider
In-memory implementation providing fast, non-persistent file indexing. This is the only currently available provider implementation.
import EphemeralFileIndexProvider from '@tokenring-ai/file-index/EphemeralFileIndexProvider';
const provider = new EphemeralFileIndexProvider('/path/to/project');
await provider.start();
Key Features:
- In-memory storage using Map for file contents and chunks
- Queue-based batch processing (250ms interval, 10 parallel tasks)
- Case-insensitive full-text search with relevance scoring
- Lazy initialization with background file processing
- File change handling (unlinks remove from index, changes trigger re-indexing)
Constructor:
| Parameter | Type | Default | Description |
|---|---|---|---|
baseDirectory | string | process.cwd() | Base directory for file operations |
Methods:
| Method | Parameters | Returns | Description |
|---|---|---|---|
start() | - | Promise<void> | Start lazy initialization and file processing |
waitReady() | - | Promise<void> | Wait for initialization to complete |
processFile() | filePath: string | Promise<void> | Process and index a single file |
onFileChanged() | type: string, filePath: string | void | Handle file system changes |
fullTextSearch() | query: string, limit: number | Promise<SearchResult[]> | Full-text search with relevance scoring |
search() | query: string, limit: number | Promise<SearchResult[]> | Search (delegates to fullTextSearch) |
setCurrentFile() | filePath: string | void | Set current file context |
clearCurrentFile() | - | void | Clear current file context |
getCurrentFile() | - | string | null | Get current file context |
close() | - | Promise<void> | Cleanup and clear all data |
FileIndexService
Registry service that manages multiple providers and allows dynamic switching between implementations. Implements the TokenRingService interface.
import FileIndexService from '@tokenring-ai/file-index/FileIndexService';
import { FileIndexServiceConfigSchema } from '@tokenring-ai/file-index/schema';
import { z } from 'zod';
const config: z.input<typeof FileIndexServiceConfigSchema> = {
agentDefaults: {
provider: 'ephemeral'
}
};
const service = new FileIndexService(config);
Service Properties:
| Property | Type | Description |
|---|---|---|
name | string | Service name: "FileIndexService" |
description | string | Service description: "Provides FileIndex functionality" |
options | object | Service configuration from schema |
Key Methods:
| Method | Parameters | Returns | Description |
|---|---|---|---|
registerFileIndexProvider | name: string, provider: FileIndexProvider | void | Register a new provider via KeyedRegistry |
getAvailableFileIndexProviders | - | string[] | Get list of registered provider names |
setActiveProvider | name: string, agent: Agent | void | Set active provider for an agent session |
requireActiveProvider | agent: Agent | FileIndexProvider | Get current provider or throw if none set |
search | query: string, limit: number, agent: Agent | Promise<SearchResult[]> | Execute search using active provider |
fullTextSearch | query: string, limit: number, agent: Agent | Promise<SearchResult[]> | Full-text search via active provider |
waitReady | agent: Agent | Promise<void> | Wait for provider initialization |
close | agent: Agent | Promise<void> | Close and cleanup provider |
attach | agent: Agent | void | Attach service to agent and initialize state |
StringSearchFileIndexService
Alternative service implementation that wraps EphemeralFileIndexProvider for direct usage with TokenRingApp.
import StringSearchFileIndexService from '@tokenring-ai/file-index/StringSearchFileIndexService';
const service = new StringSearchFileIndexService(app, '/path/to/project');
await service.run();
const results = await service.search('query', 10, agent);
Key Features:
- Wrapper around EphemeralFileIndexProvider
- Implements TokenRingService interface
- Integration with TokenRingApp
- Agent-aware initialization and waiting
Constructor:
| Parameter | Type | Description |
|---|---|---|
app | TokenRingApp | Application instance |
baseDirectory | string | Optional base directory for file operations |
Methods:
| Method | Parameters | Returns | Description |
|---|---|---|---|
run() | - | Promise<void> | Start the provider |
waitReady() | agent: Agent | Promise<void> | Wait for provider initialization |
fullTextSearch() | query: string, limit: number, agent: Agent | Promise<SearchResult[]> | Full-text search |
search() | query: string, limit: number, agent: Agent | Promise<SearchResult[]> | Search |
close() | - | Promise<void> | Close provider |
Services
The package provides the following TokenRingService implementations:
FileIndexService
The main service for managing file index providers and performing searches.
Service Registration:
import FileIndexService from '@tokenring-ai/file-index/FileIndexService';
import {FileIndexServiceConfigSchema} from '@tokenring-ai/file-index/schema';
const config = {
agentDefaults: {
provider: 'ephemeral'
}
};
const fileIndexService = new FileIndexService(config);
app.addServices(fileIndexService);
Provider Registration:
import EphemeralFileIndexProvider from '@tokenring-ai/file-index/EphemeralFileIndexProvider';
fileIndexService.registerFileIndexProvider('ephemeral', new EphemeralFileIndexProvider());
Usage from Agent:
const fileIndexService = agent.requireServiceByType(FileIndexService);
// Set active provider
fileIndexService.setActiveProvider('ephemeral', agent);
// Wait for provider to be ready
await fileIndexService.waitReady(agent);
// Perform search
const results = await fileIndexService.search('query', 10, agent);
State Management
FileIndexState
Manages agent-specific state for file index including active provider selection. Extends AgentStateSlice.
import { FileIndexState } from '@tokenring-ai/file-index/state/FileIndexState';
const state = agent.getState(FileIndexState);
console.log(state.activeProvider); // Current active provider
State Properties:
| Property | Type | Description |
|---|---|---|
activeProvider | string | null | Name of currently active provider |
initialConfig | object | Agent defaults configuration from service |
State Methods:
| Method | Parameters | Returns | Description |
|---|---|---|---|
transferStateFromParent | parent: Agent | void | Inherit active provider from parent agent |
reset | - | void | Reset state to initial config provider |
serialize | - | { activeProvider: string | null } | Return serializable state object |
deserialize | data: { activeProvider: string | null } | void | Restore state from object |
show | - | string | Display state information as string |
Serialization Schema:
const serializationSchema = z.object({
activeProvider: z.string().nullable()
});
RPC Endpoints
This package does not define RPC endpoints. It provides functionality through:
- Chat tools (hybridSearchFileIndex)
- Chat commands (/fileindex)
- Direct service method calls
Chat Commands
The package provides the following chat commands, automatically registered when the plugin is installed:
/fileindex provider get
Display the currently active file index provider.
Syntax:
/fileindex provider get
Example:
/fileindex provider get
Response:
Active provider: ephemeral
Description: Shows the name of the currently active file index provider. Returns "none" if no provider is set.
/fileindex provider set <providerName>
Set a specific file index provider by name.
Syntax:
/fileindex provider set <providerName>
Example:
/fileindex provider set ephemeral
Response (success):
Active provider set to: ephemeral
Response (error):
Provider "invalid" not found. Available providers: ephemeral
Parameters:
providerName(required): The name of the provider to activate
Description: Sets the active file index provider for the current agent session. The provider must be registered with the service.
/fileindex provider reset
Reset to the default provider from agent configuration.
Syntax:
/fileindex provider reset
Example:
/fileindex provider reset
Response:
Default provider: ephemeral
Description: Resets the active provider to the default configured in the agent's initial configuration.
/fileindex provider select
Interactively select an active file index provider from available options.
Syntax:
/fileindex provider select
Example:
/fileindex provider select
Response:
Active provider set to: ephemeral
Behavior:
- Shows interactive tree-select menu with available providers
- Displays "(current)" marker for currently active provider
- Auto-selects sole available provider if only one is configured
- Returns "No file index providers are registered." if no providers exist
- Returns "Provider selection cancelled." if user cancels
/fileindex search <query>
Search for text across indexed files.
Syntax:
/fileindex search <query>
Example:
/fileindex search function getUser
Response:
Found 3 result(s):
📄 /path/to/file.ts:
...matching content...
📄 /path/to/another.ts:
...matching content...
Parameters:
query(required): The search query string
Description: Performs a full-text search across all indexed files and returns up to 10 matching results with file paths and content. Waits for the provider to be ready before searching.
Response (no results):
No results found.
Configuration
Plugin Configuration Schema
The plugin accepts configuration through the fileIndex property:
import {FileIndexServiceConfigSchema} from '@tokenring-ai/file-index/schema';
import { z } from 'zod';
export const FileIndexAgentConfigSchema = z
.object({
provider: z.string().optional(),
})
.default({});
export const FileIndexProviderConfigSchema = z.object({
type: z.enum(["ephemeral"]),
});
export const FileIndexServiceConfigSchema = z.object({
agentDefaults: z
.object({
provider: z.string(),
})
.default({provider: "ephemeral"}),
});
Schema Properties:
| Property | Type | Required | Description |
|---|---|---|---|
agentDefaults | object | Yes | Default agent configuration |
agentDefaults.provider | string | Yes | Default provider name for agents |
type | "ephemeral" | Yes | Provider type (currently only "ephemeral" is supported) |
Plugin Configuration Example
const config = {
fileIndex: {
agentDefaults: {
provider: 'ephemeral'
}
}
};
Service Configuration
When creating the service directly:
import FileIndexService from '@tokenring-ai/file-index/FileIndexService';
const config = {
agentDefaults: {
provider: 'ephemeral'
}
};
const service = new FileIndexService(config);
Agent Configuration
Agents can override the default provider through agent configuration slices. The agent's configuration is merged with
the service's agentDefaults using deep merge:
// Agent configuration slice (optional)
const agentConfig = {
provider: 'ephemeral' // Override default provider
};
// This will be merged with service's agentDefaults
Integration
Plugin Registration
The package follows the standard TokenRing plugin pattern. The plugin automatically:
- Creates and registers the
FileIndexService - Registers the
ephemeralprovider - Registers tools with
ChatService - Registers chat commands with
AgentCommandService
import {AgentCommandService} from '@tokenring-ai/agent';
import type {TokenRingPlugin} from '@tokenring-ai/app';
import {ChatService} from '@tokenring-ai/chat';
import { z } from 'zod';
import agentCommands from './commands';
import EphemeralFileIndexProvider from './EphemeralFileIndexProvider';
import FileIndexService from './FileIndexService';
import packageJSON from './package.json' with {type: 'json'};
import {FileIndexServiceConfigSchema} from './schema';
import tools from './tools';
const packageConfigSchema = z.object({
fileIndex: FileIndexServiceConfigSchema.prefault({}),
});
export default {
name: packageJSON.name,
displayName: "File Indexing",
version: packageJSON.version,
description: packageJSON.description,
install(app, config) {
const fileIndexService = new FileIndexService(config.fileIndex);
app.addServices(fileIndexService);
fileIndexService.registerFileIndexProvider(
"ephemeral",
new EphemeralFileIndexProvider(),
);
app.waitForService(ChatService, (chatService) =>
chatService.addTools(tools),
);
app.waitForService(AgentCommandService, (agentCommandService) =>
agentCommandService.addAgentCommands(agentCommands),
);
},
config: packageConfigSchema,
} satisfies TokenRingPlugin<typeof packageConfigSchema>;
Manual Service Registration
For direct service registration without the plugin:
import FileIndexService from '@tokenring-ai/file-index/FileIndexService';
import EphemeralFileIndexProvider from '@tokenring-ai/file-index/EphemeralFileIndexProvider';
const config = {
agentDefaults: {
provider: 'ephemeral'
}
};
const fileIndexService = new FileIndexService(config);
app.addServices(fileIndexService);
// Register providers manually
fileIndexService.registerFileIndexProvider('ephemeral', new EphemeralFileIndexProvider());
// Register tools and commands manually
chatService.addTools(tools);
agentCommandService.addAgentCommands(agentCommands);
Agent Integration
The service integrates with agents through:
- State Management:
FileIndexStateis attached to each agent during service attachment - Tools:
hybridSearchFileIndextool is available to agents - Commands:
/fileindexcommands are available in agent chat - Service Access: Agents can access the service via
agent.requireServiceByType(FileIndexService)
// Access service from agent
const fileIndexService = agent.requireServiceByType(FileIndexService);
// Access state from agent
const state = agent.getState(FileIndexState);
console.log(state.activeProvider);
Tools
The package exports the following tools for agent interaction:
hybridSearchFileIndex
Advanced hybrid search tool combining full-text search with token overlap scoring (BM25-like) and intelligent result merging.
Tool Definition:
const hybridSearchFileIndex = {
name: 'file-index_hybridSearchFileIndex',
displayName: 'FileIndex/hybridSearchFileIndex',
description: 'Hybrid full-text+keyword search with merging/deduplication.',
inputSchema: z.object({
query: z.string().describe('Text or code query: keyword and full-text matches combined.'),
topK: z.number().int().default(10).describe('Number of top merged results to return (default 10)'),
textWeight: z.number().default(0.3).describe('Weight (0-1) for token overlap score (default 0.3)'),
fullTextWeight: z.number().default(0.3).describe('Weight (0-1) for full-text search score (default 0.3)'),
mergeRadius: z.number().int().default(1).describe('How close (in chunk indices) hits must be to merge (default: 1)')
}),
execute: async (params, agent: Agent): Promise<TokenRingToolJSONResult<HybridSearchResult[]>>
};
Input Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
query | string | - | Text or code query to search for |
topK | number | 10 | Number of top merged results to return |
textWeight | number | 0.3 | Weight for token overlap score |
fullTextWeight | number | 0.3 | Weight for full-text search score |
mergeRadius | number | 1 | Maximum gap between chunks to merge |
Returns: TokenRingToolJSONResult<HybridSearchResult[]>
HybridSearchResult Interface:
interface HybridSearchResult {
path: string; // Full path to the file (absolute path)
start: number; // Starting chunk index in the file
end: number; // Ending chunk index in the file
hybridScore: number; // Combined relevance score
content: string; // Merged content of all chunks in the range
}
Search Algorithm:
- Parallel Search: Executes both search (
fileIndex.search) and full-text search (fileIndex.fullTextSearch) in parallel withtopK * 4results each - Token Overlap (BM25-like): Computes token overlap score by counting query tokens found in each chunk
const queryTokens = query.toLowerCase().split(/\W+/).filter(Boolean);const textTokens = text.toLowerCase().split(/\W+/).filter(Boolean);const overlap = textTokens.filter((t) => queryTokens.includes(t)).length;const textScore = overlap / (queryTokens.length || 1);
- Score Combination: Normalizes and combines scores using weighted formula:
embScore = 1 - distance(converts embedding distance to similarity)normalizedFullText = fullTextScore / maxFullTextScorehybridScore = (1 - textWeight - fullTextWeight) * embScore + textWeight * textScore + fullTextWeight * normalizedFullText
- Chunk Merging: Groups results by file and merges adjacent/nearby chunks within
mergeRadius - Result Selection: Returns top K merged result blocks sorted by hybrid score
Example Usage:
import { hybridSearchFileIndex } from '@tokenring-ai/file-index/tools';
// Basic usage
const results = await hybridSearchFileIndex.execute(
{
query: 'user authentication',
topK: 5,
textWeight: 0.3,
fullTextWeight: 0.3,
mergeRadius: 1
},
agent
);
console.log(`Found ${results.data.length} merged regions`);
for (const result of results.data) {
console.log(`\n📄 ${result.path} (chunks ${result.start}-${result.end}):`);
console.log(` Score: ${result.hybridScore.toFixed(3)}`);
console.log(` Content:\n${result.content}`);
}
Tuning Parameters:
// Emphasis on token overlap (keyword matching)
const keywordFocused = await hybridSearchFileIndex.execute(
{ query, topK, textWeight: 0.5, fullTextWeight: 0.2, mergeRadius: 1 },
agent
);
// Emphasis on full-text matching
const fullTextFocused = await hybridSearchFileIndex.execute(
{ query, topK, textWeight: 0.2, fullTextWeight: 0.6, mergeRadius: 1 },
agent
);
// Aggressive merging for broader context
const broadContext = await hybridSearchFileIndex.execute(
{ query, topK, textWeight: 0.3, fullTextWeight: 0.3, mergeRadius: 2 },
agent
);
Note: The tool requires an active file index provider to be set. If no provider is active, it will throw an error.
Available Tools Export
Tools are exported from @tokenring-ai/file-index/tools:
import tools from '@tokenring-ai/file-index/tools';
// Currently exports:
// - hybridSearchFileIndex
// Note: searchFileIndex tool is commented out in the current implementation
// and not included in the exported tools.
Usage Examples
Plugin Installation
import {TokenRingApp} from '@tokenring-ai/app';
import fileIndexPlugin from '@tokenring-ai/file-index/plugin';
const app = new TokenRingApp();
// Install plugin with configuration
await app.install(fileIndexPlugin, {
fileIndex: {
agentDefaults: {
provider: 'ephemeral'
}
}
});
Direct Service Setup
import FileIndexService from '@tokenring-ai/file-index/FileIndexService';
import EphemeralFileIndexProvider from '@tokenring-ai/file-index/EphemeralFileIndexProvider';
import {FileIndexServiceConfigSchema} from '@tokenring-ai/file-index/schema';
import { z } from 'zod';
const config: z.input<typeof FileIndexServiceConfigSchema> = {
agentDefaults: {
provider: 'ephemeral'
}
};
const fileIndexService = new FileIndexService(config);
app.addServices(fileIndexService);
// Register provider manually
fileIndexService.registerFileIndexProvider('ephemeral', new EphemeralFileIndexProvider());
Using the Hybrid Search Tool
import { hybridSearchFileIndex } from '@tokenring-ai/file-index/tools';
// Perform hybrid search with tuned parameters
const results = await hybridSearchFileIndex.execute(
{
query: 'implement user authentication flow',
topK: 5, // Return 5 merged results
textWeight: 0.3, // 30% token overlap importance
fullTextWeight: 0.3, // 30% full-text matching importance
mergeRadius: 1 // Merge adjacent chunks within 1 index
},
agent
);
console.log(`Found ${results.data.length} merged regions:`);
for (const result of results.data) {
console.log(`\n📄 ${result.path} (chunks ${result.start}-${result.end}):`);
console.log(` Score: ${result.hybridScore.toFixed(3)}`);
console.log(` Content:\n${result.content}`);
}
Provider Management
import FileIndexService from '@tokenring-ai/file-index/FileIndexService';
const fileIndexService = agent.requireServiceByType(FileIndexService);
// Get available providers
const providers = fileIndexService.getAvailableFileIndexProviders();
console.log('Available providers:', providers);
// Set provider for specific agent session
fileIndexService.setActiveProvider('ephemeral', agent);
// Wait for provider to be ready
await fileIndexService.waitReady(agent);
// Search using active provider
const results = await fileIndexService.search('user authentication', 10, agent);
console.log(`Found ${results.length} results`);
Using StringSearchFileIndexService
import StringSearchFileIndexService from '@tokenring-ai/file-index/StringSearchFileIndexService';
import TokenRingApp from '@tokenring-ai/app';
const app = new TokenRingApp();
const service = new StringSearchFileIndexService(app, '/path/to/project');
// Start the service
await service.run();
// Wait for readiness
await service.waitReady(agent);
// Perform search
const results = await service.search('query', 10, agent);
console.log(`Found ${results.length} results`);
// Close when done
await service.close();
Using Chat Commands
// Commands are automatically registered when plugin is installed
// Usage in agent chat:
// Search across indexed files
/fileindex search function getUser
// Get current provider
/fileindex provider get
// Response: Active provider: ephemeral
// Set specific provider
/fileindex provider set ephemeral
// Response: Active provider set to: ephemeral
// Reset to default provider
/fileindex provider reset
// Response: Default provider: ephemeral
// Interactively select provider
/fileindex provider select
// Shows interactive tree-select menu with available providers
Custom Provider Implementation
import FileIndexProvider, {SearchResult} from '@tokenring-ai/file-index/FileIndexProvider';
import fs from 'fs-extra';
class CustomFileIndexProvider extends FileIndexProvider {
private indexedFiles = new Map<string, string>();
private isReady = false;
async waitReady(): Promise<void> {
if (!this.isReady) {
await this.loadAllFiles();
this.isReady = true;
}
}
async search(query: string, limit?: number): Promise<SearchResult[]> {
await this.waitReady();
const results: SearchResult[] = [];
const lowerQuery = query.toLowerCase();
for (const [filePath, content] of this.indexedFiles.entries()) {
const matches = content.matchAll(new RegExp(lowerQuery, 'gi'));
for (const match of matches) {
if (results.length >= (limit || 10)) break;
results.push({
path: filePath,
chunk_index: 0,
content: match[0],
relevance: match[0].length
});
}
}
return results.slice(0, limit).sort((a, b) => (b.relevance || 0) - (a.relevance || 0));
}
async fullTextSearch(query: string, limit?: number): Promise<SearchResult[]> {
return this.search(query, limit);
}
async processFile(filePath: string): Promise<void> {
const content = await fs.readFile(filePath, 'utf8');
this.indexedFiles.set(filePath, content);
}
onFileChanged(type: string, filePath: string): void {
if (type === 'unlink') {
this.indexedFiles.delete(filePath);
} else {
this.processFile(filePath);
}
}
setCurrentFile(filePath: string): void {
// Implement file context tracking
}
clearCurrentFile(): void {
// Reset current file context
}
getCurrentFile(): string | null {
return null;
}
async close(): Promise<void> {
this.indexedFiles.clear();
this.isReady = false;
}
private async loadAllFiles(): Promise<void> {
// Load and index files from your custom storage
}
}
// Register with service
fileIndexService.registerFileIndexProvider('custom', new CustomFileIndexProvider());
Agent State Persistence
import {FileIndexState} from '@tokenring-ai/file-index/state/FileIndexState';
// State is automatically managed per agent
const fileIndexService = agent.requireServiceByType(FileIndexService);
// State persists across agent sessions via serialization
const state = agent.getState(FileIndexState);
// Check current provider
const activeProvider = state.activeProvider;
// Serialize state for storage
const serialized = state.serialize();
// { activeProvider: 'ephemeral' }
// Restore state from storage
state.deserialize(serialized);
// Display state information
const stateInfo = state.show();
console.log(stateInfo);
// "Active FileIndex Provider: ephemeral"
Best Practices
Search Weight Tuning
The hybrid search tool allows tuning of search weights for different use cases. See the Tools section for detailed parameter tuning examples.
Provider Selection
- Use
ephemeralprovider for development and testing due to its fast initialization - Consider implementing persistent providers (e.g., database-backed) for production use
- Let agents choose providers via
/fileindex provider selectcommand for flexibility
Chunk Size Considerations
The default chunk size is ~1000 characters with line-based splitting. For custom chunking strategies:
- Implement a custom
FileIndexProviderwith your preferred chunking logic - Smaller chunks (500-700): Better for precise keyword matching
- Larger chunks (1200-1500): Better for broader context understanding
Performance Considerations
- Batch Processing: 10 parallel tasks for file indexing
- Polling Interval: 250ms for file change detection
- Lazy Initialization: Non-blocking startup with background processing
- Memory Usage: The ephemeral provider stores all indexed content in memory
Query Optimization
- Use specific queries for better results
- The hybrid search combines multiple strategies, so natural language queries work well
- For keyword-heavy searches, increase
textWeightparameter
Testing and Development
Building
bun run build
Linting
bun run eslint
Testing
bun run test
Test Watching
bun run test:watch
Test Coverage
bun run test:coverage
Test Example
import { describe, it, expect, beforeEach } from 'vitest';
import FileIndexService from '@tokenring-ai/file-index/FileIndexService';
import EphemeralFileIndexProvider from '@tokenring-ai/file-index/EphemeralFileIndexProvider';
describe('FileIndexService', () => {
let service: FileIndexService;
let testAgent: Agent;
beforeEach(async () => {
const config = {
agentDefaults: { provider: 'ephemeral' }
};
service = new FileIndexService(config);
testAgent = createTestAgent();
service.attach(testAgent);
});
it('should search files', async () => {
await service.waitReady(testAgent);
const results = await service.search('test', 5, testAgent);
expect(results).toBeInstanceOf(Array);
});
it('should manage providers', () => {
service.registerFileIndexProvider('test', new EphemeralFileIndexProvider());
const providers = service.getAvailableFileIndexProviders();
expect(providers).toContain('ephemeral');
expect(providers).toContain('test');
});
});
Package Structure
pkg/file-index/
├── index.ts # Main entry point and exports (schemas, classes)
├── package.json # Package metadata and dependencies
├── plugin.ts # Plugin definition for app installation
├── schema.ts # Zod schemas for configuration
├── FileIndexProvider.ts # Abstract provider interface and SearchResult type
├── FileIndexService.ts # Service registry for providers (TokenRingService)
├── StringSearchFileIndexService.ts # Alternative service using EphemeralFileIndexProvider
├── EphemeralFileIndexProvider.ts # In-memory provider implementation
├── commands.ts # Exports all chat commands
│ └── commands/
│ └── fileindex/
│ ├── search.ts # /fileindex search command
│ └── provider/
│ ├── get.ts # /fileindex provider get command
│ ├── set.ts # /fileindex provider set command
│ ├── reset.ts # /fileindex provider reset command
│ └── select.ts # /fileindex provider select command
├── tools.ts # Exports all tools (currently only hybridSearchFileIndex)
│ └── tools/
│ ├── hybridSearchFileIndex.ts # Hybrid search tool implementation
│ └── searchFileIndex.ts # Semantic search tool (commented out, not exported)
├── state/
│ └── FileIndexState.ts # Agent state slice for file index
├── util/
│ ├── ComputeChunkLineStarts.ts # Compute line starts for chunks (utility)
│ ├── chunker.ts # Line-based chunking utility
│ └── sha256.ts # SHA256 hash utility
├── symbols/
│ └── symbolExtractor.ts # Symbol extraction using tree-sitter (utility)
└── vitest.config.ts # Test configuration
Key Files:
| File | Purpose |
|---|---|
index.ts | Main entry point exporting schemas and class types |
plugin.ts | TokenRing plugin for installation and service registration |
schema.ts | Zod schemas for service and agent configuration |
FileIndexProvider.ts | Abstract base class for all providers |
FileIndexService.ts | Main service managing providers and agent integration |
EphemeralFileIndexProvider.ts | In-memory provider implementation |
StringSearchFileIndexService.ts | Alternative service wrapper |
commands.ts | Aggregates all chat command definitions |
tools.ts | Aggregates all tool definitions |
state/FileIndexState.ts | Agent state management |
Dependencies
Runtime Dependencies
All Token Ring packages are workspace dependencies (version 0.2.0):
| Package | Version | Purpose |
|---|---|---|
@tokenring-ai/app | 0.2.0 | Base application framework and plugin system |
@tokenring-ai/chat | 0.2.0 | Chat and tool system |
@tokenring-ai/agent | 0.2.0 | Agent orchestration and state management |
@tokenring-ai/filesystem | 0.2.0 | File system operations |
@tokenring-ai/utility | 0.2.0 | Shared utility functions |
zod | ^4.3.6 | Schema validation |
fs-extra | ^11.3.4 | File system operations |
commander | ^14.0.3 | Command-line interface |
glob-gitignore | ^1.0.15 | Gitignore-style pattern matching |
gpt-tokenizer | ^3.4.0 | Token counting for chunking |
mysql2 | ^3.20.0 | MySQL client |
sentencex | ^1.0.17 | Sentence segmentation |
sqlite-vec | 0.1.8-alpha.1 | Vector database |
tree-sitter | ^0.25.0 | Syntax parsing |
tree-sitter-javascript | ^0.25.0 | JavaScript grammar |
chokidar | ^5.0.0 | File watcher (listed but not currently used) |
Note: The package lists chokidar in dependencies but it is not currently used in the implementation. The
EphemeralFileIndexProvider uses a simple polling mechanism (setTimeout) for file change detection.
Development Dependencies
| Package | Version | Purpose |
|---|---|---|
vitest | ^4.1.1 | Unit testing framework |
typescript | ^6.0.2 | TypeScript compiler |
@types/fs-extra | ^11.0.4 | Type definitions for fs-extra |
Error Handling
Common Errors
No Active Provider:
Error: No file index provider has been enabled.
Solution: Set an active provider before searching:
fileIndexService.setActiveProvider('ephemeral', agent);
Command Failed Errors:
- Empty query:
No results found. - Provider not found:
Provider "name" not found. Available providers: ...
Error Types
CommandFailedError: Thrown when command parameters are invalid- Generic
Error: Thrown when no active provider is set
Limitations and Considerations
- Memory Usage: The ephemeral provider stores all indexed files in memory, which may be unsuitable for very large codebases (5000+ files)
- Search Methods: The default
search()method delegates tofullTextSearch(), so both achieve similar results from the current implementation - Storage Backend: Currently only ephemeral in-memory provider is implemented. Database and vector providers are planned for future versions
- File Types: Focuses on text files. Binary files are silently skipped during indexing
- Search Dimensions: Currently provides full-text and hybrid scoring. True semantic search requires embedding model integration
- Indexing Performance: Large codebases may experience initial indexing lag due to lazy loading and batch processing
- Chunk Size: Fixed at ~1000 characters with simple line-based splitting. Variable chunking is possible with custom providers
- Result Merging: Merge behavior is controlled by
mergeRadiusparameter. Larger values increase context but reduce precision - Provider Switching: Provider selection is session-specific. Changing provider affects only current agent session
- Updates: File modifications are only indexed after processing queue settles (250ms delay)
Related Components
- @tokenring-ai/agent: Agent orchestration and state management
- @tokenring-ai/chat: Chat and tool system
- @tokenring-ai/app: Base application framework
- @tokenring-ai/filesystem: File system operations
- @tokenring-ai/utility: Shared utility functions
License
MIT License - see LICENSE file for details.