File Index Plugin

File indexing and search functionality with semantic chunking and symbol extraction.

Overview

The @tokenring-ai/file-index package provides file indexing and search functionality for AI agents within the TokenRing AI ecosystem. It enables agents to index project files, chunk their contents semantically, and perform searches (full-text, semantic, or hybrid) to retrieve relevant code or text snippets.

Key Features

Semantic text chunking using sentence boundaries and token limits
Full-text search with relevance scoring
Hybrid search combining embeddings, keywords, and full-text
Symbol extraction for JavaScript/TypeScript using Tree-sitter
In-memory (ephemeral) indexing for quick setup
File watching and automatic re-indexing

Core Components

FileIndexProvider (Abstract Class)

Defines the core interface for file indexing providers.

Key Methods:

search(query, limit?): Semantic or hybrid search for relevant chunks
fullTextSearch(query, limit?): Keyword-based full-text search
processFile(filePath): Index a single file
onFileChanged(type, filePath): Handle file events
waitReady(): Await initialization
setCurrentFile(filePath) / clearCurrentFile(): Track active file context
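The method list above can be pictured as an abstract class along the following lines. This is only a sketch of the shape: the SearchResult fields, the FileEvent names (borrowed from chokidar's event names), the class name FileIndexProviderSketch, and the getCurrentFile accessor are assumptions made for illustration, not the package's actual declarations.

```typescript
// Hypothetical result shape; the real package may expose different fields.
interface SearchResult {
  path: string;
  content: string;
  score: number;
}

// Assumed event names, modeled on chokidar's "add" / "change" / "unlink".
type FileEvent = "add" | "change" | "unlink";

abstract class FileIndexProviderSketch {
  private currentFile: string | null = null;

  // Semantic or hybrid search for relevant chunks.
  abstract search(query: string, limit?: number): Promise<SearchResult[]>;
  // Keyword-based full-text search.
  abstract fullTextSearch(query: string, limit?: number): Promise<SearchResult[]>;
  // Index a single file.
  abstract processFile(filePath: string): Promise<void>;
  // React to a filesystem event for one file.
  abstract onFileChanged(type: FileEvent, filePath: string): Promise<void>;
  // Resolve once the provider has finished initializing.
  abstract waitReady(): Promise<void>;

  // Track the file the agent is currently working on.
  setCurrentFile(filePath: string): void {
    this.currentFile = filePath;
  }

  clearCurrentFile(): void {
    this.currentFile = null;
  }

  // Hypothetical accessor, added here only so the tracked state can be observed.
  getCurrentFile(): string | null {
    return this.currentFile;
  }
}
```

A concrete provider would extend this class and supply the five abstract methods; the current-file tracking is shared by all providers in this sketch.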

EphemeralFileIndexProvider

In-memory provider for quick, non-persistent indexing.

Key Features:

Watches files via filesystem events
Chunks content into ~1000-char blocks
Performs case-insensitive full-text search
Uses Map for file contents with mtime tracking
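The behaviour described above (a Map of file contents with mtime tracking, fixed-size chunks, case-insensitive matching) might look roughly like this. It is a minimal sketch, not the provider's real code: the names InMemoryIndexSketch, chunkByChars, upsert, and the occurrence-count score are all assumptions, and file watching is left out so the example stays self-contained.

```typescript
interface IndexedFile {
  content: string;
  mtimeMs: number;
  chunks: string[];
}

interface TextMatch {
  path: string;
  chunkIndex: number;
  content: string;
  score: number; // assumed scoring: number of occurrences of the query
}

const CHUNK_SIZE = 1000; // roughly the ~1000-char block size mentioned above

// Split text into consecutive blocks of at most `size` characters.
function chunkByChars(text: string, size: number = CHUNK_SIZE): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

class InMemoryIndexSketch {
  private files = new Map<string, IndexedFile>();

  // Store or refresh a file; skip the work when the stored mtime is not older.
  upsert(path: string, content: string, mtimeMs: number): boolean {
    const existing = this.files.get(path);
    if (existing !== undefined && existing.mtimeMs >= mtimeMs) return false;
    this.files.set(path, { content, mtimeMs, chunks: chunkByChars(content) });
    return true;
  }

  remove(path: string): void {
    this.files.delete(path);
  }

  // Case-insensitive substring search over every chunk of every file.
  fullTextSearch(query: string, limit: number = 10): TextMatch[] {
    const needle = query.toLowerCase();
    if (needle.length === 0) return [];
    const matches: TextMatch[] = [];
    this.files.forEach((file, path) => {
      file.chunks.forEach((content, chunkIndex) => {
        const haystack = content.toLowerCase();
        let score = 0;
        let pos = haystack.indexOf(needle);
        while (pos !== -1) {
          score++;
          pos = haystack.indexOf(needle, pos + needle.length);
        }
        if (score > 0) matches.push({ path, chunkIndex, content, score });
      });
    });
    return matches.sort((a, b) => b.score - a.score).slice(0, limit);
  }
}
```

In a real provider, a watcher callback would call something like upsert on add/change events and remove on delete events; nothing here persists across restarts, which is what makes the index ephemeral.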

FileIndexService

Registry for multiple providers, allowing dynamic switching.

Key Methods:

registerFileIndexProvider(name, provider): Add a provider
setActiveFileIndexProviderName(name): Switch active provider
fullTextSearch(query, limit?, agent): Delegates to active provider
search, waitReady, and setCurrentFile are delegated to the active provider in the same way
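The registry-and-delegate pattern described above can be sketched as follows. The class name FileIndexRegistrySketch and the ProviderLike interface are hypothetical, the methods are synchronous and omit the agent argument for brevity, and making the first registered provider active by default is an assumption rather than documented behaviour.

```typescript
// Minimal provider surface for this sketch; real providers expose more methods.
interface ProviderLike {
  fullTextSearch(query: string, limit?: number): string[];
}

class FileIndexRegistrySketch {
  private providers = new Map<string, ProviderLike>();
  private activeName: string | null = null;

  registerFileIndexProvider(name: string, provider: ProviderLike): void {
    this.providers.set(name, provider);
    // Assumption: the first provider registered becomes the active one.
    if (this.activeName === null) this.activeName = name;
  }

  setActiveFileIndexProviderName(name: string): void {
    if (!this.providers.has(name)) {
      throw new Error(`Unknown file index provider: ${name}`);
    }
    this.activeName = name;
  }

  // Resolve the active provider or fail loudly when none is registered.
  private active(): ProviderLike {
    const provider =
      this.activeName === null ? undefined : this.providers.get(this.activeName);
    if (provider === undefined) throw new Error("No active file index provider");
    return provider;
  }

  // Every search-style method forwards to whichever provider is active.
  fullTextSearch(query: string, limit: number = 10): string[] {
    return this.active().fullTextSearch(query, limit);
  }
}
```

Because callers only talk to the registry, switching the active provider changes where searches go without touching the calling code.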

Utilities

chunker.ts: chunkText(text, options)

Semantically chunks text by sentences
Respects token limits (~256 default) with overlap (~32 tokens)
Uses sentencex for segmentation and gpt-tokenizer for counting
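To make the sentence-plus-overlap idea concrete, here is a small sketch of how such a chunker could work. It deliberately does not call sentencex or gpt-tokenizer: splitSentences is a naive regex stand-in and countTokens counts whitespace-separated words, so the numbers will differ from the real tokenizer. The function name chunkTextSketch and its exact overlap policy are assumptions.

```typescript
interface ChunkOptions {
  maxTokens?: number;
  overlapTokens?: number;
}

// Naive stand-in for sentencex: splits after ., ! or ? (will mis-split abbreviations).
function splitSentences(text: string): string[] {
  const matches = text.match(/[^.!?]+[.!?]*\s*/g);
  return (matches ?? []).map((s) => s.trim()).filter((s) => s.length > 0);
}

// Stand-in for gpt-tokenizer: counts whitespace-separated words, not real tokens.
function countTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function chunkTextSketch(text: string, options: ChunkOptions = {}): string[] {
  const maxTokens = options.maxTokens ?? 256;
  const overlapTokens = options.overlapTokens ?? 32;
  const chunks: string[] = [];
  let current: string[] = [];
  let currentTokens = 0;

  for (const sentence of splitSentences(text)) {
    const tokens = countTokens(sentence);
    if (current.length > 0 && currentTokens + tokens > maxTokens) {
      chunks.push(current.join(" "));
      // Carry trailing sentences into the next chunk as overlap, as long as
      // they fit the overlap budget and leave room for the incoming sentence.
      const carried: string[] = [];
      let carriedTokens = 0;
      for (const previous of current.slice().reverse()) {
        const t = countTokens(previous);
        if (carriedTokens + t > overlapTokens) break;
        if (carriedTokens + t + tokens > maxTokens) break;
        carried.unshift(previous);
        carriedTokens += t;
      }
      current = carried;
      currentTokens = carriedTokens;
    }
    // A single sentence longer than maxTokens is kept whole rather than split.
    current.push(sentence);
    currentTokens += tokens;
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}
```

Overlapping whole sentences rather than raw tokens keeps each chunk readable on its own, at the cost of the overlap being only approximately overlapTokens long.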

symbols/symbolExtractor.ts: extractSymbolsFromFile(filePath)

Parses JS/TS files with Tree-sitter
Extracts functions and classes
Returns: [{ name, kind, startLine, endLine }]
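The extraction step can be thought of as a walk over the syntax tree that records every function and class declaration. The sketch below runs against a minimal NodeLike interface modeled on the node shape Tree-sitter exposes (type, text, startPosition, endPosition, namedChildren, childForFieldName) instead of importing the parser, so it stays self-contained. The node type names function_declaration and class_declaration match the Tree-sitter JavaScript grammar; the 1-based line numbering and the name collectSymbols are assumptions.

```typescript
interface Point {
  row: number; // 0-based, as in Tree-sitter
  column: number;
}

// Minimal structural view of a syntax node, modeled on Tree-sitter's node shape.
interface NodeLike {
  type: string;
  text: string;
  startPosition: Point;
  endPosition: Point;
  namedChildren: NodeLike[];
  childForFieldName(name: string): NodeLike | null;
}

interface SymbolInfo {
  name: string;
  kind: "function" | "class";
  startLine: number;
  endLine: number;
}

// Which grammar node types count as symbols in this sketch.
const KIND_BY_NODE_TYPE: Record<string, SymbolInfo["kind"]> = {
  function_declaration: "function",
  class_declaration: "class",
};

// Depth-first walk that collects named function and class declarations.
function collectSymbols(root: NodeLike): SymbolInfo[] {
  const symbols: SymbolInfo[] = [];
  const visit = (node: NodeLike): void => {
    const kind = KIND_BY_NODE_TYPE[node.type];
    const nameNode = node.childForFieldName("name");
    if (kind !== undefined && nameNode !== null) {
      symbols.push({
        name: nameNode.text,
        kind,
        // Assumption: report 1-based line numbers (rows are 0-based).
        startLine: node.startPosition.row + 1,
        endLine: node.endPosition.row + 1,
      });
    }
    node.namedChildren.forEach((child) => visit(child));
  };
  visit(root);
  return symbols;
}
```

With the real parser, the root node of a parsed file would be passed to a function like this; arrow functions and methods are skipped here because only the two declaration types are mapped.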

Tools

hybridSearchFileIndex: Advanced search combining multiple methods

Input: { query, topK=10, textWeight=0.3, fullTextWeight=0.3, mergeRadius=1 }
Returns: Merged HybridSearchResult[] with scores
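One plausible way to read the inputs above is: score each chunk as a weighted sum of its per-method scores, merge neighbouring chunks from the same file when they sit within mergeRadius of each other, then keep the topK best. The sketch below follows that reading. It is an assumption-heavy illustration: the ChunkScores and MergedResult shapes, the rule that the embedding weight is whatever remains after textWeight and fullTextWeight, and the merge-before-topK order are not confirmed by the package.

```typescript
interface ChunkScores {
  path: string;
  chunkIndex: number;
  embedding: number; // per-method scores, assumed normalized to 0..1
  text: number;
  fullText: number;
}

interface HybridOptions {
  topK?: number;
  textWeight?: number;
  fullTextWeight?: number;
  mergeRadius?: number;
}

interface MergedResult {
  path: string;
  startChunk: number;
  endChunk: number;
  score: number;
}

function hybridMergeSketch(
  candidates: ChunkScores[],
  options: HybridOptions = {},
): MergedResult[] {
  const topK = options.topK ?? 10;
  const textWeight = options.textWeight ?? 0.3;
  const fullTextWeight = options.fullTextWeight ?? 0.3;
  const mergeRadius = options.mergeRadius ?? 1;
  // Assumption: embeddings receive the weight left over by the other two.
  const embeddingWeight = Math.max(0, 1 - textWeight - fullTextWeight);

  // Combine the three scores, then order by file and position for merging.
  const scored = candidates
    .map((c) => ({
      path: c.path,
      chunkIndex: c.chunkIndex,
      score:
        embeddingWeight * c.embedding +
        textWeight * c.text +
        fullTextWeight * c.fullText,
    }))
    .sort((a, b) => a.path.localeCompare(b.path) || a.chunkIndex - b.chunkIndex);

  // Merge chunks of the same file that lie within mergeRadius of the previous hit.
  const merged: MergedResult[] = [];
  for (const s of scored) {
    const last = merged[merged.length - 1];
    if (
      last !== undefined &&
      last.path === s.path &&
      s.chunkIndex - last.endChunk <= mergeRadius
    ) {
      last.endChunk = s.chunkIndex;
      last.score = Math.max(last.score, s.score); // keep the best score in the span
    } else {
      merged.push({
        path: s.path,
        startChunk: s.chunkIndex,
        endChunk: s.chunkIndex,
        score: s.score,
      });
    }
  }

  return merged.sort((a, b) => b.score - a.score).slice(0, topK);
}
```

Merging adjacent hits means a match that straddles a chunk boundary comes back as one contiguous snippet instead of two partial ones.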

Chat Commands

/search [query]: Performs full-text search and displays results

Usage Example

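import AgentTeam from '@tokenring-ai/agent/AgentTeam';
import StringSearchFileIndexService from '@tokenring-ai/file-index/StringSearchFileIndexService';

// Create the team and register a file index service rooted at the project directory.
const agentTeam = new AgentTeam();
const fileIndexService = new StringSearchFileIndexService('/path/to/project');
agentTeam.registerService(fileIndexService);

// Start the team, then wait for the index to finish initializing.
// `agent` is assumed to be an existing agent instance; creating it is not shown here.
await agentTeam.start();
await fileIndexService.waitReady(agent);

// Retrieve up to 5 chunks relevant to the query.
const results = await fileIndexService.search('function example', 5, agent);
console.log(results);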

Configuration Options

Base Directory: Passed to the provider constructor; sets the root directory to index
Chunking: Customize via chunkText options (maxTokens, overlapTokens)
Search Limits: The limit parameter of the search methods (default 10)
Weights in Hybrid Search: textWeight and fullTextWeight (each defaults to 0.3)

Dependencies

@tokenring-ai/agent (^0.1.0): Agent integration
@tokenring-ai/filesystem (^0.1.0): File watching/paths
chokidar (^4.0.3): File system watcher
gpt-tokenizer (^3.0.1): Token counting
sentencex (^0.4.2): Sentence segmentation
tree-sitter (^0.22.4): Symbol parsing
