Indexing and recall

"Recall" is how TekMemo converts static memory into useful context. Instead of passing an entire massive notes.md file to an LLM on every turn, recall searches for the most relevant fragments (chunks) and injects only those into the agent's context window.

Local recall

Local recall runs entirely on your machine. It breaks .tekmemo/ files into chunks and scores them against the query using fast text matching algorithms (like BM25).

Best for:

Short, simple projects
Finding exact terms, file names, or package names
When you are offline or cannot send data to an external provider

Provider-backed recall

Provider-backed recall uses semantic embeddings (vectors) to find context based on meaning rather than exact keywords.

How it works:

Memory is chunked.
An embedding provider (like OpenAI or VoyageAI) converts the chunks into vectors.
The vectors are stored in a vector database (like Upstash).
When a query is made, it is embedded and compared against the database.
A reranker (like Voyage Rerank) can optionally re-order the top results for maximum relevance.

Best for:

Large codebases with extensive documentation
Answering conceptual questions (e.g., "How does authentication work here?")

Cloud recall

Cloud recall happens when you use TekMemo Cloud as your central memory repository. TekMemo Cloud handles the chunking, embedding, vector storage, and reranking automatically behind the scenes.

Best for:

Teams sharing memory across different machines
CI/CD pipelines that need access to memory
Applications built on the TekMemo API

When you run npx tekmemo cloud index, the Cloud API detects any changes pushed from your machine and updates the vector indices in the background.

Indexing and recall ​

Local recall ​

Provider-backed recall ​

Cloud recall ​

Indexing and recall

Local recall

Provider-backed recall

Cloud recall