# Memory & Embeddings
Cortask maintains long-term memory for each workspace. When the agent uses `memory_append`, notes are written to `memory.md` and also indexed in a per-workspace SQLite database with vector embeddings. This enables `memory_search` to find relevant memories by meaning, not just keyword matching.
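Conceptually, each append does two writes: one to the markdown file and one to the index. A minimal sketch of that flow, assuming a hypothetical `embed()` helper and the `better-sqlite3` package (Cortask's actual storage layer and schema may differ):

```ts
import { appendFileSync } from "node:fs";
import { createHash } from "node:crypto";
import Database from "better-sqlite3";

// Hypothetical embedding helper; stands in for whichever provider is configured.
async function embed(text: string): Promise<Float32Array> {
  return new Float32Array(768); // placeholder vector
}

async function memoryAppend(workspaceDir: string, note: string): Promise<void> {
  // 1. Append the note to the human-readable memory file.
  appendFileSync(`${workspaceDir}/memory.md`, note + "\n");

  // 2. Index the note in the per-workspace SQLite database with its embedding.
  const db = new Database(`${workspaceDir}/.cortask/memory.db`);
  db.exec(
    "CREATE TABLE IF NOT EXISTS memories (hash TEXT PRIMARY KEY, text TEXT, embedding BLOB)"
  );
  const hash = createHash("sha256").update(note).digest("hex");
  const vector = await embed(note);
  db.prepare(
    "INSERT OR IGNORE INTO memories (hash, text, embedding) VALUES (?, ?, ?)"
  ).run(hash, note, Buffer.from(vector.buffer));
  db.close();
}
```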
## How Semantic Search Works
- When a memory entry is indexed, its text is converted to a numerical vector (embedding) that captures its semantic meaning.
- Embeddings are cached by content hash to avoid redundant computation.
- When you search with `memory_search`, the query is also embedded, and the system finds stored entries whose vectors are closest to the query vector.
- Results are ranked by similarity score and returned with a match type (semantic or keyword).
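The retrieval step boils down to a nearest-neighbor comparison between vectors. A minimal sketch, assuming embeddings are stored as plain number arrays (the real index format is an internal detail):

```ts
// Cosine similarity: 1.0 means identical direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface StoredMemory { text: string; embedding: number[]; }

// Rank stored memories against an already-embedded query vector.
function rankBySimilarity(query: number[], memories: StoredMemory[], topK = 5) {
  return memories
    .map((m) => ({ text: m.text, score: cosineSimilarity(query, m.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```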
## Embedding Providers
Configure the embedding provider in `~/.cortask/config.yaml`:

```yaml
memory:
  embeddingProvider: local # local, openai, google, or ollama
  embeddingModel: ""       # optional: override the default model
```
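For example, to switch to OpenAI and pin a specific model (the model name below is OpenAI's `text-embedding-3-large`, shown as an illustration; leave `embeddingModel` empty to use the provider's default):

```yaml
memory:
  embeddingProvider: openai
  embeddingModel: "text-embedding-3-large"
```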
### Local (default)
Uses a small GGUF model (`embeddinggemma-300m-qat-Q8_0`) via `node-llama-cpp`. Runs entirely on your machine.
- No API key required
- No data leaves your machine
- Free
- Lower accuracy compared to cloud models
- Requires `node-llama-cpp` to be installed (optional dependency)
- Model is downloaded automatically on first use
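A rough sketch of producing an embedding with `node-llama-cpp`'s v3 API (the model path is illustrative; Cortask downloads and wires this up for you):

```ts
import { getLlama } from "node-llama-cpp";

const llama = await getLlama();
// Path to the downloaded GGUF model; Cortask fetches this automatically on first use.
const model = await llama.loadModel({
  modelPath: "/path/to/embeddinggemma-300m-qat-Q8_0.gguf",
});
const context = await model.createEmbeddingContext();

// Embed a piece of text; .vector is a readonly array of numbers.
const embedding = await context.getEmbeddingFor("note about project setup");
console.log(embedding.vector.length);
```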
### OpenAI
Uses OpenAI's embedding API. Default model: `text-embedding-3-small`.
- Requires an OpenAI API key stored as credential `provider.openai.apiKey`
- High-quality embeddings
- Costs per token apply
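For reference, this is roughly what the underlying call looks like with the official `openai` Node SDK (a sketch; Cortask supplies the key from its credential store rather than the environment):

```ts
import OpenAI from "openai";

// The SDK reads OPENAI_API_KEY from the environment by default;
// Cortask instead passes in the provider.openai.apiKey credential.
const client = new OpenAI();

const response = await client.embeddings.create({
  model: "text-embedding-3-small",
  input: "note about project setup",
});

// A 1536-dimensional vector for text-embedding-3-small.
console.log(response.data[0].embedding.length);
```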
### Google

Uses Google's embedding API via the Gemini provider.
- Requires a Google API key stored as credential `provider.google.apiKey`
- Good quality embeddings
- Costs per token apply
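A sketch using the `@google/generative-ai` SDK with the `text-embedding-004` model (the model name is illustrative; Cortask's actual default is configured internally):

```ts
import { GoogleGenerativeAI } from "@google/generative-ai";

// Cortask supplies the provider.google.apiKey credential here.
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);
const model = genAI.getGenerativeModel({ model: "text-embedding-004" });

const result = await model.embedContent("note about project setup");
console.log(result.embedding.values.length); // 768 dimensions for this model
```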
### Ollama
Uses a locally running Ollama instance.
- Requires the Ollama host stored as credential `provider.ollama.host`
- Runs locally, no data leaves your machine
- Quality depends on the model you run
- Free (self-hosted)
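Under the hood this amounts to a call to Ollama's embeddings endpoint. A sketch with plain `fetch` (the model name is illustrative; use whichever embedding model you have pulled):

```ts
// Assumes Ollama is running locally and `nomic-embed-text` has been pulled
// (ollama pull nomic-embed-text). The host comes from provider.ollama.host.
const response = await fetch("http://localhost:11434/api/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "nomic-embed-text",
    prompt: "note about project setup",
  }),
});

const { embedding } = (await response.json()) as { embedding: number[] };
console.log(embedding.length); // 768 dimensions for nomic-embed-text
```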
## Trade-offs
| | Local | OpenAI / Google | Ollama |
|---|---|---|---|
| Cost | Free | Pay per token | Free (self-hosted) |
| Privacy | Full -- nothing leaves your machine | Data sent to cloud API | Full -- runs locally |
| Quality | Good for basic recall | Best accuracy | Depends on model |
| Setup | Automatic | API key required | Ollama server required |
| Speed | Depends on hardware | Fast (API) | Depends on hardware |
For most users, the local provider is a good starting point. Switch to OpenAI or Google if you need higher retrieval accuracy for complex memory searches.
## Changing Providers
When you switch embedding providers, previously computed embeddings are not automatically re-indexed. The embedding cache is keyed by content hash plus provider and model, so the system will compute new embeddings as needed. To force re-indexing of all memory, delete the `memory.db` file in the workspace's `.cortask/` directory.
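A sketch of how such a cache key might be derived (the exact key format is an internal detail; this just illustrates why vectors from one provider/model are never reused for another):

```ts
import { createHash } from "node:crypto";

// Hypothetical cache key: the same text embedded under a different
// provider or model yields a different key, so stale vectors are
// never served across a provider switch.
function embeddingCacheKey(text: string, provider: string, model: string): string {
  const contentHash = createHash("sha256").update(text).digest("hex");
  return `${provider}:${model}:${contentHash}`;
}

embeddingCacheKey("note about project setup", "openai", "text-embedding-3-small");
// => "openai:text-embedding-3-small:5f9c..."
```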
You can also trigger a full re-index by updating the config and restarting the gateway. Memory managers are re-created on startup with the new provider configuration.