# Memory & Embeddings
Cortask maintains long-term memory for each workspace. When the agent uses `memory_append`, notes are written to `memory.md` and also indexed in a per-workspace SQLite database with vector embeddings. This enables `memory_search` to find relevant memories by meaning, not just keyword matching.
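Conceptually, each append does two writes: one to the markdown file and one to the index. A minimal sketch of that flow, assuming a hypothetical `embed()` helper and the `better-sqlite3` package (Cortask's actual storage layer and schema may differ):

```ts
import { appendFileSync } from "node:fs";
import { createHash } from "node:crypto";
import Database from "better-sqlite3";

// Hypothetical embedding helper; stands in for whichever provider is configured.
async function embed(text: string): Promise<Float32Array> {
  return new Float32Array(768); // placeholder vector
}

async function memoryAppend(workspaceDir: string, note: string): Promise<void> {
  // 1. Append the note to the human-readable memory file.
  appendFileSync(`${workspaceDir}/memory.md`, note + "\n");

  // 2. Index the note in the per-workspace SQLite database with its embedding.
  const db = new Database(`${workspaceDir}/.cortask/memory.db`);
  db.exec(
    "CREATE TABLE IF NOT EXISTS memories (hash TEXT PRIMARY KEY, text TEXT, embedding BLOB)"
  );
  const hash = createHash("sha256").update(note).digest("hex");
  const vector = await embed(note);
  db.prepare(
    "INSERT OR IGNORE INTO memories (hash, text, embedding) VALUES (?, ?, ?)"
  ).run(hash, note, Buffer.from(vector.buffer));
  db.close();
}
```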
## How Semantic Search Works
- When a memory entry is indexed, its text is converted to a numerical vector (embedding) that captures its semantic meaning.
- Embeddings are cached by content hash to avoid redundant computation.
- When you search with `memory_search`, the query is also embedded, and the system finds stored entries whose vectors are closest to the query vector.
- Results are ranked by similarity score and returned with a match type (semantic or keyword).
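The retrieval step boils down to a nearest-neighbor comparison between vectors. A minimal sketch, assuming embeddings are stored as plain number arrays (the real index format is an internal detail):

```ts
// Cosine similarity: 1.0 means identical direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface StoredMemory { text: string; embedding: number[]; }

// Rank stored memories against an already-embedded query vector.
function rankBySimilarity(query: number[], memories: StoredMemory[], topK = 5) {
  return memories
    .map((m) => ({ text: m.text, score: cosineSimilarity(query, m.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```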
## Embedding Providers
Configure the embedding provider in `~/.cortask/config.yaml`:

```yaml
memory:
  embeddingProvider: local # local, openai, google, or ollama
  embeddingModel: ""       # optional: override the default model
```
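For example, to switch to OpenAI and pin a specific model (the model name below is OpenAI's `text-embedding-3-large`, shown as an illustration; leave `embeddingModel` empty to use the provider's default):

```yaml
memory:
  embeddingProvider: openai
  embeddingModel: "text-embedding-3-large"
```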
### Local (default)
Uses a small GGUF model (`embeddinggemma-300m-qat-Q8_0`) via `node-llama-cpp`. Runs entirely on your machine.
- No API key required
- No data leaves your machine
- Free
- Lower accuracy compared to cloud models
- Requires `node-llama-cpp` to be installed (optional dependency)
- Model is downloaded automatically on first use
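A rough sketch of producing an embedding with `node-llama-cpp`'s v3 API (the model path is illustrative; Cortask downloads and wires this up for you):

```ts
import { getLlama } from "node-llama-cpp";

const llama = await getLlama();
// Path to the downloaded GGUF model; Cortask fetches this automatically on first use.
const model = await llama.loadModel({
  modelPath: "/path/to/embeddinggemma-300m-qat-Q8_0.gguf",
});
const context = await model.createEmbeddingContext();

// Embed a piece of text; .vector is a readonly array of numbers.
const embedding = await context.getEmbeddingFor("note about project setup");
console.log(embedding.vector.length);
```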
### OpenAI
Uses OpenAI's embedding API. Default model: `text-embedding-3-small`.
- Requires an OpenAI API key stored as credential `provider.openai.apiKey`
- High-quality embeddings
- Costs per token apply
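For reference, this is roughly what the underlying call looks like with the official `openai` Node SDK (a sketch; Cortask supplies the key from its credential store rather than the environment):

```ts
import OpenAI from "openai";

// The SDK reads OPENAI_API_KEY from the environment by default;
// Cortask instead passes in the provider.openai.apiKey credential.
const client = new OpenAI();

const response = await client.embeddings.create({
  model: "text-embedding-3-small",
  input: "note about project setup",
});

// A 1536-dimensional vector for text-embedding-3-small.
console.log(response.data[0].embedding.length);
```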
### Google

Uses Google's embedding API via the Gemini provider.
- Requires a Google API key stored as credential `provider.google.apiKey`
- Good quality embeddings
- Costs per token apply
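A sketch using the `@google/generative-ai` SDK with the `text-embedding-004` model (the model name is illustrative; Cortask's actual default is configured internally):

```ts
import { GoogleGenerativeAI } from "@google/generative-ai";

// Cortask supplies the provider.google.apiKey credential here.
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);
const model = genAI.getGenerativeModel({ model: "text-embedding-004" });

const result = await model.embedContent("note about project setup");
console.log(result.embedding.values.length); // 768 dimensions for this model
```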
### Ollama
Uses a locally running Ollama instance.
- Requires the Ollama host stored as credential `provider.ollama.host`
- Runs locally, no data leaves your machine
- Quality depends on the model you run
- Free (self-hosted)
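Under the hood this amounts to a call to Ollama's embeddings endpoint. A sketch with plain `fetch` (the model name is illustrative; use whichever embedding model you have pulled):

```ts
// Assumes Ollama is running locally and `nomic-embed-text` has been pulled
// (ollama pull nomic-embed-text). The host comes from provider.ollama.host.
const response = await fetch("http://localhost:11434/api/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "nomic-embed-text",
    prompt: "note about project setup",
  }),
});

const { embedding } = (await response.json()) as { embedding: number[] };
console.log(embedding.length); // 768 dimensions for nomic-embed-text
```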
## Trade-offs
| | Local | OpenAI / Google | Ollama |
|---|---|---|---|
| Cost | Free | Pay per token | Free (self-hosted) |
| Privacy | Full -- nothing leaves your machine | Data sent to cloud API | Full -- runs locally |
| Quality | Good for basic recall | Best accuracy | Depends on model |
| Setup | Automatic | API key required | Ollama server required |
| Speed | Depends on hardware | Fast (API) | Depends on hardware |
For most users, the local provider is a good starting point. Switch to OpenAI or Google if you need higher retrieval accuracy for complex memory searches.
## Changing Providers
When you switch embedding providers, previously computed embeddings are not automatically re-indexed. The embedding cache is keyed by content hash plus provider and model, so the system will compute new embeddings as needed. To force re-indexing of all memory, delete the `memory.db` file in the workspace's `.cortask/` directory.
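A sketch of how such a cache key might be derived (the exact key format is an internal detail; this just illustrates why vectors from one provider/model are never reused for another):

```ts
import { createHash } from "node:crypto";

// Hypothetical cache key: the same text embedded under a different
// provider or model yields a different key, so stale vectors are
// never served across a provider switch.
function embeddingCacheKey(text: string, provider: string, model: string): string {
  const contentHash = createHash("sha256").update(text).digest("hex");
  return `${provider}:${model}:${contentHash}`;
}

embeddingCacheKey("note about project setup", "openai", "text-embedding-3-small");
// => "openai:text-embedding-3-small:5f9c..."
```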
You can also trigger a full re-index by updating the config and restarting the gateway. Memory managers are re-created on startup with the new provider configuration.