Semantic Search
Most knowledge tools make you organize everything up front so you can find it later. Basic Memory takes the opposite approach — write naturally, and let search figure out the connections.
When you or your AI save a note, Basic Memory indexes it two ways: by the exact words it contains, and by what it means. Search for "login security improvements" and find notes about "authentication hardening" even though the exact words don't match. This happens automatically — no tagging system to maintain, no folder hierarchy to get right.
What makes this different from a typical search tool is that your AI is part of the loop. It writes notes, searches them, and builds on what it finds — all through the same tools. The search index updates incrementally as notes change, so the AI is always working with current knowledge, not a stale snapshot.
Search modes
Basic Memory supports three search modes. Your AI already knows when to use each one — just describe what you're looking for in plain language. You can also use them directly from the command line.
Text
Keyword search. You type words, it finds notes containing those words. Supports boolean operators like AND and OR.
You: "Search my notes for project AND planning"
AI: [Finds notes containing both words]
bm tool search-notes --query "project AND planning" --search-type text
Best when you know the exact terms — names, identifiers, tags.
Vector
Meaning-based search. Your query is compared against all indexed content by meaning, not exact words. Searching for "ways to make the frontend faster" finds notes about "React performance optimization" and "bundle size reduction."
You: "Find notes about ways to make the frontend faster"
AI: [Finds notes about React performance, bundle size, lazy loading]
bm tool search-notes --query "ways to make the frontend faster" --search-type vector
Best for conceptual queries when you aren't sure what words were used.
Hybrid (default)
Runs both text and vector search, then combines results. Notes found by both methods rank highest. Notes found by only one method still appear but rank lower.
You: "Search for authentication security"
AI: [Finds notes matching the words AND notes about related concepts]
bm tool search-notes --query "authentication security"
This is the default. You get keyword precision plus meaning-based recall.
You don't need to pick
Basic Memory describes each search mode to your AI, so it already knows when to use text, vector, or hybrid search. Just ask for what you want in plain language:
- "Find my notes about authentication security" — the AI runs a hybrid search
- "Search for the exact phrase project AND planning" — the AI switches to text search
- "What have I written about handling failures gracefully?" — the AI uses vector search
If you're using the CLI directly, hybrid is the default and covers most cases. Use --search-type text when you need exact keyword matches, or --search-type vector when you're exploring by concept.
Filtering
Tag search
Search by tag using a simple prefix:
You: "Search for notes tagged security"
You: "Search for tag:coffee AND tag:brewing"
bm tool search-notes --query "tag:security"
Metadata filters
Search by note properties — like status, tags, or any custom fields — without needing a text query:
You: "Show me all my in-progress notes"
You: "Find notes tagged security and oauth"
You: "What are my high-priority items?"
bm tool search-notes --metadata '{"status": "in-progress"}'
bm tool search-notes --tags security
Combining text and filters
Combine a text or meaning-based search with property filters:
You: "Search for authentication notes that are specs"
You: "Find active error handling notes"
bm tool search-notes --query "authentication" --note-types spec
bm tool search-notes --query "error handling" --metadata '{"status": "active"}'
Enabling semantic search
Semantic search works automatically — there's nothing to set up. It's included and enabled by default in all standard Basic Memory installs (Homebrew and uv). Embeddings are generated automatically on first startup for existing notes.
For a few hundred notes, expect 1–3 minutes for the initial index build. After that, new and edited notes are indexed incrementally during normal sync.
To manually rebuild the search index (e.g., after switching providers):
bm reindex --embeddings
Platform compatibility
| Platform | FastEmbed (local) | OpenAI (API) |
|---|---|---|
| macOS ARM64 (Apple Silicon) | Yes | Yes |
| macOS x86_64 (Intel Mac) | No (see workaround) | Yes |
| Linux x86_64 | Yes | Yes |
| Linux ARM64 | Yes | Yes |
| Windows x86_64 | Yes | Yes |
Intel Mac workaround
The default local embedding model doesn't run on Intel Macs. You have two options:
Option 1: Use OpenAI embeddings (recommended)
Requires an OpenAI account — create an API key and set it as an environment variable.
export BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
bm reindex --embeddings
Option 2: Pin ONNX Runtime
uv pip install 'onnxruntime<1.24'
bm reindex --embeddings
Under the Hood
You don't need to read this section to use search — it's here for the curious.
Chunking
Notes are not searched as whole documents. Basic Memory breaks each note into smaller pieces before indexing, so search can surface the specific part of a note that's relevant.
The chunking follows the note's structure:
- Headers create chunk boundaries — each section becomes its own chunk
- Observations (categorized facts like `- [technique] Water temperature...`) are indexed individually
- Relations (links to other notes like `- works_at [[Company]]`) are indexed individually
- Prose paragraphs are merged into chunks of ~900 characters with ~120 characters of overlap at boundaries
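The chunking strategy above can be sketched in a few lines of Python. This is an illustrative simplification, not Basic Memory's actual chunker; the `chunk_note` function and its defaults are hypothetical and simply mirror the header boundaries, ~900-character chunks, and ~120-character overlap described here:

```python
def chunk_note(markdown: str, max_chars: int = 900, overlap: int = 120) -> list[str]:
    """Split a note into chunks: headers start new chunks, and oversized
    chunks are split with a character overlap carried across boundaries."""
    sections: list[str] = []
    current = ""
    for line in markdown.splitlines():
        if line.startswith("#") and current.strip():
            sections.append(current.strip())  # a header closes the previous chunk
            current = ""
        current += line + "\n"
    if current.strip():
        sections.append(current.strip())

    # Split any oversized section, keeping `overlap` characters of context
    chunks: list[str] = []
    for section in sections:
        start = 0
        while start < len(section):
            chunks.append(section[start:start + max_chars])
            if start + max_chars >= len(section):
                break
            start += max_chars - overlap
    return chunks
```

A real implementation would also treat observations and relations as their own chunks; this sketch only shows the header-and-overlap mechanics.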
This means a search for "water temperature for brewing" can surface the specific fact "Water temperature at 205°F extracts optimal compounds" rather than returning the entire "Coffee Brewing Methods" note.
How results are ranked
Search results return whole notes ranked by relevance, with the matched chunk text showing which part of the note was most relevant.
Hybrid fusion
Hybrid mode runs text and vector search independently, then merges results using score-based fusion:
Results found by both keyword and meaning match rank highest. The dominant signal (whichever source scored higher) is preserved, while the weaker signal adds a 30% bonus. Items found by only one source keep their original score.
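That fusion rule can be sketched as follows. This is an illustration of the scoring logic described above, not Basic Memory's internal API; the `fuse` function and its dictionary inputs are hypothetical:

```python
def fuse(text_scores: dict[str, float], vector_scores: dict[str, float]) -> dict[str, float]:
    """Score-based fusion: keep the stronger signal and add a 30% bonus
    from the weaker one. Items found by only one source keep their score."""
    fused: dict[str, float] = {}
    for item in text_scores.keys() | vector_scores.keys():
        t = text_scores.get(item)
        v = vector_scores.get(item)
        if t is not None and v is not None:
            fused[item] = max(t, v) + 0.3 * min(t, v)  # found by both sources
        else:
            fused[item] = t if t is not None else v    # single-source result
    return fused
```

With these mechanics, a note that scores 0.8 on keywords and 0.6 on meaning fuses to 0.98 and outranks a note that scores 0.9 on meaning alone.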
Deduplication
Each chunk has a content hash. When notes are re-synced or reindexed, unchanged chunks skip re-indexing. Only modified content triggers new embeddings. Editing one note in a thousand-note knowledge base only re-indexes the chunks that changed.
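The hash-based skip can be illustrated with a short sketch (not Basic Memory's actual code; `reindex_chunks` and the `embed` callback are hypothetical names):

```python
import hashlib

def reindex_chunks(chunks: list[str], index: dict[str, list[float]], embed) -> int:
    """Embed only chunks whose content hash is not already in the index.
    Returns the number of chunks that were (re)embedded."""
    embedded = 0
    for chunk in chunks:
        key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if key not in index:          # unchanged chunks are skipped entirely
            index[key] = embed(chunk)
            embedded += 1
    return embedded
```

Because the hash is derived from the chunk's content, editing one paragraph changes only that chunk's key, and everything else is skipped on the next sync.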
Configuration
| Config field | Env var | Default | Description |
|---|---|---|---|
| semantic_search_enabled | BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED | true | Enable semantic search |
| semantic_embedding_provider | BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER | "fastembed" | "fastembed" (local) or "openai" (API) |
| semantic_embedding_model | BASIC_MEMORY_SEMANTIC_EMBEDDING_MODEL | "bge-small-en-v1.5" | Embedding model identifier |
| semantic_embedding_dimensions | BASIC_MEMORY_SEMANTIC_EMBEDDING_DIMENSIONS | auto-detected | 384 (FastEmbed), 1536 (OpenAI) |
| semantic_embedding_batch_size | BASIC_MEMORY_SEMANTIC_EMBEDDING_BATCH_SIZE | 64 | Texts per embedding batch |
| semantic_vector_k | BASIC_MEMORY_SEMANTIC_VECTOR_K | 100 | Vector candidate count |
| semantic_min_similarity | BASIC_MEMORY_SEMANTIC_MIN_SIMILARITY | 0.55 | Minimum similarity threshold |
Similarity threshold
The semantic_min_similarity setting controls which results are "similar enough" to return. The value ranges from 0.0 to 1.0:
- Higher (e.g., `0.7`) — Fewer results, stronger relevance. Good for focused queries.
- Lower (e.g., `0.3`) — More results, looser associations. Good for exploration.
- `0.0` — Disables filtering. All vector results returned regardless of score.
The default 0.55 is a reasonable middle ground.
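To make the threshold concrete, here is an illustrative sketch of similarity filtering (not Basic Memory's implementation; `cosine` and `filter_hits` are hypothetical names, and real embeddings have hundreds of dimensions rather than two):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def filter_hits(query_vec, docs, min_similarity: float = 0.55):
    """Score each document vector against the query and keep hits at or
    above the threshold, best first. A threshold of 0.0 keeps everything."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in docs]
    kept = [hit for hit in scored if hit[1] >= min_similarity]
    return sorted(kept, key=lambda hit: -hit[1])
```

Raising `min_similarity` trims the tail of loosely related hits; lowering it lets them through.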
See Configuration for the full config reference.
Embedding providers
FastEmbed (default)
- Runs locally — no API key, no network calls, no cost
- Model: `BAAI/bge-small-en-v1.5`
- Dimensions: 384
OpenAI
- Requires an OpenAI account and the `OPENAI_API_KEY` environment variable
- Model: `text-embedding-3-small`
- Dimensions: 1536
- Higher dimensions capture more nuance, which can improve results for large or domain-specific knowledge bases
Switching provider or model requires rebuilding the index:
bm reindex --embeddings
Reindexing
# Rebuild both search index and embeddings
bm reindex
# Rebuild only vector embeddings
bm reindex --embeddings
# Rebuild only full-text search index
bm reindex --search
# Reindex a specific project
bm reindex -p my-project
When you need to reindex
- First time — Happens automatically on first startup
- Switching provider or model — Embeddings from different models aren't compatible
- After a database reset — `bm reset` clears everything
- Troubleshooting — A fresh reindex can fix index issues
FAQ
Is this RAG?
Sort of. RAG (retrieval-augmented generation) usually means fetching relevant documents and feeding them to an AI as context. Basic Memory does that — but it's tighter than a typical RAG pipeline in two ways. First, your AI reads and writes notes directly through MCP tools, and search is just one of several ways it navigates your knowledge. It's less "retrieve and generate" and more "the AI already lives in your notes." Second, most RAG setups assume static content — you load documents once and query them. Basic Memory watches for changes and selectively re-indexes only the chunks that were modified. Your knowledge base is alive, and the search index stays current as you and your AI work.
Do I need an API key to use semantic search?
No. The default embedding provider (FastEmbed) runs entirely on your machine — no API key, no network calls, no cost. If you want to use OpenAI embeddings instead, that does require an OpenAI API key. See Embedding providers.
Does semantic search send my notes to the cloud?
Not with the default setup. FastEmbed generates embeddings locally on your machine. Nothing leaves your computer. If you switch to OpenAI embeddings, your note text is sent to OpenAI's API for embedding — but the resulting vectors are stored locally.
How many notes can it handle?
Thousands without issue. The initial indexing takes a few minutes for a few hundred notes, and after that new or edited notes are indexed incrementally. The SQLite-based vector store is fast for typical personal knowledge bases.
My search results aren't great. What can I do?
A few things to try:
- Be more specific in your query. "authentication" is vague; "how we handle JWT token refresh" gives the vector search more meaning to work with.
- Lower the similarity threshold. The default `0.55` filters out loosely related results. Try `0.3` if you want broader recall. See Configuration.
- Check that embeddings are built. Run `bm reindex --embeddings` to make sure the index is up to date.
- Try a different search mode. If you're looking for an exact term, use text search. If you're exploring a concept, use vector search directly instead of hybrid.
Can I use a different embedding model?
Yes. You can switch between FastEmbed (local, free) and OpenAI (API, paid). See Embedding providers. After switching, run bm reindex --embeddings since embeddings from different models aren't compatible.
Does Basic Memory Cloud support semantic search?
Yes. Cloud instances have semantic search enabled by default, using the same hybrid search. No setup required — it works the same way as local.
What inspired Basic Memory's search?
Basic Memory's hybrid search draws on ideas from the open source community — in particular QMD, which combines BM25 keyword search with vector semantic search in a local-first design. We loved the approach and wanted it integrated into Basic Memory's read-write knowledge system, where the AI doesn't just search your notes but actively works in them.

