Semantic Search
Most knowledge tools make you organize everything up front so you can find it later. Basic Memory takes the opposite approach — write naturally, and let search figure out the connections.
When you or your AI save a note, Basic Memory indexes it two ways: by the exact words it contains, and by what it means. Search for "login security improvements" and find notes about "authentication hardening" even though the exact words don't match. This happens automatically — no tagging system to maintain, no folder hierarchy to get right.
What makes this different from a typical search tool is that your AI is part of the loop. It writes notes, searches them, and builds on what it finds — all through the same tools. The search index updates incrementally as notes change, so the AI is always working with current knowledge, not a stale snapshot.
Search modes
Basic Memory supports three search modes. Your AI already knows when to use each one — just describe what you're looking for in plain language. You can also use them directly from the command line.
Text
Keyword search. You type words, it finds notes containing those words. Supports boolean operators like AND and OR.
You: "Search my notes for project AND planning"
AI: [Finds notes containing both words]
bm tool search-notes --query "project AND planning" --search-type text
Best when you know the exact terms — names, identifiers, tags.
Vector
Meaning-based search. Your query is compared against all indexed content by meaning, not exact words. Searching for "ways to make the frontend faster" finds notes about "React performance optimization" and "bundle size reduction."
You: "Find notes about ways to make the frontend faster"
AI: [Finds notes about React performance, bundle size, lazy loading]
bm tool search-notes --query "ways to make the frontend faster" --search-type vector
Best for conceptual queries when you aren't sure what words were used.
Hybrid (default)
Runs both text and vector search, then combines results. Notes found by both methods rank highest. Notes found by only one method still appear but rank lower.
You: "Search for authentication security"
AI: [Finds notes matching the words AND notes about related concepts]
bm tool search-notes --query "authentication security"
This is the default. You get keyword precision plus meaning-based recall.
You don't need to pick
Basic Memory describes each search mode to your AI, so it already knows when to use text, vector, or hybrid search. Just ask for what you want in plain language:
- "Find my notes about authentication security" — the AI runs a hybrid search
- "Search for the exact phrase project AND planning" — the AI switches to text search
- "What have I written about handling failures gracefully?" — the AI uses vector search
If you're using the CLI directly, hybrid is the default and covers most cases. Use --search-type text when you need exact keyword matches, or --search-type vector when you're exploring by concept.
Filtering
Tag search
Search by tag using a simple prefix:
You: "Search for notes tagged security"
You: "Search for tag:coffee AND tag:brewing"
bm tool search-notes --query "tag:security"
Metadata filters
Search by note properties — like status, tags, or any custom fields — without needing a text query:
You: "Show me all my in-progress notes"
You: "Find notes tagged security and oauth"
You: "What are my high-priority items?"
bm tool search-notes --metadata '{"status": "in-progress"}'
bm tool search-notes --tags security
Combining text and filters
Combine a text or meaning-based search with property filters:
You: "Search for authentication notes that are specs"
You: "Find active error handling notes"
bm tool search-notes --query "authentication" --note-types spec
bm tool search-notes --query "error handling" --metadata '{"status": "active"}'
Enabling semantic search
Semantic search works automatically — there's nothing to set up. It's included and enabled by default in all standard Basic Memory installs (Homebrew and uv). Embeddings are generated automatically on first startup for existing notes.
For a few hundred notes, expect 1–3 minutes for the initial index build. After that, new and edited notes are indexed incrementally during normal sync.
To manually rebuild the search index (e.g., after switching providers):
bm reindex --embeddings
Platform compatibility
| Platform | FastEmbed (local) | OpenAI (API) |
|---|---|---|
| macOS ARM64 (Apple Silicon) | Yes | Yes |
| macOS x86_64 (Intel Mac) | No (see workaround) | Yes |
| Linux x86_64 | Yes | Yes |
| Linux ARM64 | Yes | Yes |
| Windows x86_64 | Yes | Yes |
Intel Mac workaround
The default local embedding model doesn't run on Intel Macs. You have two options:
Option 1: Use OpenAI embeddings (recommended)
Requires an OpenAI account — create an API key and set it as an environment variable.
export BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
bm reindex --embeddings
Option 2: Pin ONNX Runtime
uv pip install 'onnxruntime<1.24'
bm reindex --embeddings
Under the Hood
You don't need to read this section to use search — it's here for the curious.
Chunking
Notes are not searched as whole documents. Basic Memory breaks each note into smaller pieces before indexing, so search can surface the specific part of a note that's relevant.
The chunking follows the note's structure:
- Headers create chunk boundaries — each section becomes its own chunk
- Observations (categorized facts like `- [technique] Water temperature...`) are indexed individually
- Relations (links to other notes like `- works_at [[Company]]`) are indexed individually
- Prose paragraphs are merged into chunks of ~900 characters with ~120 characters of overlap at boundaries
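The chunking strategy above can be sketched in a few lines of Python. This is an illustrative simplification, not Basic Memory's actual chunker; the `chunk_note` function and its defaults are hypothetical and simply mirror the header boundaries, ~900-character chunks, and ~120-character overlap described here:

```python
def chunk_note(markdown: str, max_chars: int = 900, overlap: int = 120) -> list[str]:
    """Split a note into chunks: headers start new chunks, and oversized
    chunks are split with a character overlap carried across boundaries."""
    sections: list[str] = []
    current = ""
    for line in markdown.splitlines():
        if line.startswith("#") and current.strip():
            sections.append(current.strip())  # a header closes the previous chunk
            current = ""
        current += line + "\n"
    if current.strip():
        sections.append(current.strip())

    # Split any oversized section, keeping `overlap` characters of context
    chunks: list[str] = []
    for section in sections:
        start = 0
        while start < len(section):
            chunks.append(section[start:start + max_chars])
            if start + max_chars >= len(section):
                break
            start += max_chars - overlap
    return chunks
```

A real implementation would also treat observations and relations as their own chunks; this sketch only shows the header-and-overlap mechanics.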
This means a search for "water temperature for brewing" can surface the specific fact "Water temperature at 205°F extracts optimal compounds" rather than returning the entire "Coffee Brewing Methods" note.
How results are ranked
Search results return whole notes ranked by relevance, with the matched chunk text showing which part of the note was most relevant.
Hybrid fusion
Hybrid mode runs text and vector search independently, then merges results using score-based fusion:
Results found by both keyword and meaning match rank highest. The dominant signal (whichever source scored higher) is preserved, while the weaker signal adds a 30% bonus. Items found by only one source keep their original score.
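That fusion rule can be sketched as follows. This is an illustration of the scoring logic described above, not Basic Memory's internal API; the `fuse` function and its dictionary inputs are hypothetical:

```python
def fuse(text_scores: dict[str, float], vector_scores: dict[str, float]) -> dict[str, float]:
    """Score-based fusion: keep the stronger signal and add a 30% bonus
    from the weaker one. Items found by only one source keep their score."""
    fused: dict[str, float] = {}
    for item in text_scores.keys() | vector_scores.keys():
        t = text_scores.get(item)
        v = vector_scores.get(item)
        if t is not None and v is not None:
            fused[item] = max(t, v) + 0.3 * min(t, v)  # found by both sources
        else:
            fused[item] = t if t is not None else v    # single-source result
    return fused
```

With these mechanics, a note that scores 0.8 on keywords and 0.6 on meaning fuses to 0.98 and outranks a note that scores 0.9 on meaning alone.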
Deduplication
Each chunk has a content hash. When notes are re-synced or reindexed, unchanged chunks skip re-indexing. Only modified content triggers new embeddings. Editing one note in a thousand-note knowledge base only re-indexes the chunks that changed.
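The hash-based skip can be illustrated with a short sketch (not Basic Memory's actual code; `reindex_chunks` and the `embed` callback are hypothetical names):

```python
import hashlib

def reindex_chunks(chunks: list[str], index: dict[str, list[float]], embed) -> int:
    """Embed only chunks whose content hash is not already in the index.
    Returns the number of chunks that were (re)embedded."""
    embedded = 0
    for chunk in chunks:
        key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if key not in index:          # unchanged chunks are skipped entirely
            index[key] = embed(chunk)
            embedded += 1
    return embedded
```

Because the hash is derived from the chunk's content, editing one paragraph changes only that chunk's key, and everything else is skipped on the next sync.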
Configuration
| Config field | Env var | Default | Description |
|---|---|---|---|
| semantic_search_enabled | BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED | true | Enable semantic search |
| semantic_embedding_provider | BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER | "fastembed" | "fastembed" (local) or "openai" (API) |
| semantic_embedding_model | BASIC_MEMORY_SEMANTIC_EMBEDDING_MODEL | "bge-small-en-v1.5" | Embedding model identifier |
| semantic_embedding_dimensions | BASIC_MEMORY_SEMANTIC_EMBEDDING_DIMENSIONS | auto-detected | 384 (FastEmbed), 1536 (OpenAI) |
| semantic_embedding_batch_size | BASIC_MEMORY_SEMANTIC_EMBEDDING_BATCH_SIZE | 64 | Texts per embedding batch |
| semantic_vector_k | BASIC_MEMORY_SEMANTIC_VECTOR_K | 100 | Vector candidate count |
| semantic_min_similarity | BASIC_MEMORY_SEMANTIC_MIN_SIMILARITY | 0.55 | Minimum similarity threshold |
Similarity threshold
The semantic_min_similarity setting controls which results are "similar enough" to return. The value ranges from 0.0 to 1.0:
- Higher (e.g., `0.7`) — Fewer results, stronger relevance. Good for focused queries.
- Lower (e.g., `0.3`) — More results, looser associations. Good for exploration.
- `0.0` — Disables filtering. All vector results returned regardless of score.
The default 0.55 is a reasonable middle ground.
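To make the threshold concrete, here is an illustrative sketch of similarity filtering (not Basic Memory's implementation; `cosine` and `filter_hits` are hypothetical names, and real embeddings have hundreds of dimensions rather than two):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def filter_hits(query_vec, docs, min_similarity: float = 0.55):
    """Score each document vector against the query and keep hits at or
    above the threshold, best first. A threshold of 0.0 keeps everything."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in docs]
    kept = [hit for hit in scored if hit[1] >= min_similarity]
    return sorted(kept, key=lambda hit: -hit[1])
```

Raising `min_similarity` trims the tail of loosely related hits; lowering it lets them through.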
See Configuration for the full config reference.
Embedding providers
FastEmbed (default)
- Runs locally — no API key, no network calls, no cost
- Model: `BAAI/bge-small-en-v1.5`
- Dimensions: 384
OpenAI
- Requires an OpenAI account and the `OPENAI_API_KEY` environment variable
- Model: `text-embedding-3-small`
- Dimensions: 1536
- Higher dimensions capture more nuance, which can improve results for large or domain-specific knowledge bases
Switching provider or model requires rebuilding the index:
bm reindex --embeddings
Reindexing
# Rebuild both search index and embeddings
bm reindex
# Rebuild only vector embeddings
bm reindex --embeddings
# Rebuild only full-text search index
bm reindex --search
# Reindex a specific project
bm reindex -p my-project
When you need to reindex
- First time — Happens automatically on first startup
- Switching provider or model — Embeddings from different models aren't compatible
- After a database reset — `bm reset` clears everything
- Troubleshooting — A fresh reindex can fix index issues
FAQ
Is this RAG?
Sort of. RAG (retrieval-augmented generation) usually means fetching relevant documents and feeding them to an AI as context. Basic Memory does that — but it's tighter than a typical RAG pipeline in two ways. First, your AI reads and writes notes directly through MCP tools, and search is just one of several ways it navigates your knowledge. It's less "retrieve and generate" and more "the AI already lives in your notes." Second, most RAG setups assume static content — you load documents once and query them. Basic Memory watches for changes and selectively re-indexes only the chunks that were modified. Your knowledge base is alive, and the search index stays current as you and your AI work.
Do I need an API key to use semantic search?
No. The default embedding provider (FastEmbed) runs entirely on your machine — no API key, no network calls, no cost. If you want to use OpenAI embeddings instead, that does require an OpenAI API key. See Embedding providers.
Does semantic search send my notes to the cloud?
Not with the default setup. FastEmbed generates embeddings locally on your machine. Nothing leaves your computer. If you switch to OpenAI embeddings, your note text is sent to OpenAI's API for embedding — but the resulting vectors are stored locally.
How many notes can it handle?
Thousands without issue. The initial indexing takes a few minutes for a few hundred notes, and after that new or edited notes are indexed incrementally. The SQLite-based vector store is fast for typical personal knowledge bases.
My search results aren't great. What can I do?
A few things to try:
- Be more specific in your query. "authentication" is vague; "how we handle JWT token refresh" gives the vector search more meaning to work with.
- Lower the similarity threshold. The default `0.55` filters out loosely related results. Try `0.3` if you want broader recall. See Configuration.
- Check that embeddings are built. Run `bm reindex --embeddings` to make sure the index is up to date.
- Try a different search mode. If you're looking for an exact term, use text search. If you're exploring a concept, use vector search directly instead of hybrid.
Can I use a different embedding model?
Yes. You can switch between FastEmbed (local, free) and OpenAI (API, paid). See Embedding providers. After switching, run bm reindex --embeddings since embeddings from different models aren't compatible.
Does Basic Memory Cloud support semantic search?
Yes. Cloud instances have semantic search enabled by default, using the same hybrid search. No setup required — it works the same way as local.
What inspired Basic Memory's search?
Basic Memory's hybrid search draws on ideas from the open source community — in particular QMD, which combines BM25 keyword search with vector semantic search in a local-first design. We loved the approach and wanted it integrated into Basic Memory's read-write knowledge system, where the AI doesn't just search your notes but actively works in them.

