# Semantic Search

> Meaning-based retrieval with vector and hybrid search in Basic Memory.

Most knowledge tools make you organize everything up front so you can find it later. Basic Memory takes the opposite approach — write naturally, and let search figure out the connections.

When you or your AI save a note, Basic Memory indexes it two ways: by the exact words it contains, and by what it *means*. Search for "login security improvements" and find notes about "authentication hardening" even though the exact words don't match. This happens automatically — no tagging system to maintain, no folder hierarchy to get right.

What makes this different from a typical search tool is that your AI is part of the loop. It writes notes, searches them, and builds on what it finds — all through the same tools. The search index updates incrementally as notes change, so the AI is always working with current knowledge, not a stale snapshot.

<mermaid code="flowchart LR
    Q[Query] --> T[Text Search]
    Q --> V[Vector Search]
    T --> SF[Score Fusion]
    V --> SF
    SF --> R[Results]
">



</mermaid>

---

## Search modes

Basic Memory supports three search modes. Your AI already knows when to use each one — just describe what you're looking for in plain language. You can also use them directly from the command line.

### Text

Keyword search. You type words, it finds notes containing those words. Supports boolean operators like `AND` and `OR`.

<code-group>

```text [Conversation]
You: "Search my notes for project AND planning"
AI: [Finds notes containing both words]
```

```bash [CLI]
bm tool search-notes --query "project AND planning" --search-type text
```

</code-group>

Best when you know the exact terms — names, identifiers, tags.

### Vector

Meaning-based search. Your query is compared against all indexed content by meaning, not exact words. Searching for "ways to make the frontend faster" finds notes about "React performance optimization" and "bundle size reduction."

<code-group>

```text [Conversation]
You: "Find notes about ways to make the frontend faster"
AI: [Finds notes about React performance, bundle size, lazy loading]
```

```bash [CLI]
bm tool search-notes --query "ways to make the frontend faster" --search-type vector
```

</code-group>

Best for conceptual queries when you aren't sure what words were used.

### Hybrid (default)

Runs both text and vector search, then combines results. Notes found by both methods rank highest. Notes found by only one method still appear but rank lower.

<code-group>

```text [Conversation]
You: "Search for authentication security"
AI: [Finds notes matching the words AND notes about related concepts]
```

```bash [CLI]
bm tool search-notes --query "authentication security"
```

</code-group>

This is the default. You get keyword precision plus meaning-based recall.

### You don't need to pick

Basic Memory describes each search mode to your AI, so it already knows when to use text, vector, or hybrid search. Just ask for what you want in plain language:

- "Find my notes about authentication security": the AI runs a hybrid search
- "Search for notes containing both project AND planning": the AI switches to text search
- "What have I written about handling failures gracefully?": the AI uses vector search

If you're using the CLI directly, hybrid is the default and covers most cases. Use `--search-type text` when you need exact keyword matches, or `--search-type vector` when you're exploring by concept.

---

## Filtering

### Tag search

Search by tag using the `tag:` prefix:

<code-group>

```text [Conversation]
You: "Search for notes tagged security"
You: "Search for tag:coffee AND tag:brewing"
```

```bash [CLI]
bm tool search-notes --query "tag:security"
```

</code-group>

### Metadata filters

Search by note properties — like status, tags, or any custom frontmatter field — without needing a text query:

<code-group>

```text [Conversation]
You: "Show me all my in-progress notes"
You: "Find notes tagged security and oauth"
You: "What are my high-priority items?"
```

```bash [CLI]
bm tool search-notes "" --meta status=in-progress
bm tool search-notes "" --tag security --tag oauth
bm tool search-notes "" --filter '{"priority": {"$in": ["high", "critical"]}}'
```

</code-group>
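
To make the `--filter` example concrete, here is a minimal Python sketch of how a filter such as `{"priority": {"$in": ["high", "critical"]}}` could be checked against a note's frontmatter. The `matches` function is hypothetical and handles only plain equality and `$in`; it illustrates the matching idea and is not Basic Memory's implementation.

```python
def matches(frontmatter, filters):
    """Return True if a note's frontmatter satisfies every filter.

    Hypothetical helper: supports plain equality and the `$in` operator only.
    """
    for field, condition in filters.items():
        value = frontmatter.get(field)
        if isinstance(condition, dict) and "$in" in condition:
            # `$in` passes when the field's value is one of the listed options.
            if value not in condition["$in"]:
                return False
        elif value != condition:
            # A bare value means the field must equal it exactly.
            return False
    return True


note = {"status": "in-progress", "priority": "high"}
print(matches(note, {"priority": {"$in": ["high", "critical"]}}))
print(matches(note, {"status": "done"}))
```

The first call matches because `high` is in the allowed list; the second does not because the note's status differs.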

<tip>

See [Metadata Search](/concepts/metadata-search) for the full operator reference, nested-field syntax, and worked examples.

</tip>

### Combining text and filters

Combine a text or meaning-based search with property filters:

<code-group>

```text [Conversation]
You: "Search for authentication notes that are specs"
You: "Find active error handling notes"
```

```bash [CLI]
bm tool search-notes "authentication" --type spec
bm tool search-notes "error handling" --meta status=active
```

</code-group>

---

## Enabling semantic search

On supported platforms, semantic search works automatically, with nothing to set up. It's included and enabled by default in all standard Basic Memory installs (Homebrew and uv), and embeddings for existing notes are generated on first startup. Intel Macs are the exception: see the [Intel Mac workaround](#intel-mac-workaround).

For a few hundred notes, expect 1–3 minutes for the initial index build. After that, new and edited notes are indexed incrementally during normal sync.

To manually rebuild the embeddings (e.g., after switching providers):

```bash
bm reindex --embeddings
```

### Platform compatibility

<table>
<thead>
  <tr>
    <th>
      Platform
    </th>
    
    <th>
      FastEmbed (local)
    </th>
    
    <th>
      OpenAI (API)
    </th>
  </tr>
</thead>

<tbody>
  <tr>
    <td>
      macOS ARM64 (Apple Silicon)
    </td>
    
    <td>
      Yes
    </td>
    
    <td>
      Yes
    </td>
  </tr>
  
  <tr>
    <td>
      macOS x86_64 (Intel Mac)
    </td>
    
    <td>
      No (see workaround)
    </td>
    
    <td>
      Yes
    </td>
  </tr>
  
  <tr>
    <td>
      Linux x86_64
    </td>
    
    <td>
      Yes
    </td>
    
    <td>
      Yes
    </td>
  </tr>
  
  <tr>
    <td>
      Linux ARM64
    </td>
    
    <td>
      Yes
    </td>
    
    <td>
      Yes
    </td>
  </tr>
  
  <tr>
    <td>
      Windows x86_64
    </td>
    
    <td>
      Yes
    </td>
    
    <td>
      Yes
    </td>
  </tr>
</tbody>
</table>

### Intel Mac workaround

The default local embedding provider (FastEmbed) doesn't run on Intel Macs out of the box. You have two options:

**Option 1: Use OpenAI embeddings (recommended)**

Requires an [OpenAI API subscription](https://platform.openai.com/api-keys) — create a key there and set it as an environment variable.

```bash
export BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
bm reindex --embeddings
```

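**Option 2: Pin ONNX Runtime**

To keep using local embeddings, install an ONNX Runtime release older than 1.24, then rebuild the embeddings: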

```bash
uv pip install 'onnxruntime<1.24'
bm reindex --embeddings
```

---

## Under the hood

You don't need to read this section to use search — it's here for the curious.

### Chunking

Notes are not searched as whole documents. Basic Memory breaks each note into smaller pieces before indexing, so search can surface the specific *part* of a note that's relevant.

<mermaid code="flowchart TD
    N[Note] --> H[Headers]
    N --> O[Observations]
    N --> R[Relations]
    N --> P[Prose]
    H --> C1[Section Chunks]
    O --> C2[Observation Chunks]
    R --> C3[Relation Chunks]
    P --> C4[Merged Chunks ~900 chars]
    C1 --> E[Embeddings]
    C2 --> E
    C3 --> E
    C4 --> E
">



</mermaid>

The chunking follows the note's structure:

- **Headers** create chunk boundaries — each section becomes its own chunk
- **Observations** (categorized facts like `- [technique] Water temperature...`) are indexed individually
- **Relations** (links to other notes like `- works_at [[Company]]`) are indexed individually
- **Prose** paragraphs are merged into chunks of ~900 characters with ~120 character overlap at boundaries

This means a search for "water temperature for brewing" can surface the specific fact `Water temperature at 205°F extracts optimal compounds` instead of leaving you to scan the entire "Coffee Brewing Methods" note.
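
The prose-merging rule above can be sketched in a few lines of Python. This is a simplified illustration, not Basic Memory's code: the function name `chunk_prose` and its parameters are hypothetical, and the real chunker may split on different boundaries.

```python
def chunk_prose(paragraphs, target_chars=900, overlap_chars=120):
    """Merge paragraphs into chunks of roughly `target_chars` characters.

    Each new chunk starts with the last `overlap_chars` characters of the
    previous one, so text near a boundary appears in both chunks.
    A single paragraph longer than the target is kept whole in this sketch.
    """
    chunks = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > target_chars:
            # Close the current chunk and seed the next with its tail.
            chunks.append(current)
            current = f"{current[-overlap_chars:]}\n\n{para}"
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


paragraphs = ["a" * 500, "b" * 500, "c" * 500]
for chunk in chunk_prose(paragraphs):
    print(len(chunk), repr(chunk[:10]))
```

The overlap is what lets a query match a sentence that would otherwise be cut in half at a chunk boundary.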

### How results are ranked

Search returns whole notes ranked by relevance. Each result includes the matched chunk text, so you can see which part of the note was most relevant.

### Hybrid fusion

Hybrid mode runs text and vector search independently, then merges results using score-based fusion:

<mermaid code="flowchart LR
    Q[Query] --> FTS[Text Search]
    Q --> VS[Vector Search]
    FTS --> R1[Scored Results]
    VS --> R2[Scored Results]
    R1 --> SF[Score Fusion]
    R2 --> SF
    SF --> F[Final Ranking]
">



</mermaid>

Results found by both keyword and meaning match rank highest. The dominant signal (whichever source scored higher) is preserved, while the weaker signal adds a 30% bonus. Items found by only one source keep their original score.
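
One way to read the fusion rule is sketched below. It assumes both searches produce scores on a comparable 0 to 1 scale and that the bonus is 30% of the weaker score; the source does not spell out either detail, and `fuse_scores` is a hypothetical name, so treat this as an illustration rather than the actual formula.

```python
def fuse_scores(text_scores, vector_scores, bonus=0.3):
    """Merge two {note_id: score} mappings into one ranked list.

    Assumption: for notes found by both searches, the fused score is the
    higher score plus `bonus` times the lower score.
    """
    fused = {}
    for note_id in text_scores.keys() | vector_scores.keys():
        t = text_scores.get(note_id)
        v = vector_scores.get(note_id)
        if t is not None and v is not None:
            # Found by both: keep the dominant signal, add a bonus from the weaker one.
            fused[note_id] = max(t, v) + bonus * min(t, v)
        else:
            # Found by only one search: keep the original score.
            fused[note_id] = t if t is not None else v
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


text_hits = {"auth-hardening": 0.8, "login-form": 0.5}
vector_hits = {"auth-hardening": 0.6, "session-security": 0.9}
for note_id, score in fuse_scores(text_hits, vector_hits):
    print(f"{note_id}: {score:.2f}")
```

Under these assumptions, a note with a strong keyword hit and a moderate vector hit outranks a note with only a strong vector hit.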

### Deduplication

Each chunk has a content hash. When notes are re-synced or reindexed, unchanged chunks skip re-indexing. Only modified content triggers new embeddings. Editing one note in a thousand-note knowledge base only re-indexes the chunks that changed.
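
The hash check can be pictured with a short Python sketch. The choice of SHA-256, the chunk IDs, and the function names here are assumptions made for illustration; the source only says that each chunk has a content hash.

```python
import hashlib


def content_hash(text):
    """Hash a chunk's text (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chunks_to_embed(chunks, stored_hashes):
    """Return only the chunks whose text changed since the last index run.

    chunks: {chunk_id: text} for the current note contents.
    stored_hashes: {chunk_id: hash} recorded during the previous run.
    """
    changed = {}
    for chunk_id, text in chunks.items():
        if stored_hashes.get(chunk_id) != content_hash(text):
            # New or modified chunk: it needs a fresh embedding.
            changed[chunk_id] = text
    return changed


stored = {"note-1#0": content_hash("alpha"), "note-1#1": content_hash("beta")}
current = {"note-1#0": "alpha", "note-1#1": "beta (edited)", "note-1#2": "gamma"}
print(sorted(chunks_to_embed(current, stored)))
```

Here only the edited chunk and the new chunk are selected; the unchanged one is skipped.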

---

## Configuration

<table>
<thead>
  <tr>
    <th>
      Config field
    </th>
    
    <th>
      Env var
    </th>
    
    <th>
      Default
    </th>
    
    <th>
      Description
    </th>
  </tr>
</thead>

<tbody>
  <tr>
    <td>
      <code>
        semantic_search_enabled
      </code>
    </td>
    
    <td>
      <code>
        BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED
      </code>
    </td>
    
    <td>
      <code>
        true
      </code>
    </td>
    
    <td>
      Enable semantic search
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        semantic_embedding_provider
      </code>
    </td>
    
    <td>
      <code>
        BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER
      </code>
    </td>
    
    <td>
      <code>
        "fastembed"
      </code>
    </td>
    
    <td>
      <code>
        "fastembed"
      </code>
      
       (local) or <code>
        "openai"
      </code>
      
       (API)
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        semantic_embedding_model
      </code>
    </td>
    
    <td>
      <code>
        BASIC_MEMORY_SEMANTIC_EMBEDDING_MODEL
      </code>
    </td>
    
    <td>
      <code>
        "bge-small-en-v1.5"
      </code>
    </td>
    
    <td>
      Embedding model identifier
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        semantic_embedding_dimensions
      </code>
    </td>
    
    <td>
      <code>
        BASIC_MEMORY_SEMANTIC_EMBEDDING_DIMENSIONS
      </code>
    </td>
    
    <td>
      auto-detected
    </td>
    
    <td>
      384 (FastEmbed), 1536 (OpenAI)
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        semantic_embedding_batch_size
      </code>
    </td>
    
    <td>
      <code>
        BASIC_MEMORY_SEMANTIC_EMBEDDING_BATCH_SIZE
      </code>
    </td>
    
    <td>
      <code>
        64
      </code>
    </td>
    
    <td>
      Texts per embedding batch
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        semantic_vector_k
      </code>
    </td>
    
    <td>
      <code>
        BASIC_MEMORY_SEMANTIC_VECTOR_K
      </code>
    </td>
    
    <td>
      <code>
        100
      </code>
    </td>
    
    <td>
      Number of candidates vector search retrieves
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        semantic_min_similarity
      </code>
    </td>
    
    <td>
      <code>
        BASIC_MEMORY_SEMANTIC_MIN_SIMILARITY
      </code>
    </td>
    
    <td>
      <code>
        0.55
      </code>
    </td>
    
    <td>
      Minimum similarity threshold
    </td>
  </tr>
</tbody>
</table>

### Similarity threshold

The `semantic_min_similarity` setting controls which results are "similar enough" to return. The value ranges from `0.0` to `1.0`:

- **Higher** (e.g., `0.7`) — Fewer results, stronger relevance. Good for focused queries.
- **Lower** (e.g., `0.3`) — More results, looser associations. Good for exploration.
- **0.0** — Disables filtering. All vector results returned regardless of score.

The default `0.55` is a reasonable middle ground.
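
To show what the threshold does, here is a small Python sketch that scores toy vectors and drops anything below the cutoff. It assumes the similarity measure is cosine similarity, which the source does not state, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_similarity(query_vec, candidates, min_similarity=0.55):
    """Score {chunk_id: vector} candidates and keep those above the threshold.

    A threshold of 0.0 disables filtering, mirroring the setting described above.
    """
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in candidates.items()]
    if min_similarity > 0.0:
        scored = [(cid, score) for cid, score in scored if score >= min_similarity]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


query = [1.0, 0.0]
candidates = {"close": [0.9, 0.1], "related": [0.5, 0.5], "unrelated": [0.0, 1.0]}
for threshold in (0.7, 0.55, 0.0):
    kept = [cid for cid, _ in filter_by_similarity(query, candidates, threshold)]
    print(threshold, kept)
```

With these toy vectors, raising the threshold trims loosely related chunks first, and `0.0` returns every candidate.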

See [Configuration](/reference/configuration) for the full config reference.

---

## Embedding providers

### FastEmbed (default)

- Runs locally — no API key, no network calls, no cost
- Model: `BAAI/bge-small-en-v1.5`
- Dimensions: 384

### OpenAI

- Requires an [OpenAI API subscription](https://platform.openai.com/api-keys) and `OPENAI_API_KEY` environment variable
- Model: `text-embedding-3-small`
- Dimensions: 1536
- Higher dimensions capture more nuance, which can improve results for large or domain-specific knowledge bases

Switching provider or model requires rebuilding the index:

```bash
bm reindex --embeddings
```

---

## Reindexing

```bash
# Rebuild both search index and embeddings
bm reindex

# Rebuild only vector embeddings
bm reindex --embeddings

# Rebuild only full-text search index
bm reindex --search

# Reindex a specific project
bm reindex -p my-project
```

### When you need to reindex

- **First time** — Happens automatically on first startup
- **Switching provider or model** — Embeddings from different models aren't compatible
- **After a database reset** — `bm reset` clears everything
- **Troubleshooting** — A fresh reindex can fix index issues

---

## FAQ

### Is this RAG?

Sort of. RAG (retrieval-augmented generation) usually means fetching relevant documents and feeding them to an AI as context. Basic Memory does that — but it's tighter than a typical RAG pipeline in two ways. First, your AI reads and writes notes directly through MCP tools, and search is just one of several ways it navigates your knowledge. It's less "retrieve and generate" and more "the AI already lives in your notes." Second, most RAG setups assume static content — you load documents once and query them. Basic Memory watches for changes and selectively re-indexes only the chunks that were modified. Your knowledge base is alive, and the search index stays current as you and your AI work.

### Do I need an API key to use semantic search?

No. The default embedding provider (FastEmbed) runs entirely on your machine — no API key, no network calls, no cost. If you want to use OpenAI embeddings instead, that does require an OpenAI API key. See [Embedding providers](#embedding-providers).

### Does semantic search send my notes to the cloud?

Not with the default setup. FastEmbed generates embeddings locally on your machine. Nothing leaves your computer. If you switch to OpenAI embeddings, your note text is sent to OpenAI's API for embedding — but the resulting vectors are stored locally.

### How many notes can it handle?

Thousands without issue. The initial indexing takes a few minutes for a few hundred notes, and after that new or edited notes are indexed incrementally. The SQLite-based vector store is fast for typical personal knowledge bases.

### My search results aren't great. What can I do?

A few things to try:

- **Be more specific in your query.** "authentication" is vague; "how we handle JWT token refresh" gives the vector search more meaning to work with.
- **Lower the similarity threshold.** The default `0.55` filters out loosely related results. Try `0.3` if you want broader recall. See [Configuration](#configuration).
- **Check that embeddings are built.** Run `bm reindex --embeddings` to make sure the index is up to date.
- **Try a different search mode.** If you're looking for an exact term, use text search. If you're exploring a concept, use vector search directly instead of hybrid.

### Can I use a different embedding model?

Yes. You can switch between FastEmbed (local, free) and OpenAI (API, paid). See [Embedding providers](#embedding-providers). After switching, run `bm reindex --embeddings` since embeddings from different models aren't compatible.

### Does Basic Memory Cloud support semantic search?

Yes. Cloud instances have semantic search enabled by default, using the same hybrid search. No setup required — it works the same way as local.

### What inspired Basic Memory's search?

Basic Memory's hybrid search draws on ideas from the open source community — in particular [QMD](https://github.com/tobi/qmd), which combines BM25 keyword search with vector semantic search in a local-first design. We loved the approach and wanted it integrated into Basic Memory's read-write knowledge system, where the AI doesn't just search your notes but actively works in them.

---

## Next steps

<card-group>
<card icon="i-lucide-wrench" title="MCP Tools Reference" to="/reference/mcp-tools-reference">

`search_notes` parameters and search modes.

</card>

<card icon="i-lucide-settings" title="Configuration" to="/reference/configuration">

All semantic settings and environment variables.

</card>

<card icon="i-lucide-shield-check" title="Schema System" to="/concepts/schema-system">

Define and validate note structure with schemas.

</card>
</card-group>
