Semantic Vault Search

Semantic Vault Search lets you search your vault by meaning, not just keywords. Powered by Google's File Search API, it indexes your notes in the background so the AI agent can find relevant content even when you don't remember exact words.

This feature is experimental. It works well for most vaults, but the underlying Google API has some limitations documented below.

How It Works

Indexing — The plugin uploads your vault files to a private Google File Search Store tied to your API key
Change tracking — File creates, edits, and deletes are automatically synced (with a 2-second debounce)
Semantic search — When the agent needs to find relevant notes, it queries the store using meaning-based search rather than keyword matching
Results — The agent receives text excerpts from matching files along with an AI-generated summary

The agent has access to this as the vault_semantic_search tool. You don't need to invoke it directly — just ask questions like "find my notes about project planning" and the agent will use it automatically.

Getting Started

Enable Vault Indexing

Open Settings → Gemini Scribe
Scroll to Vault Search Index
Toggle Enable vault indexing to ON
The initial indexing starts automatically with a progress modal

TIP

As of v4.5.0, enabling vault indexing starts immediately without requiring a plugin reload.

Initial indexing time depends on vault size. A vault with 1,000 notes typically takes a few minutes.

Configuration Options

Setting	Default	Description
Enable vault indexing	Off	Master toggle for the feature
Search index name	Auto-generated	Read-only. The Google File Search store identifier, assigned automatically when indexing starts
Auto-sync changes	On	Automatically update the index when files change
Include attachments	Off	Index PDFs, Office documents, and other supported file types beyond markdown
Exclude folders	None	Folders to skip during indexing (one per line). System folders like `.obsidian` and the plugin state folder are always excluded

Index Management

In the settings panel you'll find two action buttons:

Reindex Vault — Performs a full re-index. Smart sync skips files that haven't changed, so this is safe to run anytime.
Delete Index — Permanently removes your data from Google Cloud. Use this if you want a fresh start or are done with the feature.

Commands

Access these from the command palette (Ctrl/Cmd + P):

Command	Description
Pause RAG Sync	Pauses auto-sync. File changes still queue up but aren't sent to Google.
Resume RAG Sync	Resumes auto-sync and processes any queued changes.
Show RAG Status	Opens a detailed status modal with indexed files, failures, and actions.

Status Tracking

Status Bar

A small icon appears in the bottom-right status bar:

Icon	State	Meaning
Database	Idle	Index is up to date, shows file count
Upload (pulsing)	Indexing	Files are being uploaded, shows percentage
Clock	Rate limited	Waiting for API cooldown, shows countdown
Triangle	Error	Something went wrong — click for details

Click the status bar icon anytime to open the full status modal.

The status modal has three tabs:

Overview — Current status, file counts, pending changes, last sync time, and action buttons (Sync Now, Reindex All)
Files — Searchable list of all indexed files with timestamps
Failures — Any files that failed to index with error details

Interrupted Indexing

If Obsidian closes during indexing, the plugin detects this on next startup and asks whether to resume where you left off or start fresh. Smart sync means resuming is efficient — already-indexed files are skipped.

How the Agent Uses It

The agent has a vault_semantic_search tool with these parameters:

query (required) — The search question or topic
maxResults (optional) — Number of results, 1–20 (default: 5)
folder (optional) — Filter results to a specific folder path
tags (optional) — Filter results by Obsidian tags

Project scoping: When a session is linked to a project, searches are automatically limited to the project's root folder. You can override this by specifying a different folder value explicitly.

Example Interactions

You: Find my notes about machine learning algorithms

Agent: I'll search your vault semantically for notes about machine learning.
[Uses vault_semantic_search with query "machine learning algorithms"]
I found 5 relevant notes:
- "Neural Networks Overview" — discusses backpropagation and gradient descent...
- "ML Study Guide" — covers supervised vs unsupervised learning...

You: What did I write about project deadlines in my work folder?

Agent: [Uses vault_semantic_search with query "project deadlines", folder="work"]

Supported File Types

Markdown only (default):

.md files

With "Include attachments" enabled:

PDFs (.pdf)
Office documents (.docx, .xlsx, .pptx)
Text files (.txt, .json, .yaml, .csv)
Code files (.js, .ts, .py, .sh, etc.)
HTML, XML, and other text-based formats

Privacy & Data Storage

Where does my data go?

Files are uploaded to Google Cloud via the File Search API. Each API key has its own isolated set of File Search Stores — your data is private to your API key and is not shared with other users or used for model training.

What metadata is stored?

Along with file content, the plugin sends metadata for filtering:

Folder path
Tags (from frontmatter)
Aliases (from frontmatter)
File path

Local cache

A cache file (rag-index-cache.json) is stored in your plugin state folder. It tracks content hashes and timestamps to enable smart sync — only changed files are re-uploaded.

Protecting sensitive content

System folders (.obsidian, plugin state folder) are always excluded
Use Exclude folders in settings to skip sensitive directories (e.g., private/, journal/)
Keep Include attachments off if you don't want PDFs/documents indexed

Data deletion

When you disable vault indexing, a cleanup dialog asks whether to:

Keep data — Faster if you re-enable later
Delete from Google Cloud — Permanently removes the File Search Store

You can also use Delete Index in settings at any time.

Note: If you uninstall the plugin without deleting the index first, the data persists in Google Cloud. You can clean it up by re-enabling the plugin and using Delete Index, or by managing stores directly in Google AI Studio.

Known Limitations

No individual file deletion from the index — When you delete a vault file, it may remain as an orphan in Google Cloud. Use Reindex Vault or Delete Index for a clean state.
Search results don't include file paths — The Google API returns text excerpts but not source file paths. The agent may not always be able to link results back to specific files.
Rate limits during indexing — Large vaults may hit API rate limits. The plugin handles this automatically with exponential backoff (30s base, up to 5 minutes). The status bar shows a countdown.
Concurrent upload limit — Files are uploaded 5 at a time to balance speed with rate limit avoidance.

Troubleshooting

Indexing is very slow

Large vaults take time on the first index. Subsequent syncs are much faster due to smart sync (only changed files).
If you're hitting rate limits frequently, consider enabling billing on your API key for higher quotas.
Excluding large folders you don't need indexed helps significantly.

"File already exists" errors

This was fixed in v4.2+. Update to the latest version. The fix includes deferred initialization and improved cache handling.

Search results seem incomplete

Verify the file is actually indexed — open the status modal (Files tab) and search for it
Check that the file's folder isn't in the exclude list
If you recently enabled Include attachments, run Reindex Vault to pick up non-markdown files
Try increasing maxResults by asking the agent: "Search for X with more results"

Status shows errors

Click the status bar icon to see the Failures tab. Common causes:

Rate limits — Wait and retry, or enable billing for higher quotas
Invalid API key — Verify your key in Google AI Studio
Network issues — Check your internet connection

For persistent issues, try Delete Index followed by Reindex Vault for a clean start.

Semantic Vault Search ​

How It Works ​

Getting Started ​

Enable Vault Indexing ​

Configuration Options ​

Index Management ​

Commands ​

Status Tracking ​

Status Bar ​

Status Modal ​

Interrupted Indexing ​

How the Agent Uses It ​

Example Interactions ​

Supported File Types ​

Privacy & Data Storage ​

Where does my data go? ​

What metadata is stored? ​

Local cache ​

Protecting sensitive content ​

Data deletion ​

Known Limitations ​

Troubleshooting ​

Indexing is very slow ​

"File already exists" errors ​

Search results seem incomplete ​

Status shows errors ​

Further Reading ​