AlloyDB ScaNN
Hybrid SQL + vector search in one call — ScaNN reference notes.

Hard Filters and Vector Search in One Call
Mark's "key lime pie" discovery (Apr 28). ScaNN combines hard SQL filters with semantic vector search in a single SQL call. No two-pass cohort-then-rank dance.
Apr 29 directive (Mark, 5:53 AM): going "all in" on ScaNN. ScaNN-vs-HNSW is settled — ScaNN wins for our matching/scoring milestone. This doc reflects the all-in stance; HNSW is not on the table. Mintlify doc has the full rationale.
Setup
CREATE EXTENSION IF NOT EXISTS alloydb_scann CASCADE;
-- Auto-installs the `vector` extension as well.
Index Creation
Auto-tuned (recommended default)
CREATE INDEX investor_thesis_scann ON investors
USING scann (thesis_embedding cosine)
WITH (MODE='AUTO',
OPTIMIZATION='SEARCH_OPTIMIZED',
auto_maintenance='ON');
Manual two-level tree (when you want control)
CREATE INDEX investor_thesis_scann ON investors
USING scann (thesis_embedding cosine)
WITH (mode='MANUAL',
num_leaves=2000, -- ~sqrt(N) for N rows
quantizer='SQ8', -- SQ8 / AH / FLAT
auto_maintenance='ON');
Three-level tree (for multi-million row tables)
Add max_num_levels=2 to the WITH clause.
Distance functions: l2, dot_product, cosine. OpenAI embeddings are unit-normalized → cosine ≡ dot_product. Use cosine for clarity.
Distance Operators
| Operator | Meaning |
|---|---|
<-> | L2 distance |
<=> | Cosine distance |
<#> | Inner product |
Pure Vector Query
SELECT id, name, thesis_summary
FROM investors
ORDER BY thesis_embedding <=> $1::vector
LIMIT 50;
Hybrid Search (the killer pattern)
This is what Mark called the key lime pie. Hard filter + semantic vector ranking in one statement:
SELECT id, name, firm, thesis_summary,
thesis_embedding <=> $query::vector AS distance
FROM investors
WHERE
-- hard filters
active = true
AND check_size_min <= $target_check
AND check_size_max >= $target_check
AND $sector = ANY(focus_sectors)
AND geography && $allowed_geos
AND last_invested_at > NOW() - INTERVAL '18 months'
ORDER BY thesis_embedding <=> $query::vector
LIMIT 25;
Why this matters for find_investors: the 6 vector indexes Mark talked about (sector thesis, stage, check size, geography, signal text, founder fit) all sit on the same row. A single ScaNN-backed query ranks the post-filter cohort directly — no separate "pull cohort, then rank in app code" step.
Tuning Parameters
| Parameter | Purpose | Notes |
|---|---|---|
num_leaves | Tree partitions | ~√N for N rows is a good starting point |
max_num_levels | Tree depth | 1 = 2-level, 2 = 3-level, 3 = 4-level (preview) |
quantizer | Compression | SQ8 (default, fast), AH (smaller), FLAT (no compression) |
scann.num_leaves_to_search | Query-time recall vs speed | Higher = better recall, slower |
Preview flags:
SET scann.enable_preview_features = on;
SET scann.max_allowed_num_levels = 3;
Column Types
vector(1536)for OpenAI text-embedding-3-smallvector(3072)for OpenAI text-embedding-3-large- Cast
real[]columns:CAST(col AS vector(1536)) - Need ≥10k rows before auto-tuned ScaNN works
Why this beats FalkorDB for find_investors
| Concern | FalkorDB (current) | AlloyDB ScaNN |
|---|---|---|
| Hybrid filter+vector | Two-pass: cohort cypher then in-app rank | Single SQL call |
| Multiple thesis dimensions | One vector per node | Multiple vector columns per row |
| Latency | 8–10s end-to-end (LSI dogfood) | Target sub-second on indexed query |
| Operator surface | Cypher | Standard SQL — anyone can read it |
| Hosting | Self-managed | Google managed |
What we still need from Mark
- AlloyDB connection string for the sandbox
- Schema for the
investors/fundraising_thesistables - Decision on which 6 vector columns make it onto the row (sector / stage / check / geo / signal / founder?)
- BigQuery → AlloyDB sync cadence (Mark mentioned half-million people, hundreds of thousands of theses)
References
- https://docs.cloud.google.com/alloydb/docs/ai/store-index-query-vectors
- Apr 28 sync notes:
~/.claude/projects/.../memory/april-28-mark-sync.md