find_investors · Edge Cases

Edge Cases

24 documented edge cases across input, filter, synthesis, pipeline, and UI.

Edge Case Map

Twenty-Four Decision Branches Across the Pipeline

Things that go wrong in the real world. We need a documented behavior for each before we ship.

Input edge cases

#	Case	Behavior
1	User says "investors" but means "advisors"	Classifier returns confidence < 0.7 → ask back: "investors who write checks, or advisors?"
2	No deck uploaded, only a one-liner	Skip thesis-extract, run on the one-liner directly. Tag results `confidence: low_thesis`.
3	Deck uploaded but not a fundraising deck (e.g. all-hands)	Thesis-extract returns `null` → fall back to one-liner mode, surface a warning card.
4	User specifies stage but stage is impossible (Series A medtech with $50M ARR)	Classifier flags `stage_mismatch` → ask back.
5	Multilingual query	Translate to English first, run pipeline, do NOT translate WHY back (English investors, English memo).
6	Empty / single-word query ("investors")	Classifier returns `count=0`, `confidence=0.3` → render conversation starter cards instead.
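
The input cases above mostly reduce to a routing decision on the classifier result. A minimal sketch of that routing, assuming a hypothetical `Classification` shape (the field names, the `route` function, and the returned action strings are illustrative, not the real classifier contract); only the 0.7 threshold, `stage_mismatch`, and `count=0` come from the table:

```python
from dataclasses import dataclass, field

ASK_BACK_THRESHOLD = 0.7  # from case 1


@dataclass
class Classification:
    # Hypothetical shape: these field names are assumptions for illustration.
    intent: str
    confidence: float
    count: int = 0
    flags: set = field(default_factory=set)


def route(c: Classification) -> str:
    """Pick the next UI action for a classifier result."""
    # Case 6: nothing usable in the query. Checked first, because its
    # confidence (0.3) would otherwise also trip the ask-back threshold.
    if c.count == 0:
        return "conversation_starters"
    # Case 4: the stated stage contradicts the other facts.
    if "stage_mismatch" in c.flags:
        return "ask_back"
    # Case 1: ambiguous intent (investors vs. advisors).
    if c.confidence < ASK_BACK_THRESHOLD:
        return "ask_back"
    return "run_pipeline"
```

The ordering is the design choice worth reviewing: an empty query and an ambiguous query both score low, so whichever check runs first decides which one the user sees.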

Filter edge cases

#	Case	Behavior
7	Filters too narrow → 0 hits	Render "no matches under these filters" + 3 suggested loosenings (drop geo, expand stage, raise check ceiling)
8	Filters too broad → 5000+ hits	ScaNN LIMIT 25 already truncates. Surface a banner: "showing top 25 of N — narrow filters?"
9	Recency-of-contact filter excludes a strong match	Keep the match but tag `recent_contact: true`, demote in ranking
10	User in a region with no investor coverage in graph	Surface honestly: "graph has no investors in your region. Closest coverage: …"
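
Cases 7 to 9 can be sketched as two small helpers. This is an illustration under assumptions: the function names, the dict shapes, and the 0.5 score penalty are hypothetical; the limit of 25, the three loosenings, and the `recent_contact` tag come from the table.

```python
RESULT_LIMIT = 25  # the LIMIT from case 8
LOOSENINGS = ("drop geo", "expand stage", "raise check ceiling")  # case 7


def filter_outcome(total_hits: int) -> dict:
    """Describe what the UI should render for a given hit count."""
    if total_hits == 0:
        # Case 7: empty state plus the three suggested loosenings.
        return {"state": "no_matches", "suggestions": list(LOOSENINGS)}
    if total_hits > RESULT_LIMIT:
        # Case 8: results are already truncated; the UI builds the banner.
        return {"state": "truncated", "shown": RESULT_LIMIT, "total": total_hits}
    return {"state": "ok", "shown": total_hits, "total": total_hits}


def demote_recent_contacts(matches, recent_ids, penalty=0.5):
    """Case 9: keep recently contacted matches, tag them, rank them lower.

    `penalty` is an assumed multiplier, not a documented value.
    """
    out = []
    for match in matches:
        match = dict(match)  # copy so the caller's data is not mutated
        if match["id"] in recent_ids:
            match["recent_contact"] = True
            match["score"] *= penalty
        out.append(match)
    return sorted(out, key=lambda m: m["score"], reverse=True)
```

For example, `demote_recent_contacts([{"id": "a", "score": 0.9}, {"id": "b", "score": 0.8}], {"a"})` keeps both matches but ranks `b` ahead of `a`.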

Synthesis edge cases

#	Case	Behavior
11	Investor matches on vector but has no recent activity	Demote, tag `dormant_investor: true`. Don't drop.
12	Investor matches but the WHY is generic ("strong fit")	Reject the WHY pass output, retry with stricter prompt OR drop the card.
13	Two investors at the same firm both score	Group as one card, list both partners.
14	Investor is on the user's banned-relationship list (former relationship gone bad)	Filter pre-rank. User's banlist lives in Zep facts.
15	Opus drops a banned phrase ("worth a 20-min call")	Lint pass before stream — reject + retry.
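
A rough sketch of the lint-and-retry behavior in cases 12 and 15, plus the firm grouping in case 13. Everything here is an assumption for illustration: the phrase lists hold only the two examples from the table, the substring check is deliberately crude, and `generate` stands in for whatever actually calls the model.

```python
from typing import Callable, Optional

BANNED_PHRASES = ["worth a 20-min call"]  # example from case 15
GENERIC_PHRASES = ["strong fit"]          # example from case 12


def lint_why(text: str) -> list:
    """Return a list of problems; an empty list means the WHY passes."""
    lowered = text.lower()
    problems = [f"banned: {p}" for p in BANNED_PHRASES if p in lowered]
    problems += [f"generic: {p}" for p in GENERIC_PHRASES if p in lowered]
    return problems


def why_with_retry(generate: Callable[[bool], str], max_attempts: int = 2) -> Optional[str]:
    """Call generate(strict) until the output lints clean.

    Returns None when every attempt fails, which signals "drop the card".
    """
    for attempt in range(max_attempts):
        text = generate(attempt > 0)  # stricter prompt on retries
        if not lint_why(text):
            return text
    return None


def group_by_firm(investors: list) -> list:
    """Case 13: one card per firm, listing every matching partner."""
    cards = {}
    for inv in investors:
        card = cards.setdefault(
            inv["firm"], {"firm": inv["firm"], "partners": [], "score": inv["score"]}
        )
        card["partners"].append(inv["name"])
        card["score"] = max(card["score"], inv["score"])  # card ranks by best partner
    return list(cards.values())
```

Running the lint before the stream starts means a rejected WHY never reaches the user, at the cost of holding the first token until the check completes.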

Pipeline edge cases

#	Case	Behavior
16	OpenRouter returns 429 (rate limit)	Fall through provider order automatically (Fireworks → Together)
17	OpenRouter returns markdown-fenced JSON	Strip fences before parsing (XanoScript gotcha: ```json → "[object Object]" if not stripped)
18	AlloyDB connection drops mid-query	Retry once, then surface error card
19	Embedding API returns wrong-dim vector	Reject, retry (1536 dims required)
20	Zep `thread.get_user_context` times out	Continue without context, log it, don't block dispatch
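
Cases 17 to 19 are small defensive checks. The sketch below shows one way to write them in Python; the production pipeline may live elsewhere (the table mentions XanoScript), and the helper names are hypothetical. Only the 1536-dimension requirement and the retry-once rule come from the table.

```python
import json

EMBEDDING_DIMS = 1536  # from case 19
FENCE = "`" * 3        # a markdown code fence, built this way to keep the sketch readable


def strip_fences(raw: str) -> str:
    """Case 17: remove a surrounding markdown fence, if there is one."""
    text = raw.strip()
    if len(text) >= 2 * len(FENCE) and text.startswith(FENCE) and text.endswith(FENCE):
        text = text[len(FENCE):-len(FENCE)]
        # Drop the rest of the opening line (for example a "json" tag)
        # unless the payload itself starts on that line.
        first_line, sep, rest = text.partition("\n")
        if sep and not first_line.strip().startswith(("{", "[")):
            text = rest
    return text.strip()


def parse_model_json(raw: str):
    """Parse model output as JSON, tolerating a fenced response."""
    return json.loads(strip_fences(raw))


def is_valid_embedding(vector) -> bool:
    """Case 19: reject anything that is not exactly 1536-dimensional."""
    return len(vector) == EMBEDDING_DIMS


def retry_once(fn):
    """Case 18: one retry on a dropped connection.

    A second failure propagates so the caller can render the error card.
    """
    try:
        return fn()
    except ConnectionError:
        return fn()
```

Stripping before parsing, rather than catching the parse error and retrying, avoids paying for a second model call on output that was valid apart from the fence.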

UI edge cases

#	Case	Behavior
21	User refreshes mid-stream	Server should not lose state — Zep already has the user message
22	User cancels (abort)	Close stream controller, do NOT record assistant message in Zep
23	User clicks card during stream	Ignore until stream completes — no race
24	Backend stream never closes	`isRunning` stuck true forever — see crayon-sdk gotchas.md
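
Cases 22 to 24 all concern who owns the running flag and when the assistant message is recorded. The sketch below models that with `asyncio`; it is not how crayon-sdk works, and `StreamState`, `consume`, `record_assistant_message`, and the 30-second idle timeout are all assumptions for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Callable


@dataclass
class StreamState:
    is_running: bool = False
    text: str = ""
    outcome: str = ""


async def consume(
    stream: AsyncIterator[str],
    state: StreamState,
    record_assistant_message: Callable[[str], None],
    idle_timeout: float = 30.0,  # assumed value, not a documented one
) -> None:
    state.is_running = True
    chunks = stream.__aiter__()
    try:
        while True:
            try:
                # Case 24: bound the wait for each chunk so a stream
                # that never closes cannot hold the flag forever.
                chunk = await asyncio.wait_for(chunks.__anext__(), idle_timeout)
            except StopAsyncIteration:
                # Normal completion is the only path that records.
                record_assistant_message(state.text)
                state.outcome = "completed"
                return
            except asyncio.TimeoutError:
                state.outcome = "timed_out"
                return
            state.text += chunk
    except asyncio.CancelledError:
        # Case 22: user abort. Nothing is recorded.
        state.outcome = "cancelled"
        raise
    finally:
        # Runs on every exit path, so is_running cannot stay stuck.
        state.is_running = False


def card_click_allowed(state: StreamState) -> bool:
    """Case 23: ignore card clicks until the stream has finished."""
    return not state.is_running
```

Clearing the flag in `finally` rather than at each return keeps completion, timeout, cancel, and unexpected errors on one cleanup path.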

Open questions

For #1 (ambiguous intent), does the classifier ask back, or does the UI surface 2-3 picker options?
For #11 (dormant investors), what's the cutoff? Mark mentioned an 18-month directive on Apr 20; confirm.
For #14 (banned-relationship list), is that a Zep fact, a separate table, or a user pref?
For #21 (refresh mid-stream), do we resume the stream or restart from the user message?