Context Pipeline

File upload → GCS → unstructured.io → pitch profile polling. Table schema, BFF routes, fn 12930 bug fixes, and the FK contract for Mark's pipeline.

Upload → Parse → Profile — the full deck-to-investor-match context pipeline

Built Apr 30 (Mark sync). The pipeline wires a founder's pitch deck through GCS → unstructured.io → fundraising_pitch_profiles (table 710), and surfaces the structured profile in the UI's Context tab while the investor search is running.

What it does

When a founder uploads a pitch deck in the find_investors flow, the engine needs to know what they're pitching before it can match investors. The context pipeline:

  1. Stages the file with a visual progress indicator in the right-rail
  2. Uploads to GCS via Xano in the background, without blocking the interview flow
  3. Triggers downstream parsing (unstructured.io → Mark's pipeline)
  4. Polls the resulting fundraising_pitch_profiles row until it's populated
  5. Surfaces the structured profile — sector, stage, raise size, ARR, GTM, 6 narrative dimensions — in the Context tab

Architecture

Three distinct flows:

Flow 1 — Upload (synchronous, user-initiated)

Browser (handleFiles)
  → POST /api/upload-files (BFF · route.ts)
  → POST Xano 8418 (anything-engine/upload-files)
  → fn 12930 (mvp/suggestions/upload-suggestion-files)
  → GCS orbiter-production bucket  (PDF stored at url)
  → Table 587  (file row: id, url, mime_type, size)
  → Table 715  (suggestion_request_file: suggestion_request_id, file_id, mode)

The BFF route (/api/upload-files) is a thin multipart forward. It validates that suggestion_request_id is present (the Xano pipeline requires it) and defaults mode to suggestion_request when the field is absent. maxDuration=60 gives the route up to 60 seconds, which covers large decks (20–30 MB).

Flow 2 — Async parsing (Mark's pipeline)

Table 715 row present
  → Cloud Run profile_enrichment_job (Mark's team)
  → unstructured.io  (PDF → markdown)
  → Claude extracts 49 structured fields
  → Table 710 row created (fundraising_pitch_profiles)
       ↳ suggestion_request_id FK set  ← NEW field added Apr 30
       ↳ sector, stage, raise_amount_usd, arr_usd, 6 narrative dims
       ↳ 1536-d embeddings per narrative for AlloyDB ScaNN matching

This flow is fully asynchronous. The UI does not block on it.

Flow 3 — Poll (client-side, 5s interval)

useEffect fires when:
  • any attachment has status === "ready"  AND
  • suggestionRequestId is set             AND
  • pitchProfile is still null

Loop:
  GET /api/get-pitch-profile?suggestion_request_id=N (BFF)
  → GET Xano 8420 (anything-engine/pitch-profile)
  → db.query fundraising_pitch_profiles WHERE suggestion_request_id = N
  → returns { ready: boolean, profile: PitchProfile | null }

Stops when ready === true → setPitchProfile(d.profile) → interval cleared

Xano endpoints

| Endpoint | ID | Role |
| --- | --- | --- |
| POST /anything-engine/upload-files | 8418 | Receives multipart form (suggestion_request_id + files), calls fn 12930 |
| POST /anything-engine/start-outcome | 8417 | Creates a draft suggestion_request row so files have an id to attach to |
| GET /anything-engine/pitch-profile | 8420 | Polling endpoint: queries table 710 by suggestion_request_id, returns {ready, profile} |

Endpoint 8420 — pitch-profile GET

Intentionally polling-safe: each call runs a single read-only db.query filtered by suggestion_request_id = N. It returns ready: false when no row matches or while company_name is empty (an empty string means the pipeline has not populated the row yet).

db.query "fundraising_pitch_profiles" {
  where = $db.fundraising_pitch_profiles.suggestion_request_id == $input.suggestion_request_id
  return = {type: "list", paging: {page: 1, per_page: 1}}
} as $result

var $ready {
  value = ($result.items[0] != null && $result.items[0].company_name != "")
}

response = { ready: $ready, profile: $result.items[0] }
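The readiness rule above can be mirrored in TypeScript, for example to unit-test client code against the same contract. This is a minimal sketch, not the endpoint's implementation: `PitchProfileRow` and `toPitchProfileResponse` are hypothetical names, and the row type carries only the `company_name` field that the rule depends on.

```typescript
// Hypothetical minimal row shape; the real table 710 row carries many more fields.
interface PitchProfileRow {
  company_name: string;
  [field: string]: unknown;
}

interface PitchProfileResponse {
  ready: boolean;
  profile: PitchProfileRow | null;
}

// Mirrors the endpoint's rule: ready only when a row exists and company_name is non-empty.
function toPitchProfileResponse(rows: PitchProfileRow[]): PitchProfileResponse {
  const first = rows[0] ?? null;
  const ready = first !== null && first.company_name !== "";
  return { ready, profile: first };
}
```

As in the endpoint, a row that exists but is not yet populated is still returned as `profile`, so callers should gate on `ready` rather than on `profile` alone.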

Xano tables

Table 529 — suggestion_request

The anchor record created when the user clicks any of the 14 outcome tiles. Every file upload, interview turn, and pitch profile is linked to one suggestion_request.

| Field | Type | Purpose |
| --- | --- | --- |
| id | int | Primary key, required by all downstream APIs |
| user_id | int | FK to user table |
| copilot_mode | text | "outcome" (Anything Engine sessions) |
| status | text | draft · active · completed |
| request_panel_title | text | The outcome class (e.g. "find_investors") |
| current_step | text | "interview" · "dispatch" |

Table 587 — file

One row per physical file upload. The GCS URL stored here is what the downstream markdown parser reads.

| Field | Type | Purpose |
| --- | --- | --- |
| id | int | Primary key |
| url | text | GCS URL (storage.googleapis.com/orbiter-production/...) |
| mime_type | text | application/pdf, image/png, etc. |
| size | int | Bytes |
| markdown | text | Populated after unstructured.io parsing |

Table 715 — suggestion_request_file (join)

The bridge that links an uploaded file to a suggestion request.

| Field | Type | Purpose |
| --- | --- | --- |
| suggestion_request_id | int FK | → suggestion_request (529) |
| file_id | int FK | → file (587) |
| user_id | int | Denormalized for quick user-scoped queries |
| mode | text | "suggestion_request" (default); "investor_deck" etc. for future modes |

Table 710 — fundraising_pitch_profiles

49-field structured extraction of a pitch deck. The 6 narrative dimensions each get a 1536-d embedding for ScaNN investor matching.

Structured fields (hard SQL filters):

| Field | Purpose |
| --- | --- |
| sector | B2B SaaS · Infrastructure · Marketplace · etc. |
| stage | Idea · MVP · Pre-Seed · Seed · Series A · etc. |
| raise_amount_usd | Target raise in USD |
| raise_stage | Round label |
| arr_usd | Annual recurring revenue |
| mrr_usd | Monthly recurring revenue |
| customer_count | Paying / pilot customers |
| growth_rate_pct | YoY or MoM as stated |
| gtm_model | Sales-driven · PLG · Virality · Marketplace |
| headquarters | HQ location string |
| founder_names | Array of name strings |

6 Narrative dimensions (semantic vector match):

| Field | Matches against investor thesis… |
| --- | --- |
| founder_fit_narrative | Team & founding story preferences |
| problem_market_narrative | Problem space & market conviction |
| competitive_moat_narrative | Moat & defensibility thesis |
| traction_momentum_narrative | Traction signals (ARR, growth, logos) |
| business_model_narrative | GTM & monetization model preference |
| expansion_roadmap_narrative | Capital plan & long-term vision |

Each narrative field has a paired _vector (json, 1536 dims) for AlloyDB ScaNN cosine matching against investor thesis vectors in table 709.
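To make "cosine matching" concrete, the score between a narrative vector and an investor thesis vector can be sketched as below. This is illustrative only: the production match runs inside AlloyDB via ScaNN, and `dot` and `cosineSimilarity` are hypothetical helper names, not part of the pipeline.

```typescript
// Dot product of two vectors; callers are expected to pass equal-length inputs.
function dot(a: readonly number[], b: readonly number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Cosine similarity in [-1, 1]; returns 0 when either vector has zero length.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`vector lengths differ: ${a.length} vs ${b.length}`);
  }
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}
```

A score near 1 means the two vectors point in nearly the same direction, which is the sense in which a narrative "matches" a thesis. The real vectors have 1536 dimensions; the function is dimension-agnostic.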

New FK added Apr 30:

| Field | Type | Purpose |
| --- | --- | --- |
| suggestion_request_id | int | Links this profile back to the originating suggestion_request. Set by Mark's pipeline when the row is created. Required for the poll endpoint (8420) to return results. |

Contract for Mark's team: when profile_enrichment_job creates a fundraising_pitch_profiles row, set suggestion_request_id to the id from suggestion_request_file.suggestion_request_id for the uploaded file. Without this, the poll endpoint returns {ready: false, profile: null} indefinitely.
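The contract can be expressed as a small lookup plus a merge. The sketch below is written in TypeScript for consistency with the rest of this page; the enrichment job itself may be in another language, and `resolveSuggestionRequestId` and `withRequestLink` are hypothetical names used only for illustration.

```typescript
// Table 715 join row, limited to the two fields the contract needs.
interface SuggestionRequestFile {
  suggestion_request_id: number;
  file_id: number;
}

// Hypothetical helper: finds the originating request for an uploaded file.
function resolveSuggestionRequestId(
  joinRows: SuggestionRequestFile[],
  fileId: number,
): number {
  const row = joinRows.find((r) => r.file_id === fileId);
  if (!row) throw new Error(`no suggestion_request_file row for file ${fileId}`);
  return row.suggestion_request_id;
}

// Attaches the FK to the extracted fields before the table 710 insert.
function withRequestLink<T extends object>(
  extracted: T,
  joinRows: SuggestionRequestFile[],
  fileId: number,
): T & { suggestion_request_id: number } {
  return {
    ...extracted,
    suggestion_request_id: resolveSuggestionRequestId(joinRows, fileId),
  };
}
```

Failing loudly when no join row exists is a deliberate choice in this sketch: a profile written without the FK is invisible to the poll endpoint, so a missing link is better surfaced at write time.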

BFF routes

POST /api/upload-files

Thin multipart forward. Validates suggestion_request_id presence, sets default mode=suggestion_request, forwards full FormData to Xano 8418. maxDuration=60 to cover large decks.

// Key validation before forwarding:
if (!formData.get("suggestion_request_id")) {
  return new Response(JSON.stringify({ error: "missing suggestion_request_id" }), { status: 400 });
}
if (!formData.get("mode")) formData.set("mode", "suggestion_request");

// Forward to Xano:
const upstream = await fetch(`${XANO_BASE}/anything-engine/upload-files`, {
  method: "POST",
  body: formData,
});

GET /api/get-pitch-profile

Polling endpoint. Validates suggestion_request_id as an integer, forwards to Xano 8420.

// Called by the UI every 5 seconds after a file upload completes:
const r = await fetch(`/api/get-pitch-profile?suggestion_request_id=${sid}`);
const d = await r.json();  // { ready: boolean, profile: PitchProfile | null }
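The integer validation could look like the sketch below. It assumes ids are positive integers, which this page does not state explicitly, and `parseSuggestionRequestId` is a hypothetical helper name rather than the route's actual code.

```typescript
// Returns the id as a positive integer, or null when the value is missing or malformed.
// Assumption: valid ids are positive and contain digits only (no sign, no decimals).
function parseSuggestionRequestId(raw: string | null): number | null {
  if (raw === null || !/^[1-9]\d*$/.test(raw)) return null;
  const id = Number(raw);
  return Number.isSafeInteger(id) ? id : null;
}
```

A route handler would typically call this on the query-string value and answer with a 400 when it returns null, before making any upstream request.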

POST /api/start-outcome

Creates the draft suggestion_request row on tile click — before the user has typed anything. This gives the file upload pipeline an id to attach to.

// Called in handleStarter() when the user clicks any of the 14 outcome tiles:
const r = await fetch("/api/start-outcome", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ class: "find_investors", prompt: "" }),
});
const d = await r.json();  // { ok, suggestion_request_id, class, user_id }

UI components

Attachment state machine

Each attached file tracks a status through its upload lifecycle:

| Status | Visual | Meaning |
| --- | --- | --- |
| uploading | Pulsing indigo · "Uploading…" | POST in-flight to /api/upload-files |
| ready | Green icon · "N KB · Ready" | GCS upload confirmed, file_id set |
| error | Red icon · error message | Upload failed (network, Xano error, no draft) |
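The three statuses can be modeled as a small state machine. This is a sketch under assumptions: the `Attachment` and `UploadResult` shapes and the `settleUpload` name are hypothetical, and only `status` and `file_id` are taken from the table above.

```typescript
type AttachmentStatus = "uploading" | "ready" | "error";

// Hypothetical attachment shape; the real component may track more fields.
interface Attachment {
  name: string;
  status: AttachmentStatus;
  file_id?: number;
  error?: string;
}

// Hypothetical outcome of the POST to /api/upload-files.
type UploadResult =
  | { ok: true; file_id: number }
  | { ok: false; message: string };

// Resolves an in-flight upload; attachments that already settled are returned unchanged.
function settleUpload(a: Attachment, result: UploadResult): Attachment {
  if (a.status !== "uploading") return a;
  return result.ok
    ? { name: a.name, status: "ready", file_id: result.file_id }
    : { name: a.name, status: "error", error: result.message };
}
```

Keeping the transition in one pure function makes it clear that "uploading" is the only state with outgoing edges in this sketch; whether the real UI supports retry from "error" is not covered here.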

The FilesSection component renders in both the Summary tab and Context tab so the file list is always visible regardless of which tab is active.

PitchProfilePanel states

| State | Condition | Display |
| --- | --- | --- |
| "No deck yet" | No attachments | Invite to upload via paperclip |
| "Processing deck" | attachments.length > 0 && pitchProfile === null | Pending state with pipeline description |
| Rich profile | pitchProfile !== null | Summary + meta grid + 6 narrative expansions |
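The panel's state selection reduces to a pure function of two inputs. A minimal sketch, assuming the profile check takes precedence over the attachment count; `PanelState` and `panelState` are hypothetical names.

```typescript
type PanelState = "no-deck" | "processing" | "profile";

// Picks the panel state from the attachment count and the polled profile.
// Assumption: a loaded profile wins even if the attachment list is empty.
function panelState(attachmentCount: number, pitchProfile: object | null): PanelState {
  if (pitchProfile !== null) return "profile";
  return attachmentCount > 0 ? "processing" : "no-deck";
}
```

For example, `panelState(1, null)` selects the "Processing deck" state, and any non-null profile selects the rich profile view.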

Polling lifecycle

useEffect(() => {
  const hasReadyFile = attachments.some((a) => a.status === "ready");
  if (!hasReadyFile || !suggestionRequestId || pitchProfile !== null) return;

  const poll = async () => {
    try {
      const d = await fetch(`/api/get-pitch-profile?...`).then(r => r.json());
      if (d.ready && d.profile) {
        setPitchProfile(d.profile);
        clearInterval(pitchPollRef.current);
      }
    } catch {
      // Transient network or parse failure: keep polling, the next tick retries
    }
  };

  void poll();                              // immediate first check
  pitchPollRef.current = setInterval(poll, 5000);  // then every 5s

  // Cleanup runs on unmount and before the effect re-runs on a dependency change
  return () => clearInterval(pitchPollRef.current);
}, [attachments, suggestionRequestId, pitchProfile]);
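Outside React, the same loop can be written as a plain async function with an attempt cap, which is one way to avoid polling forever if the profile row is never linked. This is a sketch, not the component's code: `pollPitchProfile`, `PollResult`, and the injectable `fetchOnce` and `sleep` parameters are hypothetical, introduced so the loop can be tested without a network.

```typescript
interface PollResult<P> {
  ready: boolean;
  profile: P | null;
}

// Polls `fetchOnce` until the profile is ready or `maxAttempts` is reached.
// Resolves to the profile, or to null if the cap is hit first.
async function pollPitchProfile<P>(
  fetchOnce: () => Promise<PollResult<P>>,
  options: { intervalMs: number; maxAttempts: number },
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<P | null> {
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      const d = await fetchOnce();
      if (d.ready && d.profile !== null) return d.profile;
    } catch {
      // Transient failure: fall through and retry after the interval.
    }
    if (attempt < options.maxAttempts) await sleep(options.intervalMs);
  }
  return null;
}
```

With `intervalMs: 5000` and `maxAttempts: 60`, for instance, the loop would give up after roughly five minutes; both values are placeholders to tune against real parsing times.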

Xano function 12930 — bugs fixed Apr 30

The upload function mvp/suggestions/upload-suggestion-files had two bugs that silently broke the pipeline:

| Bug | Field | Fix |
| --- | --- | --- |
| Typo in path generation call | $fileData.meme | $fileData.mime |
| Wrong column name on join write | $srFile.section | $srFile.mode |

Both fixed via patch_function directly in Xano on Apr 30. Verified by uploading the Star51 deck — 3 rows confirmed in table 715.

Prompt sync — $ backreference bug fixed Apr 30

scripts/sync-prompts.mjs had a silent bug: prompt strings containing dollar amounts such as $2M, $8M, $4M (seed round sizes in the Mintlify exemplars) were corrupted when String.replace interpreted sequences like $2 in the replacement string as capture-group references rather than literal text.

Fix:

// Before (broken — $2M becomes old literal content):
const next = script.replace(re, `$1${q}${newLiteral}${q}`);

// After (correct):
const safeLiteral = newLiteral.replace(/\$/g, "$$$$");
const next = script.replace(re, `$1${q}${safeLiteral}${q}`);

The escaping call's own replacement string is subject to the same substitution rules, so "$$$$" is emitted as $$ for each $ in newLiteral. The second .replace() then reads each $$ as an escaped, literal $ in the output.
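The failure and the fix can be reproduced in isolation. In this sketch the regex, the sample script line, and the prompt text are all hypothetical stand-ins; the real pattern in sync-prompts.mjs is not shown on this page, only that it has at least two capture groups.

```typescript
// Hypothetical pattern: group 1 is the assignment prefix, group 2 the old quoted literal.
const re = /(prompt = )"([^"]*)"/;
const script = 'prompt = "old text"';
const newLiteral = "Raising a $2M seed";
const q = '"';

// Broken: "$2" inside the replacement string expands to capture group 2.
const broken = script.replace(re, `$1${q}${newLiteral}${q}`);

// Fixed: every "$" is doubled so String.replace emits it literally.
const safeLiteral = newLiteral.replace(/\$/g, "$$$$");
const fixed = script.replace(re, `$1${q}${safeLiteral}${q}`);

// Alternative: a replacer function's return value is inserted verbatim.
const viaFunction = script.replace(
  re,
  (_match: string, prefix: string) => `${prefix}${q}${newLiteral}${q}`,
);
```

Here `broken` ends up containing the old literal in place of `$2`, while `fixed` and `viaFunction` both produce `prompt = "Raising a $2M seed"`. The replacer-function form avoids the escaping step entirely and may be easier to maintain.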

Open items

| Item | Status | Owner |
| --- | --- | --- |
| Set suggestion_request_id on table 710 write | Needs wiring | Mark's pipeline (Kenya team) |
| Decode source_url from GCS URL to link file→profile pre-FK | Workaround until FK is set | n/a: poll endpoint handles this once FK is set |
| mode enum values on table 715 | Only suggestion_request used today | Mark to define per-class modes |

References

  • Architecture: docs/architecture.md
  • Anything Engine: docs/anything-engine.md
  • Xano group 1270 (UgP1h6uR): endpoints 8399–8420
  • Xano table 587 (file), 715 (suggestion_request_file), 710 (fundraising_pitch_profiles), 529 (suggestion_request)
  • BFF routes: src/app/api/upload-files/, src/app/api/get-pitch-profile/, src/app/api/start-outcome/