LlamaIndex + Supabase
This example shows how to fetch content with Supacrawler, build an index with LlamaIndex, and persist vectors in Supabase.
References:
- Supabase AI Overview: https://supabase.com/docs/guides/ai
- LlamaIndex integration: https://supabase.com/docs/guides/ai/integrations/llamaindex
Install
pip install -U supacrawler-py llama-index-embeddings-openai llama-index-vector-stores-postgres
Python example
import os
from supacrawler import SupacrawlerClient, ScrapeParams
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.postgres import PGVectorStore
from llama_index.core import Document, VectorStoreIndex, StorageContext
DB_URL = os.environ['DATABASE_URL']  # e.g. postgresql://user:password@host:5432/dbname
OPENAI_API_KEY = os.environ['OPENAI_API_KEY']
SUPACRAWLER_API_KEY = os.environ['SUPACRAWLER_API_KEY']
# 1) Scrape
crawler = SupacrawlerClient(api_key=SUPACRAWLER_API_KEY)
scrape = crawler.scrape(ScrapeParams(url='https://example.com', format='markdown'))
docs = [Document(text=scrape.content, metadata={'url': scrape.url, 'title': getattr(scrape, 'title', None)})]
# 2) Vector store + index
embed_model = OpenAIEmbedding(model='text-embedding-3-small', api_key=OPENAI_API_KEY)
from sqlalchemy import make_url

url = make_url(DB_URL)
store = PGVectorStore.from_params(
    database=url.database,
    host=url.host,
    password=url.password,
    port=url.port or 5432,
    user=url.username,
    table_name='llama_docs',  # PGVectorStore takes table_name, not collection_name
    embed_dim=1536,           # dimension of text-embedding-3-small
)
ctx = StorageContext.from_defaults(vector_store=store)
index = VectorStoreIndex.from_documents(docs, storage_context=ctx, embed_model=embed_model)
# 3) Query
query_engine = index.as_query_engine()
resp = query_engine.query('What is this page about?')
print(resp)
Expected output
print(resp) prints the synthesized answer as plain text, for example:
This page is an example domain used for illustrative purposes in documents.
Inspect resp.source_nodes to see which retrieved chunks the answer was built from.
See the official example for more patterns and settings: https://github.com/supabase/supabase/blob/master/examples/ai/llamaindex/llamaindex.ipynb
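PGVectorStore.from_params expects individual connection fields (host, port, user, password, database) rather than a single URL string. If you prefer not to parse the URL by hand, a small stdlib helper can split DATABASE_URL into those keyword arguments; the helper name pg_params here is hypothetical, not part of any library:

```python
from urllib.parse import urlparse

def pg_params(database_url: str) -> dict:
    """Split a Postgres URL into the connection fields from_params expects."""
    u = urlparse(database_url)
    return {
        'host': u.hostname,
        'port': u.port or 5432,
        'user': u.username,
        'password': u.password,
        'database': (u.path or '/').lstrip('/'),
    }

# Example with placeholder credentials:
print(pg_params('postgresql://user:secret@db.example.com:5432/postgres'))
```

You can then call store = PGVectorStore.from_params(**pg_params(DB_URL), table_name='llama_docs', embed_dim=1536).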
pgvector setup
Enable the pgvector extension on your database first.
- Supabase: Database → Extensions → enable pgvector.
- Self-hosted Postgres: create extension if not exists vector;
Create HNSW/IVFFlat indexes per Supabase guidance for production.
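As a sketch of that production step, the HNSW index DDL can be built and run from Python. The table name below is an assumption: recent PGVectorStore versions create the table as data_<table_name> with an embedding column, so verify the actual names in your database before running it:

```python
# Build the HNSW index DDL for the embeddings table.
# Assumption: PGVectorStore stored vectors in `data_llama_docs` with an
# `embedding` column — check your schema first.
table = 'data_llama_docs'
ddl = (
    f'CREATE INDEX IF NOT EXISTS idx_{table}_embedding '
    f'ON {table} USING hnsw (embedding vector_cosine_ops) '
    'WITH (m = 16, ef_construction = 64);'
)
print(ddl)

# Run it with any Postgres driver, e.g. (not executed here):
# import os, psycopg2
# with psycopg2.connect(os.environ['DATABASE_URL']) as conn:
#     conn.cursor().execute(ddl)
```

vector_cosine_ops matches the cosine distance typically used with OpenAI embeddings; m and ef_construction are pgvector's defaults and can be tuned per the Supabase guidance.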