LlamaIndex + Supabase
This example shows how to fetch content with Supacrawler, build an index with LlamaIndex, and persist vectors in Supabase.
References:
- Supabase AI Overview: https://supabase.com/docs/guides/ai
- LlamaIndex integration: https://supabase.com/docs/guides/ai/integrations/llamaindex
Install
pip install -U supacrawler-py llama-index-embeddings-openai llama-index-vector-stores-postgres
Python example
import os
from supacrawler import SupacrawlerClient, ScrapeParams
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.postgres import PGVectorStore
from llama_index.core import Document, VectorStoreIndex, StorageContext
DB_URL = os.environ['DATABASE_URL']  # e.g. postgresql://user:password@host:5432/dbname
OPENAI_API_KEY = os.environ['OPENAI_API_KEY']
SUPACRAWLER_API_KEY = os.environ['SUPACRAWLER_API_KEY']
# 1) Scrape
crawler = SupacrawlerClient(api_key=SUPACRAWLER_API_KEY)
scrape = crawler.scrape(ScrapeParams(url='https://example.com', format='markdown'))
docs = [Document(text=scrape.content, metadata={'url': scrape.url, 'title': getattr(scrape, 'title', None)})]
# 2) Vector store + index
embed_model = OpenAIEmbedding(model='text-embedding-3-small', api_key=OPENAI_API_KEY)
from sqlalchemy import make_url

url = make_url(DB_URL)
store = PGVectorStore.from_params(
    database=url.database,
    host=url.host,
    password=url.password,
    port=url.port or 5432,
    user=url.username,
    table_name='llama_docs',  # PGVectorStore takes table_name, not collection_name
    embed_dim=1536,           # dimension of text-embedding-3-small
)
ctx = StorageContext.from_defaults(vector_store=store)
index = VectorStoreIndex.from_documents(docs, storage_context=ctx, embed_model=embed_model)
# 3) Query
query_engine = index.as_query_engine()
resp = query_engine.query('What is this page about?')
print(resp)
Expected output
print(resp) prints the synthesized answer as plain text, for example:
This page is an example domain used for illustrative purposes in documents.
Inspect resp.source_nodes to see which retrieved chunks the answer was built from.
See the official example for more patterns and settings: https://github.com/supabase/supabase/blob/master/examples/ai/llamaindex/llamaindex.ipynb
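PGVectorStore.from_params expects individual connection fields (host, port, user, password, database) rather than a single URL string. If you prefer not to parse the URL by hand, a small stdlib helper can split DATABASE_URL into those keyword arguments; the helper name pg_params here is hypothetical, not part of any library:

```python
from urllib.parse import urlparse

def pg_params(database_url: str) -> dict:
    """Split a Postgres URL into the connection fields from_params expects."""
    u = urlparse(database_url)
    return {
        'host': u.hostname,
        'port': u.port or 5432,
        'user': u.username,
        'password': u.password,
        'database': (u.path or '/').lstrip('/'),
    }

# Example with placeholder credentials:
print(pg_params('postgresql://user:secret@db.example.com:5432/postgres'))
```

You can then call store = PGVectorStore.from_params(**pg_params(DB_URL), table_name='llama_docs', embed_dim=1536).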
pgvector setup
Enable the pgvector extension on your database first.
- Supabase: Database → Extensions → enable pgvector.
- Self-hosted Postgres: create extension if not exists vector;
Create HNSW/IVFFlat indexes per Supabase guidance for production.
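As a sketch of that production step, the HNSW index DDL can be built and run from Python. The table name below is an assumption: recent PGVectorStore versions create the table as data_<table_name> with an embedding column, so verify the actual names in your database before running it:

```python
# Build the HNSW index DDL for the embeddings table.
# Assumption: PGVectorStore stored vectors in `data_llama_docs` with an
# `embedding` column — check your schema first.
table = 'data_llama_docs'
ddl = (
    f'CREATE INDEX IF NOT EXISTS idx_{table}_embedding '
    f'ON {table} USING hnsw (embedding vector_cosine_ops) '
    'WITH (m = 16, ef_construction = 64);'
)
print(ddl)

# Run it with any Postgres driver, e.g. (not executed here):
# import os, psycopg2
# with psycopg2.connect(os.environ['DATABASE_URL']) as conn:
#     conn.cursor().execute(ddl)
```

vector_cosine_ops matches the cosine distance typically used with OpenAI embeddings; m and ef_construction are pgvector's defaults and can be tuned per the Supabase guidance.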