Supacrawler Docs

Integrations

End-to-end examples integrating Supacrawler with vector stores, LLM frameworks, and downstream tools

Explore practical, end-to-end integrations using the Supacrawler SDKs with popular tools and platforms.

Common Use Cases

RAG (Retrieval-Augmented Generation)

  1. Scrape website content with Supacrawler
  2. Embed documents using OpenAI/Cohere
  3. Store vectors in Supabase pgvector
  4. Query with semantic search
  5. Generate responses with LLM
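The five steps above can be sketched end to end in a few lines. Everything here is a toy stand-in: `embed` replaces a real embedding model (OpenAI/Cohere) with a tiny bag-of-words vector, a Python list stands in for Supabase pgvector, and step 5 (prompting an LLM with the retrieved text) is left as a comment.

```python
import math

# Toy vocabulary-based embedding: a stand-in for OpenAI/Cohere embeddings.
VOCAB = ["scrape", "markdown", "pricing", "plans", "crawl", "api"]

def embed(text: str) -> list[float]:
    words = text.lower().split()
    vec = [float(words.count(w)) for w in VOCAB]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# In-memory "vector store": a stand-in for Supabase pgvector.
store: list[tuple[str, str, list[float]]] = []

def index_document(doc_id: str, text: str) -> None:
    # Steps 2-3: embed the scraped text and store the vector
    store.append((doc_id, text, embed(text)))

def semantic_search(query: str, k: int = 1) -> list[tuple[str, str]]:
    # Step 4: rank stored documents by cosine similarity to the query
    ranked = sorted(store, key=lambda row: cosine(row[2], embed(query)), reverse=True)
    return [(doc_id, text) for doc_id, text, _ in ranked[:k]]

# Step 1 would use client.scrape(...); hardcoded text keeps the sketch runnable.
index_document("pricing", "pricing plans and limits")
index_document("scrape", "scrape a page into markdown")
print(semantic_search("scrape markdown")[0][0])  # scrape
# Step 5: pass the retrieved text into your LLM prompt as context.
```

In production, `embed` becomes a call to your embedding provider and `store` becomes a pgvector table queried with a distance operator; the shape of the pipeline stays the same.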

Knowledge Base Sync

  1. Monitor docs with Watch API
  2. Crawl on changes
  3. Update vector store
  4. Refresh embeddings automatically
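The core of this loop is change detection: only re-embed a page when its content actually changed. A minimal sketch, assuming content hashing as the change signal (the Watch API handles the monitoring in practice; the re-embedding step is left as a comment):

```python
import hashlib

# Map each URL to the hash of its last-seen content.
known_hashes: dict[str, str] = {}

def content_hash(markdown: str) -> str:
    return hashlib.sha256(markdown.encode()).hexdigest()

def sync_page(url: str, markdown: str) -> bool:
    """Re-embed only when the page content actually changed."""
    digest = content_hash(markdown)
    if known_hashes.get(url) == digest:
        return False  # unchanged: skip re-crawling and re-embedding
    known_hashes[url] = digest
    # ... re-crawl the section, refresh embeddings, upsert into the vector store
    return True

print(sync_page("https://example.com/docs", "# Docs v1"))  # True  (new page)
print(sync_page("https://example.com/docs", "# Docs v1"))  # False (unchanged)
print(sync_page("https://example.com/docs", "# Docs v2"))  # True  (changed)
```

Skipping unchanged pages keeps embedding costs proportional to how often your docs actually change, not how often you poll them.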

Content Pipeline

  1. Extract structured data with Parse API
  2. Transform with LangChain
  3. Load into data warehouse
  4. Analyze with BI tools
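The extract-transform-load shape of this pipeline can be sketched with stand-ins: a hardcoded list plays the role of Parse API output, a list comprehension plays the role of a LangChain transform, and an in-memory SQLite table plays the role of the data warehouse.

```python
import sqlite3

# Hypothetical parsed output: in practice this comes from the Parse API.
parsed = [
    {"title": "Widget A", "price": "$19.99"},
    {"title": "Widget B", "price": "$5.00"},
]

# Transform: normalize price strings into floats.
rows = [(item["title"], float(item["price"].lstrip("$"))) for item in parsed]

# Load: an in-memory SQLite table stands in for a data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (title TEXT, price REAL)")
conn.executemany("INSERT INTO products VALUES (?, ?)", rows)

# Analyze: the kind of aggregate a BI tool would run downstream.
total = conn.execute("SELECT SUM(price) FROM products").fetchone()[0]
print(round(total, 2))  # 24.99
```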

Getting Started

All integrations follow a similar pattern:

Python

from supacrawler import SupacrawlerClient

# Step 1: Scrape content
client = SupacrawlerClient(api_key="your-api-key")
result = client.scrape("https://example.com", format="markdown")

# Step 2: Process with your framework
# (LangChain, LlamaIndex, etc.)

# Step 3: Store in vector database
# (Supabase, Pinecone, Weaviate, etc.)

JavaScript

import { SupacrawlerClient } from '@supacrawler/js'

// Step 1: Scrape content
const client = new SupacrawlerClient({ apiKey: 'your-api-key' })
const result = await client.scrape({ url: 'https://example.com' })

// Step 2: Process with your framework
// (LangChain.js, etc.)

// Step 3: Store in vector database
// (Supabase, Pinecone, Weaviate, etc.)

Supported Platforms

  • LLM Frameworks: LangChain, LlamaIndex, Haystack
  • Vector Stores: Supabase pgvector, Pinecone, Weaviate, Qdrant
  • Embedding Models: OpenAI, Cohere, HuggingFace
  • LLMs: OpenAI GPT, Anthropic Claude, Google Gemini
