Local Development

Run Supacrawler locally for development, testing, or self-hosting. Perfect for developers who want full control over their scraping infrastructure.

Quick Start

Option 1: Docker (Recommended)

The fastest way to get Supacrawler running locally:

# Download and start with Docker Compose
curl -O https://raw.githubusercontent.com/supacrawler/supacrawler/main/docker-compose.yml
docker compose up

Your local Supacrawler instance will be available at http://localhost:8081
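
When scripting against a freshly started container, it can help to wait until the server answers before sending real requests. The following is a minimal sketch, assuming the /v1/health endpoint shown under Direct HTTP/cURL returns a 2xx status once the server is ready. The names http_probe and wait_until_ready are illustrative and not part of Supacrawler.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or non-2xx answer: not ready yet.
        return False


def wait_until_ready(url: str, attempts: int = 30, delay: float = 1.0,
                     probe=http_probe) -> bool:
    """Poll `url` until the probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # give the container time to start
    return False
```

For example, wait_until_ready("http://localhost:8081/v1/health") returns True as soon as the health check succeeds, or False after roughly 30 seconds with the default settings.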

Option 2: Binary Installation

For advanced users who prefer native binaries:

1. Download the binary for your platform from the GitHub releases page.
2. Install the dependencies: Redis, Node.js, and Playwright v1.49.1.
3. Start the server: ./supacrawler --redis-addr=127.0.0.1:6379

Using SDKs with Local Instance

Once your local Supacrawler instance is running, point an SDK client at it by setting the client's base URL (base_url in Python, baseUrl in JavaScript/TypeScript):

Python SDK

from supacrawler import SupacrawlerClient

# Point to your local instance
client = SupacrawlerClient(
    api_key='anything',  # API key not required for local
    base_url='http://localhost:8081/v1'
)

# Use normally
result = client.scrape({
    'url': 'https://example.com',
    'format': 'markdown'
})

print(result.data)

JavaScript/TypeScript SDK

import { SupacrawlerClient } from '@supacrawler/js'

// Point to your local instance
const client = new SupacrawlerClient({ 
    apiKey: 'anything',  // API key not required for local
    baseUrl: 'http://localhost:8081/v1' 
})

// Use normally
const result = await client.scrape({ 
    url: 'https://example.com', 
    format: 'markdown' 
})

console.log(result.data)

Direct HTTP/cURL

You can also make direct HTTP requests:

# Health check
curl http://localhost:8081/v1/health

# Scrape a webpage
curl "http://localhost:8081/v1/scrape?url=https://example.com&format=markdown"

# Take a screenshot
curl -X POST http://localhost:8081/v1/screenshots \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://example.com","full_page":true}'
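
The same requests can be made from Python without an SDK. This is a minimal sketch using only the standard library. It assumes the endpoints and parameters are exactly those in the cURL examples above, and the helper names (scrape_url, screenshot_request, send) are hypothetical. The response format is not specified here, so send returns the raw bytes.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8081/v1"  # local instance from the Quick Start


def scrape_url(target: str, fmt: str = "markdown", base: str = BASE_URL) -> str:
    """Build the GET URL for /scrape, percent-encoding the target URL."""
    query = urllib.parse.urlencode({"url": target, "format": fmt})
    return f"{base}/scrape?{query}"


def screenshot_request(target: str, full_page: bool = True,
                       base: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /screenshots with a JSON body."""
    body = json.dumps({"url": target, "full_page": full_page}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/screenshots",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout: float = 60.0) -> bytes:
    """Send a URL string or Request object and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read()
```

Encoding the target URL with urlencode matters when it carries its own query string: an unencoded & in the target would otherwise be read as a separator between parameters of the scrape request. With the server running, send(scrape_url("https://example.com")) performs the scrape.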

Local Development Benefits

Cost Savings

No API costs for development and testing
Unlimited requests during development
No rate limits on your local instance

Privacy & Control

Scraped data stays on your machine - no calls to the hosted Supacrawler API
Full control over infrastructure and scaling
Custom configurations for specific needs

Development Speed

Faster feedback, with no round trip to a hosted API
Debug and iterate faster
Test edge cases without quota concerns

Configuration Options

Environment Variables

# Core settings
export HTTP_ADDR=":8081"
export REDIS_ADDR="127.0.0.1:6379"
export DATA_DIR="./data"

# Optional: Supabase integration
export SUPABASE_URL="your-supabase-url"
export SUPABASE_SERVICE_KEY="your-service-key"
export SUPABASE_STORAGE_BUCKET="screenshots"
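
If you change HTTP_ADDR, SDK clients and scripts need a matching base URL. The sketch below derives one from the listen address. It assumes HTTP_ADDR has the HOST:PORT or :PORT form shown above and that the API lives under the /v1 prefix used in the SDK examples. Both function names are illustrative.

```python
import os


def base_url_from_addr(addr: str, path: str = "/v1") -> str:
    """Turn a listen address such as ":8081" into a client base URL."""
    host, sep, port = addr.rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError(f"expected HOST:PORT or :PORT, got {addr!r}")
    # An empty or wildcard host means "all interfaces"; clients use localhost.
    if host in ("", "0.0.0.0", "[::]"):
        host = "localhost"
    return f"http://{host}:{port}{path}"


def local_base_url(env=os.environ) -> str:
    """Read HTTP_ADDR, falling back to the :8081 value used in this guide."""
    return base_url_from_addr(env.get("HTTP_ADDR", ":8081"))
```

The result can be passed straight to the SDK, for example as base_url=local_base_url() in the Python client.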

Custom Configuration

Create a .env file in your project root:

HTTP_ADDR=:8081
REDIS_ADDR=127.0.0.1:6379
DATA_DIR=./data
REDIS_PASSWORD=your-redis-password
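
Helper scripts can read the same file to stay in sync with the server's settings. This is a minimal sketch of a .env reader, assuming simple KEY=VALUE lines with optional quotes, # comments, and an optional export prefix. It does not describe how Supacrawler itself loads the file, and treating the three core settings as required is an assumption made for illustration.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


# Assumption: the "core settings" listed above are the ones worth checking.
CORE_KEYS = ("HTTP_ADDR", "REDIS_ADDR", "DATA_DIR")


def missing_keys(values: dict, required=CORE_KEYS) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

A script could call parse_env(open(".env").read()) and stop early if missing_keys reports anything.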

Hot Reload Development

For active development with automatic reloading:

# Install Air for hot reloading
go install github.com/air-verse/air@latest

# Set environment variables
export REDIS_ADDR=127.0.0.1:6379
export HTTP_ADDR=:8081

# Run with hot reload
air

Troubleshooting

JavaScript Rendering Issues

If you encounter "please install the driver" errors:

# Install Playwright (pinned to the version listed under Binary Installation)
npm install -g playwright@1.49.1
playwright install chromium --with-deps

Redis Connection Issues

Make sure Redis is running:

# macOS with Homebrew
brew services start redis

# Docker
docker run -d --name redis -p 6379:6379 redis:7-alpine

# Ubuntu/Debian
sudo systemctl start redis-server
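
To confirm that Redis is actually reachable at the address you configured, you can send it a PING. The sketch below does this over a plain socket using Redis's inline command form, so it needs no client library. The function name redis_status is illustrative, and the default address is the REDIS_ADDR value used in this guide.

```python
import socket


def redis_status(addr: str = "127.0.0.1:6379", timeout: float = 2.0) -> str:
    """Send an inline PING and return the first reply, or an error note."""
    host, _, port = addr.rpartition(":")
    try:
        with socket.create_connection((host or "127.0.0.1", int(port)),
                                      timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(128)
    except (OSError, ValueError) as exc:
        # Refused connection, timeout, or a malformed address.
        return f"unreachable: {exc}"
    return reply.decode("utf-8", errors="replace").strip()
```

A healthy, password-free Redis replies with +PONG. A reply beginning with -NOAUTH means the server is up but requires a password, in which case REDIS_PASSWORD needs to be set.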

Port Already in Use

If port 8081 is busy, change the port:

export HTTP_ADDR=":8082"
./supacrawler
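
Before picking a replacement port, you can check which ones are already taken. This is a minimal sketch that treats a port as busy when something accepts a TCP connection on it. The helper names are hypothetical, and the starting port is the 8081 default from this guide.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def first_free_port(start: int = 8081, attempts: int = 20) -> int:
    """Return the first port at or after `start` that nothing listens on."""
    for port in range(start, start + attempts):
        if not port_in_use(port):
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")
```

The result can be used to build the HTTP_ADDR value, for example f":{first_free_port()}". Remember to update the SDK base URL to the same port.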

Next Steps

Explore API Examples to see what you can build
Check out integrations for popular frameworks
Read our blog for advanced use cases
Join our community on GitHub

Production Deployment

Ready to deploy? Check out our production deployment guide or use our managed service at supacrawler.com - 63% cheaper than alternatives!