Local Development

Run Supacrawler locally for development, testing, or self-hosting.

Quick Start

Option 1: Docker Compose

The fastest way to get Supacrawler running locally:

# Download and start with Docker Compose
curl -O https://raw.githubusercontent.com/supacrawler/supacrawler/main/docker-compose.yml
docker compose up

Your local Supacrawler instance will be available at http://localhost:8081
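
Containers can take a few seconds to accept requests after `docker compose up`. The sketch below polls the `/v1/health` endpoint used elsewhere on this page until it answers. It assumes a healthy instance replies with HTTP 200, which this page does not state; `wait_until_healthy` and `http_probe` are illustrative names, not part of any Supacrawler SDK.

```python
import time
import urllib.error
import urllib.request

# Health endpoint shown on this page; adjust if you changed HTTP_ADDR.
HEALTH_URL = "http://localhost:8081/v1/health"


def http_probe(url):
    """Return True if a GET to `url` answers with HTTP 200 (assumed to mean healthy)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_healthy(url=HEALTH_URL, timeout=60.0, interval=1.0,
                       probe=http_probe, sleep=time.sleep):
    """Poll `probe(url)` until it succeeds or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_until_healthy()` before running tests or SDK calls avoids connection errors during startup. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.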

Option 2: Binary Installation

For advanced users who prefer native binaries:

  1. Download from GitHub releases
  2. Install dependencies: Redis + Node.js + Playwright v1.49.1
  3. Run: ./supacrawler --redis-addr=127.0.0.1:6379

Using SDKs with Local Instance

Once your local Supacrawler is running, point your SDK or HTTP client at it.

Python:

from supacrawler import SupacrawlerClient

# Point to your local instance
client = SupacrawlerClient(
    api_key='anything',  # API key not required for local
    base_url='http://localhost:8081/v1'
)

# Use normally
result = client.scrape(
    url='https://example.com',
    format='markdown'
)

print(result.content)

JavaScript:

import { SupacrawlerClient } from '@supacrawler/js'

// Point to your local instance
const client = new SupacrawlerClient({ 
    apiKey: 'anything',
    baseUrl: 'http://localhost:8081/v1' 
})

// Use normally
const result = await client.scrape({ 
    url: 'https://example.com',
    format: 'markdown' 
})

console.log(result.content)

cURL:

# Health check
curl http://localhost:8081/v1/health

# Scrape a webpage
curl "http://localhost:8081/v1/scrape?url=https://example.com&format=markdown"

# Take a screenshot
curl -X POST http://localhost:8081/v1/screenshots \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://example.com","full_page":true}'
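
The curl scrape call above passes the target URL unencoded, which stops working once the target has its own query string (an `&` in it would be read as a new parameter of the scrape request). As a minimal sketch using only the Python standard library, the helpers below build the same two requests with the target properly encoded. The endpoint paths and JSON fields come from the curl examples; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base URL of the local instance, as used in the examples on this page.
BASE_URL = "http://localhost:8081/v1"


def scrape_url(target, fmt="markdown", base_url=BASE_URL):
    """Build the GET URL for /scrape with `target` percent-encoded."""
    query = urllib.parse.urlencode({"url": target, "format": fmt})
    return f"{base_url}/scrape?{query}"


def screenshot_request(target, full_page=True, base_url=BASE_URL):
    """Build (but do not send) the POST request for /screenshots."""
    body = json.dumps({"url": target, "full_page": full_page}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/screenshots",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send either request, pass the result to `urllib.request.urlopen(...)` while the local instance is running. The response format is not described on this page, so the sketch stops at building the request.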

Benefits

Cost Savings

  • No API costs for development and testing
  • Unlimited requests during development
  • No rate limits on your local instance

Privacy & Control

  • Scraped data stays on your machine, with no calls to a hosted API
  • Full control over infrastructure
  • Custom configurations for specific needs

Development Speed

  • Fast feedback with no round trip to a hosted API
  • Debug and iterate faster
  • Test edge cases without quota concerns

Configuration

Environment Variables

# Core settings
export HTTP_ADDR=":8081"
export REDIS_ADDR="127.0.0.1:6379"
export DATA_DIR="./data"

# Optional: Supabase integration
export SUPABASE_URL="your-supabase-url"
export SUPABASE_SERVICE_KEY="your-service-key"
export SUPABASE_STORAGE_BUCKET="screenshots"

Configuration File

Create a .env file in your project root:

HTTP_ADDR=:8081
REDIS_ADDR=127.0.0.1:6379
DATA_DIR=./data
REDIS_PASSWORD=your-redis-password
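
Helper scripts and tests often need the same settings as the server. This page does not say how the Supacrawler binary itself reads `.env`, so the following is only a sketch for your own Python tooling: it parses `KEY=VALUE` lines (tolerating an `export ` prefix and surrounding quotes) and copies them into the process environment. `parse_env` and `load_env` are illustrative names.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one layer of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env(path=".env"):
    """Copy values from `path` into os.environ without overriding existing ones."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env(handle.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values
```

Using `setdefault` means a variable already exported in the shell takes precedence over the file, which is a common convention but an assumption here.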

Hot Reload Development

For active development with automatic reloading:

# Install Air for hot reloading
go install github.com/air-verse/air@latest

# Set environment variables
export REDIS_ADDR=127.0.0.1:6379
export HTTP_ADDR=:8081

# Run with hot reload
air

Troubleshooting

JavaScript Rendering Issues

If you encounter "please install the driver" errors:

# Install Playwright (v1.49.1, the version listed under Binary Installation)
# and Chromium with its system dependencies
npm install -g playwright@1.49.1
playwright install chromium --with-deps

Redis Connection Issues

Make sure Redis is running:

# macOS (Homebrew): install and start Redis
brew install redis
brew services start redis

# Docker: run a Redis container
docker run -d --name redis -p 6379:6379 redis:7-alpine

# Ubuntu/Debian: start the Redis service
sudo systemctl start redis-server
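
To confirm that something is actually answering at `REDIS_ADDR`, a raw socket check is enough and needs no Redis client library. The sketch below sends an inline `PING`; an open Redis server normally replies `+PONG`, while a password-protected one is expected to reply with an error such as `-NOAUTH ...`, which still shows the server is reachable. `redis_ping` is an illustrative helper, not part of Supacrawler.

```python
import socket


def redis_ping(host="127.0.0.1", port=6379, timeout=2.0):
    """Send an inline PING and return the reply text, or None if unreachable."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(128)
    except OSError:
        # Connection refused, timed out, or reset: nothing usable at host:port.
        return None
    return reply.decode("utf-8", errors="replace").strip()
```

A return value of `None` points to Redis not running or listening on a different address. Any string reply means the connection itself works, and the remaining problem (if any) is likely authentication or configuration.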

Port Already in Use

If port 8081 is busy, change the port:

export HTTP_ADDR=":8082"
./supacrawler
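
Before picking a replacement port, it can help to check which ones are free. The sketch below tries to bind a TCP socket on the loopback interface and reports whether that succeeds. It is an approximation, since `HTTP_ADDR=":8081"` listens on all interfaces, and the helper names are hypothetical.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def first_free_port(start=8081, attempts=20):
    """Return the first free port in [start, start + attempts), capped at 65535."""
    for port in range(start, min(start + attempts, 65536)):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port found starting at {start}")
```

For example, `first_free_port(8081)` returns a candidate you can then export as `HTTP_ADDR=":<port>"`. Another process could still claim the port between the check and the server start, so treat the result as a hint.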

Development Tools

VS Code

  • Go extension
  • ESLint
  • Prettier
  • GitLens

JetBrains IDEs

  • GoLand (for Go development)
  • WebStorm (for TypeScript/JavaScript)

Code Quality Tools

# Format Go code
gofmt -w .

# Run Go linter
golangci-lint run

# Format TypeScript
npm run format

# Run TypeScript linter
npm run lint

Next Steps

Production Deployment

Ready to deploy? Check out our Self-Hosting Guide or use our managed service at supacrawler.com - 63% cheaper than alternatives with zero maintenance!