Local Development
Run Supacrawler locally for development, testing, or self-hosting. Perfect for developers who want full control over their scraping infrastructure.
Quick Start
Option 1: Docker (Recommended)
The fastest way to get Supacrawler running locally:
# Download and start with Docker Compose
curl -O https://raw.githubusercontent.com/supacrawler/supacrawler/main/docker-compose.yml
docker compose up
Your local Supacrawler instance will be available at http://localhost:8081
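Containers can take a few seconds to come up, so scripts that start the stack and then immediately send requests may fail. The following is a minimal sketch of a readiness check that polls the health endpoint used later in this guide. It assumes the endpoint answers with HTTP 200 once the service is ready; the function name, timeout values, and the injectable probe parameter are illustrative and not part of Supacrawler.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(base_url="http://localhost:8081/v1", timeout=30.0,
                       interval=1.0, probe=None):
    """Poll <base_url>/health until it answers with HTTP 200 or the timeout expires."""
    health_url = base_url.rstrip("/") + "/health"

    def default_probe(url):
        # Assumption: a 200 response means "ready". Connection errors and
        # other statuses are treated as "not ready yet".
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                return resp.status == 200
        except (urllib.error.URLError, OSError):
            return False

    probe = probe or default_probe
    deadline = time.monotonic() + timeout
    while True:
        if probe(health_url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_until_healthy() after docker compose up returns True once the instance responds, or False if nothing answers within the timeout.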
Option 2: Binary Installation
For advanced users who prefer native binaries:
- Download from GitHub releases
- Install dependencies: Redis + Node.js + Playwright v1.49.1
- Run:
./supacrawler --redis-addr=127.0.0.1:6379
Using SDKs with Local Instance
Once your local Supacrawler is running, you can use our SDKs by pointing the base URL (base_url in Python, baseUrl in JavaScript/TypeScript) at your local instance:
Python SDK
from supacrawler import SupacrawlerClient
# Point to your local instance
client = SupacrawlerClient(
api_key='anything', # API key not required for local
base_url='http://localhost:8081/v1'
)
# Use normally
result = client.scrape({
'url': 'https://example.com',
'format': 'markdown'
})
print(result.data)
JavaScript/TypeScript SDK
import { SupacrawlerClient } from '@supacrawler/js'
// Point to your local instance
const client = new SupacrawlerClient({
apiKey: 'anything', // API key not required for local
baseUrl: 'http://localhost:8081/v1'
})
// Use normally
const result = await client.scrape({
url: 'https://example.com',
format: 'markdown'
})
console.log(result.data)
Direct HTTP/cURL
You can also make direct HTTP requests:
# Health check
curl http://localhost:8081/v1/health
# Scrape a webpage
curl "http://localhost:8081/v1/scrape?url=https://example.com&format=markdown"
# Take a screenshot
curl -X POST http://localhost:8081/v1/screenshots \
-H 'Content-Type: application/json' \
-d '{"url":"https://example.com","full_page":true}'
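The GET example above works for a simple target such as https://example.com. When the target URL carries its own query string, an unencoded & would be read as a separator for the scrape endpoint's parameters. A small sketch of building the request URL with percent-encoding, using only the Python standard library (the helper name is hypothetical):

```python
from urllib.parse import urlencode


def build_scrape_url(target, fmt="markdown", base_url="http://localhost:8081/v1"):
    """Return the GET /scrape URL with the target URL percent-encoded."""
    # urlencode escapes ':', '/', '?', and '&' inside the target so it
    # travels as a single "url" parameter.
    query = urlencode({"url": target, "format": fmt})
    return f"{base_url.rstrip('/')}/scrape?{query}"
```

The resulting string can be passed to curl or any HTTP client in place of the hand-written URL.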
Local Development Benefits
Cost Savings
- No API costs for development and testing
- Unlimited requests during development
- No rate limits on your local instance
Privacy & Control
- Data stays local - scraped content is not routed through a hosted API
- Full control over infrastructure and scaling
- Custom configurations for specific needs
Development Speed
- Fast feedback with no round trip to a hosted API
- Debug and iterate faster
- Test edge cases without quota concerns
Configuration Options
Environment Variables
# Core settings
export HTTP_ADDR=":8081"
export REDIS_ADDR="127.0.0.1:6379"
export DATA_DIR="./data"
# Optional: Supabase integration
export SUPABASE_URL="your-supabase-url"
export SUPABASE_SERVICE_KEY="your-service-key"
export SUPABASE_STORAGE_BUCKET="screenshots"
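Client scripts can reuse the same HTTP_ADDR value instead of hard-coding the port. The sketch below derives an SDK base URL from it. It assumes HTTP_ADDR has the host:port or :port form shown above and that the API lives under /v1, as in the SDK examples; the function name is hypothetical.

```python
import os


def local_base_url(env=None):
    """Turn an HTTP_ADDR such as ":8081" or "0.0.0.0:8082" into a client base URL."""
    env = os.environ if env is None else env
    addr = env.get("HTTP_ADDR", ":8081")
    host, _, port = addr.rpartition(":")
    # An empty or wildcard host means "listen on all interfaces";
    # a client on the same machine connects via localhost.
    if host in ("", "0.0.0.0"):
        host = "localhost"
    return f"http://{host}:{port}/v1"
```

The return value can be passed as base_url to the Python client shown earlier.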
Custom Configuration
Create a .env file in your project root:
HTTP_ADDR=:8081
REDIS_ADDR=127.0.0.1:6379
DATA_DIR=./data
REDIS_PASSWORD=your-redis-password
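A typo in a .env file is easy to miss. As a rough sanity check, the sketch below parses simple KEY=VALUE lines and reports which keys are missing. It is not a full dotenv parser, and the choice of HTTP_ADDR and REDIS_ADDR as the required set is an assumption made for illustration, not something Supacrawler enforces.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("export "):
            line = line[len("export "):].strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


REQUIRED = ("HTTP_ADDR", "REDIS_ADDR")  # assumed minimum set for this sketch


def missing_keys(values, required=REQUIRED):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

For example, missing_keys(parse_env(open(".env").read())) returns an empty list when both keys are set.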
Hot Reload Development
For active development with automatic reloading:
# Install Air for hot reloading
go install github.com/air-verse/air@latest
# Set environment variables
export REDIS_ADDR=127.0.0.1:6379
export HTTP_ADDR=:8081
# Run with hot reload
air
Troubleshooting
JavaScript Rendering Issues
If you encounter "please install the driver" errors:
# Install Playwright (v1.49.1, the version listed under Binary Installation) and its Chromium dependencies
npm install -g playwright@1.49.1
playwright install chromium --with-deps
Redis Connection Issues
Make sure Redis is running:
# macOS with Homebrew
brew services start redis
# Docker
docker run -d --name redis -p 6379:6379 redis:7-alpine
# Ubuntu/Debian
sudo systemctl start redis-server
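To confirm that Redis is actually reachable at the address you pass as REDIS_ADDR, you can send it a PING. The sketch below does this over a raw socket using Redis's inline command form, so it needs no extra packages. The function name is hypothetical, and a password-protected server will answer with an error line rather than +PONG.

```python
import socket


def redis_ping(addr="127.0.0.1:6379", timeout=2.0):
    """Send an inline PING and return the first reply line, or None if unreachable."""
    host, _, port = addr.rpartition(":")
    try:
        with socket.create_connection((host or "127.0.0.1", int(port)),
                                      timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(128)
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return None
    return reply.decode("utf-8", "replace").strip()
```

A healthy, unauthenticated Redis replies with "+PONG"; a return value of None means nothing accepted the connection at that address.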
Port Already in Use
If port 8081 is busy, change the port:
export HTTP_ADDR=":8082"
./supacrawler
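Before choosing a replacement port, it can help to check which ports are actually free. A minimal sketch using the standard library (both helper names are hypothetical): it treats a successful TCP connection to 127.0.0.1 as "port in use".

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def first_free_port(start=8081, attempts=20):
    """Return the first port at or after `start` with no listener, or None."""
    for port in range(start, start + attempts):
        if not port_in_use(port):
            return port
    return None
```

The port returned by first_free_port() can then be used in HTTP_ADDR, for example HTTP_ADDR=":8082".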
Next Steps
- Explore API Examples to see what you can build
- Check out integrations for popular frameworks
- Read our blog for advanced use cases
- Join our community on GitHub
Production Deployment
Ready to deploy? Check out our production deployment guide or use our managed service at supacrawler.com - 63% cheaper than alternatives!