Contributing

Welcome to the Supacrawler community! Learn how to contribute to our open-source, MIT-licensed web scraping platform.

Open Core Philosophy

Supacrawler follows an open-core model. The core scraping engine and API are fully open source under the MIT License, giving you complete freedom to use, modify, and distribute the software. Enterprise features and managed services help sustain development while keeping the core accessible to everyone.

Why Contribute?

  • Impact: Help thousands of developers build better web scraping solutions
  • Learn: Work with modern Go, TypeScript, and browser automation technologies
  • Community: Join a growing community of contributors and users
  • Recognition: Get credit for your contributions and build your portfolio

Getting Started

1. Fork and Clone

# Fork the repository on GitHub first, then clone your fork
git clone https://github.com/YOUR_USERNAME/Supacrawler.git
cd Supacrawler

# Add upstream remote
git remote add upstream https://github.com/Supacrawler/Supacrawler.git

2. Set Up Development Environment

Follow our Local Development Guide for detailed setup instructions.

3. Create a Branch

# Update your main branch
git checkout main
git pull upstream main

# Create a feature branch
git checkout -b feature/your-feature-name

4. Make Your Changes

Write clean, well-documented code that follows our style guidelines (see below).

5. Test Your Changes

# Run tests
make test

# Run linter
make lint

6. Submit a Pull Request

# Commit your changes using the Conventional Commits format (see Commit Messages below)
git add .
git commit -m "type(scope): brief description of your changes"

# Push to your fork
git push origin feature/your-feature-name

Then create a pull request on GitHub.

Contribution Guidelines

Code Style

Go (Backend)

  • Follow Effective Go guidelines
  • Use gofmt for formatting
  • Write descriptive variable names
  • Add comments for exported functions
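
A short sketch of what these guidelines look like in practice; the package and identifiers below are illustrative, not actual Supacrawler code:

// Package retry provides helpers for retrying transient failures.
package retry

import "time"

// Backoff returns the delay to wait before the given retry attempt,
// doubling a base delay each time. Exported identifiers carry doc
// comments, names are descriptive, and formatting is whatever gofmt produces.
func Backoff(baseDelay time.Duration, attempt int) time.Duration {
	delay := baseDelay
	for i := 0; i < attempt; i++ {
		delay *= 2
	}
	return delay
}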

TypeScript (SDKs)

  • Use TypeScript strict mode
  • Follow ESLint configuration
  • Use descriptive type names
  • Document public APIs with JSDoc

General

  • No emojis in logs
  • No ALL CAPS (except proper acronyms like HTTP, SQL)
  • Professional, structured logging with context
  • Include error objects in error logs
  • Keep logs concise and diagnostic
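
A minimal sketch of these logging rules using the standard library's log/slog package; the actual Supacrawler logger interface may differ, and the URL and field names here are illustrative:

package main

import (
	"errors"
	"log/slog"
	"os"
)

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stderr, nil))

	// Structured and concise: a plain message plus key/value context,
	// no emojis, no ALL CAPS.
	logger.Info("scraping webpage",
		"url", "https://example.com",
		"format", "markdown",
	)

	// Error logs include the error object alongside the same context.
	err := errors.New("navigation timeout")
	logger.Error("scrape failed",
		"url", "https://example.com",
		"error", err,
	)
}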

Commit Messages

Follow the Conventional Commits specification:

type(scope): brief description

[optional body]

[optional footer]

Types:

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation changes
  • style: Code style changes (formatting, etc.)
  • refactor: Code refactoring
  • test: Adding or updating tests
  • chore: Build process or auxiliary tool changes

Examples:

feat(api): add support for custom headers in scrape endpoint

fix(worker): resolve memory leak in browser pool

docs(readme): update installation instructions

Pull Request Process

  1. Update Documentation: If you add a feature, update relevant docs
  2. Add Tests: Ensure your changes are covered by tests
  3. Check CI: Make sure all CI checks pass
  4. Request Review: Tag maintainers for review
  5. Address Feedback: Respond to review comments promptly
  6. Squash Commits: Maintainers will squash commits when merging

What to Contribute

Good First Issues

  • Documentation improvements
  • Bug fixes with clear reproduction steps
  • Adding tests for existing features
  • Improving error messages

Feature Requests

  • Check existing issues first
  • Open a discussion before implementing large features
  • Provide use cases and examples

Bug Reports

  • Use the bug report template
  • Include reproduction steps
  • Provide error logs and system info
  • Specify expected vs actual behavior

Development Workflow

Running Locally

# Start Redis
docker run -d -p 6379:6379 redis:7-alpine

# Start Supacrawler with hot reload
export REDIS_ADDR=127.0.0.1:6379
export HTTP_ADDR=:8081
air

Testing

# Run all tests
make test

# Run specific test
go test -v ./internal/core/scrape

# Run with coverage
make test-coverage

Debugging

Use structured logging with context:

logger.Info("scraping webpage",
    "url", targetURL,
    "user_id", userID,
    "format", format,
)

Architecture Overview

  • /cmd: Application entry points
  • /internal: Internal packages (core logic)
    • /core: Scraping, crawling, parsing logic
    • /auth: Authentication and authorization
    • /workers: Background job processing
    • /platform: Third-party integrations
  • /pkg: Public packages (importable by external projects)
  • /openapi: OpenAPI/Swagger specifications
  • /scripts: Build and deployment scripts
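
As a rough illustration of where new core logic would live under this layout (all paths, package, and type names below are hypothetical, not existing Supacrawler code):

// internal/core/scrape/scrape.go (hypothetical path)
//
// Package scrape would hold core page-scraping logic. Because it sits under
// /internal, the Go toolchain prevents code outside this repository from
// importing it; anything meant for external consumption belongs under /pkg.
package scrape

import "context"

// Options describes a single scrape request.
type Options struct {
	URL    string
	Format string
}

// Result holds the output of one scrape.
type Result struct {
	URL     string
	Content string
}

// Run fetches and parses a single page. A real implementation would coordinate
// with the browser pool in /internal/workers; this stub only shows the shape.
func Run(ctx context.Context, opts Options) (*Result, error) {
	return &Result{URL: opts.URL}, nil
}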

License

Supacrawler is released under the MIT License:

MIT License

Copyright (c) 2025 Supacrawler

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

You are free to:

  • Use the software commercially
  • Modify the source code
  • Distribute the software
  • Use privately
  • Sublicense

Community

Recognition

All contributors are recognized on our Contributors page. Significant contributions may be highlighted in release notes.

Thank you for contributing to Supacrawler! 🚀
