From Distributed Illusion to Resource Isolation: Architecture Evolution

Reflections on how Firecrawl Lite evolved from an over-engineered distributed solution to a minimalist resource isolation approach.

"Talk is cheap. Show me the code." — Linus Torvalds

This document records how Firecrawl Lite, facing a compute bottleneck, used first-principles thinking to converge from an "over-engineered" distributed design to a "minimalist" resource-isolation approach.

1. Background: The 2C2G Dilemma

Firecrawl Lite runs on a 2C2G server in Singapore, which hosts not only the scraper API but also other core businesses.

Core Conflict:

  • Main Business: Requires low latency, high stability, and a small, stable memory footprint.
  • Scraper Business: Puppeteer is a "memory monster": a single instance consumes roughly 500MB or more. Peak concurrency easily triggers OOM, crashing the main business.
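The arithmetic behind this conflict is worth making explicit. A back-of-envelope check (the memory reserved for the main business and OS is an assumed figure, not a measurement) shows how little headroom a 2G box actually has:

```typescript
// Back-of-envelope check: how many concurrent Puppeteer instances
// fit on a 2G server? RESERVED_FOR_MAIN_MB is an assumption.
const TOTAL_MB = 2048;
const RESERVED_FOR_MAIN_MB = 768; // main business + OS (assumed)
const PER_INSTANCE_MB = 500;      // from the ~500MB+ observation

const maxInstances = Math.floor(
  (TOTAL_MB - RESERVED_FOR_MAIN_MB) / PER_INSTANCE_MB,
);
// Two instances already consume most of the headroom;
// a third concurrent instance risks OOM.
```

Under these assumptions the server can safely host only about two concurrent browser instances, which is why peak traffic tips it over.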

2. The Wrong Path: The Temptation of Distributed Architecture

Facing an apparent compute shortage, and tempted by CNB's free compute, we initially designed a classic Distributed Worker Architecture (feat-distributed-workers):

  • Master: Responsible for queue management and task distribution.
  • Worker: Deployed in CNB containers, actively pulling tasks via HTTP long polling (to traverse NAT).
  • Components: Redis queue, heartbeat detection, auto-scaling, SSRF protection, authentication...

It looked beautiful, but couldn't withstand first-principles scrutiny:

  1. Complexity Explosion: To solve memory contention, a full set of distributed system complexity (service discovery, state synchronization, fault tolerance) was introduced.
  2. Asynchronous Trap: Scrape requests usually need to return within 3-10s. Switching to async + polling degrades the user experience sharply.
  3. YAGNI: Do we really need 10+ concurrency? Or do we just want "no crashes"?

3. Back to Basics: Resource Isolation

Revisiting through first principles, we found the essence of the problem is not "Scalability" but "Resource Isolation".

We don't need infinite computing power; we just need to: Kick the memory-eating Puppeteer out of the 2C2G server.

Solution Evolution

| Solution | Core Idea | Complexity | Cost | Latency | Evaluation |
|---|---|---|---|---|---|
| A. 503 Throttling | Admit limited resources; reject when pool is full | Very Low | $0 | Low | Current best solution, but isolation not solved. |
| B. CNB Worker | Async queue + long polling | High | $0 | High | Over-engineered; poor async experience. |
| C. Add Server | Physical isolation | Low | $$ | Low | Simplest but costly; goes against "saving money". |
| D. Cloudflare Browser Rendering | Serverless browser | Low | $0-5 | Low | High potential; limited to 10min/day, paid expansion needed. |
| E. Browserless on CNB | Cloud Native Dev Env + heartbeat | Medium | $0 | Low | Innovative solution: use CNB features for free isolation. |
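Solution A (the current baseline) deserves a concrete shape. A minimal sketch, assuming a fixed slot count and an Express-style handler; the class name, slot count, and handler shape are illustrative, not Firecrawl Lite's actual code:

```typescript
// Sketch of 503 throttling: a fixed-size slot pool that rejects
// requests instead of letting Puppeteer instances pile up and OOM.
class BrowserPool {
  private inUse = 0;
  constructor(private readonly maxSlots: number) {}

  tryAcquire(): boolean {
    if (this.inUse >= this.maxSlots) return false; // pool full
    this.inUse++;
    return true;
  }

  release(): void {
    this.inUse = Math.max(0, this.inUse - 1);
  }
}

// Handler sketch: reply 503 rather than contend for memory.
function handleScrape(pool: BrowserPool): { status: number } {
  if (!pool.tryAcquire()) return { status: 503 }; // reject, don't crash
  try {
    return { status: 200 }; // ... run the Puppeteer job here ...
  } finally {
    pool.release();
  }
}
```

The point of the pattern is that a refused request is recoverable (the client can retry), while an OOM-killed main business is not.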

4. Final Exploration: Browserless on CNB

We discovered that CNB's "Cloud Native Development Environment" has two key features:

  1. Port Preview: Provides xxx-port.cnb.run public address.
  2. Recycling Mechanism: Recycled after 10 minutes of inactivity, but can run for up to 18 hours.

This inspired a Hybrid Architecture:

```mermaid
graph TD
    User[User] --> Gateway[2C2G Server]

    subgraph "Resource Isolation Strategy"
        Gateway --"1. Try First"--> Remote[CNB Browserless]
        Gateway --"2. Fallback"--> Local[Local Puppeteer]
    end

    Remote --"WebSocket"--> Gateway

    subgraph "CNB Cloud Native Env"
        Browserless[Browserless Docker]
        KeepAlive[Heartbeat Script]
    end

    KeepAlive --"Every 5min"--> Browserless
    Browserless --"Register Address"--> Gateway
```
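The heartbeat in the diagram can be as small as a timer that touches the Browserless container more often than CNB's 10-minute idle window. A minimal sketch; the URL shape and the HEAD request are assumptions (CNB only requires some activity to keep the environment alive):

```typescript
// Sketch: keep the CNB container "active" by pinging it well
// inside the 10-minute idle-recycle window.
const IDLE_RECYCLE_MS = 10 * 60 * 1000; // CNB recycles after 10min idle
const PING_INTERVAL_MS = 5 * 60 * 1000; // ping every 5min (per diagram)

async function ping(url: string): Promise<boolean> {
  try {
    const res = await fetch(url, { method: "HEAD" }); // endpoint shape assumed
    return res.ok;
  } catch {
    return false; // container may already be recycled; gateway falls back
  }
}

function startKeepAlive(url: string): ReturnType<typeof setInterval> {
  if (PING_INTERVAL_MS >= IDLE_RECYCLE_MS) {
    throw new Error("ping interval must beat the idle-recycle window");
  }
  return setInterval(() => void ping(url), PING_INTERVAL_MS);
}
```

The only invariant that matters is `PING_INTERVAL_MS < IDLE_RECYCLE_MS`; the 18-hour hard limit cannot be extended by pinging, which is why the gateway still needs a fallback.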

Core Advantages

  1. Zero Cold Start: Use heartbeat script to keep container online (up to 18h).
  2. Zero Queue: Connect to remote browser directly via WebSocket, returning results synchronously.
  3. High Availability: When CNB is unavailable, automatically fallback to local Puppeteer (with 503 throttling).
  4. Minimal Code: Just change puppeteer.launch() to puppeteer.connect().
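Advantage 4 can be sketched directly. The snippet below models the try-remote-then-fallback decision with injected factories so it stays self-contained; in the real service the two factories would wrap `puppeteer.connect({ browserWSEndpoint })` and `puppeteer.launch()` (the wrapper names here are illustrative):

```typescript
// Sketch of "try CNB Browserless first, degrade to local Puppeteer".
type Browser = { kind: "remote" | "local" };

async function getBrowser(
  connectRemote: () => Promise<Browser>, // stands in for puppeteer.connect()
  launchLocal: () => Promise<Browser>,   // stands in for puppeteer.launch()
): Promise<Browser> {
  try {
    return await connectRemote(); // 1. try the CNB Browserless endpoint
  } catch {
    return await launchLocal();   // 2. graceful degradation to local
  }
}
```

Because both paths return the same browser interface, the rest of the scraping code is unchanged, which is the "minimal code" claim in practice.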

5. Architectural Philosophy Summary

  1. Ask "what is the problem?" before "how do we solve it?": Confirming this was a resource-isolation problem rather than a scalability problem saved us from building a huge distributed system.
  2. Prefer the simplest solution that works: Sync call > Async queue > Auto-scaling.
  3. Leverage existing infrastructure: CNB's development environment itself is a high-quality container runtime; we don't necessarily need Pipeline Triggers.
  4. Graceful degradation is the system's airbag: No matter how unstable external services are, as long as there is a local fallback, the system is robust.

Next Steps