From Distributed Illusion to Resource Isolation: Architecture Evolution

Reflections on how Firecrawl Lite evolved from an over-engineered distributed solution to a minimalist resource isolation approach.

"Talk is cheap. Show me the code." — Linus Torvalds

This document records how Firecrawl Lite, facing a compute bottleneck, used first-principles thinking to converge from an "over-engineered" distributed design to a "minimalist" resource-isolation approach.

1. Background: The 2C2G Dilemma

Firecrawl Lite runs on a 2C2G server in Singapore, which hosts not only the scraper API but also other core businesses.

Core Conflict:

  • Main Business: Requires low latency, high stability, and a small, stable memory footprint.
  • Scraper Business: Puppeteer is a "memory monster": a single instance consumes roughly 500MB or more. Peak concurrency easily triggers OOM, crashing the main business.
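The arithmetic behind this conflict is worth making explicit. A back-of-envelope check (the memory reserved for the main business and OS is an assumed figure, not a measurement) shows how little headroom a 2G box actually has:

```typescript
// Back-of-envelope check: how many concurrent Puppeteer instances
// fit on a 2G server? RESERVED_FOR_MAIN_MB is an assumption.
const TOTAL_MB = 2048;
const RESERVED_FOR_MAIN_MB = 768; // main business + OS (assumed)
const PER_INSTANCE_MB = 500;      // from the ~500MB+ observation

const maxInstances = Math.floor(
  (TOTAL_MB - RESERVED_FOR_MAIN_MB) / PER_INSTANCE_MB,
);
// Two instances already consume most of the headroom;
// a third concurrent instance risks OOM.
```

Under these assumptions the server can safely host only about two concurrent browser instances, which is why peak traffic tips it over.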

2. The Wrong Path: The Temptation of Distributed Architecture

Facing an apparent compute shortage, and tempted by CNB's free compute, we initially designed a classic Distributed Worker Architecture (feat-distributed-workers):

  • Master: Responsible for queue management and task distribution.
  • Worker: Deployed in CNB containers, actively pulling tasks via HTTP long polling (to traverse NAT).
  • Components: Redis queue, heartbeat detection, auto-scaling, SSRF protection, authentication...

It looked beautiful, but couldn't withstand first-principles scrutiny:

  1. Complexity Explosion: To solve memory contention, a full set of distributed system complexity (service discovery, state synchronization, fault tolerance) was introduced.
  2. Asynchronous Trap: Scrape requests usually need to return within 3-10s. Switching to async + polling degrades the user experience sharply.
  3. YAGNI: Do we really need 10+ concurrency? Or do we just want "no crashes"?

3. Back to Basics: Resource Isolation

Revisiting through first principles, we found the essence of the problem is not "Scalability" but "Resource Isolation".

We don't need infinite computing power; we just need to: Kick the memory-eating Puppeteer out of the 2C2G server.

Solution Evolution

| Solution | Core Idea | Complexity | Cost | Latency | Evaluation |
|---|---|---|---|---|---|
| A. 503 Throttling | Admit limited resources; reject when pool is full | Very Low | $0 | Low | Current best solution, but isolation not solved. |
| B. CNB Worker | Async queue + long polling | High | $0 | High | Over-engineered; poor async experience. |
| C. Add Server | Physical isolation | Low | $$ | Low | Simplest but costly; goes against "saving money". |
| D. Cloudflare Browser Rendering | Serverless browser | Low | $0-5 | Low | High potential; limited to 10min/day, paid expansion needed. |
| E. Browserless on CNB | Cloud Native Dev Env + heartbeat | Medium | $0 | Low | Innovative solution: use CNB features for free isolation. |
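Solution A (the current baseline) deserves a concrete shape. A minimal sketch, assuming a fixed slot count and an Express-style handler; the class name, slot count, and handler shape are illustrative, not Firecrawl Lite's actual code:

```typescript
// Sketch of 503 throttling: a fixed-size slot pool that rejects
// requests instead of letting Puppeteer instances pile up and OOM.
class BrowserPool {
  private inUse = 0;
  constructor(private readonly maxSlots: number) {}

  tryAcquire(): boolean {
    if (this.inUse >= this.maxSlots) return false; // pool full
    this.inUse++;
    return true;
  }

  release(): void {
    this.inUse = Math.max(0, this.inUse - 1);
  }
}

// Handler sketch: reply 503 rather than contend for memory.
function handleScrape(pool: BrowserPool): { status: number } {
  if (!pool.tryAcquire()) return { status: 503 }; // reject, don't crash
  try {
    return { status: 200 }; // ... run the Puppeteer job here ...
  } finally {
    pool.release();
  }
}
```

The point of the pattern is that a refused request is recoverable (the client can retry), while an OOM-killed main business is not.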

4. Final Exploration: Browserless on CNB

We discovered that CNB's "Cloud Native Development Environment" has two key features:

  1. Port Preview: Provides xxx-port.cnb.run public address.
  2. Recycling Mechanism: Recycled after 10 minutes of inactivity, but can run for up to 18 hours.

This inspired a Hybrid Architecture:

```mermaid
graph TD
    User[User] --> Gateway[2C2G Server]

    subgraph "Resource Isolation Strategy"
        Gateway --"1. Try First"--> Remote[CNB Browserless]
        Gateway --"2. Fallback"--> Local[Local Puppeteer]
    end

    Remote --"WebSocket"--> Gateway

    subgraph "CNB Cloud Native Env"
        Browserless[Browserless Docker]
        KeepAlive[Heartbeat Script]
    end

    KeepAlive --"Every 5min"--> Browserless
    Browserless --"Register Address"--> Gateway
```
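The heartbeat in the diagram can be as small as a timer that touches the Browserless container more often than CNB's 10-minute idle window. A minimal sketch; the URL shape and the HEAD request are assumptions (CNB only requires some activity to keep the environment alive):

```typescript
// Sketch: keep the CNB container "active" by pinging it well
// inside the 10-minute idle-recycle window.
const IDLE_RECYCLE_MS = 10 * 60 * 1000; // CNB recycles after 10min idle
const PING_INTERVAL_MS = 5 * 60 * 1000; // ping every 5min (per diagram)

async function ping(url: string): Promise<boolean> {
  try {
    const res = await fetch(url, { method: "HEAD" }); // endpoint shape assumed
    return res.ok;
  } catch {
    return false; // container may already be recycled; gateway falls back
  }
}

function startKeepAlive(url: string): ReturnType<typeof setInterval> {
  if (PING_INTERVAL_MS >= IDLE_RECYCLE_MS) {
    throw new Error("ping interval must beat the idle-recycle window");
  }
  return setInterval(() => void ping(url), PING_INTERVAL_MS);
}
```

The only invariant that matters is `PING_INTERVAL_MS < IDLE_RECYCLE_MS`; the 18-hour hard limit cannot be extended by pinging, which is why the gateway still needs a fallback.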

Core Advantages

  1. Zero Cold Start: Use heartbeat script to keep container online (up to 18h).
  2. Zero Queue: Connect to remote browser directly via WebSocket, returning results synchronously.
  3. High Availability: When CNB is unavailable, automatically fallback to local Puppeteer (with 503 throttling).
  4. Minimal Code: Just change puppeteer.launch() to puppeteer.connect().
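Advantage 4 can be sketched directly. The snippet below models the try-remote-then-fallback decision with injected factories so it stays self-contained; in the real service the two factories would wrap `puppeteer.connect({ browserWSEndpoint })` and `puppeteer.launch()` (the wrapper names here are illustrative):

```typescript
// Sketch of "try CNB Browserless first, degrade to local Puppeteer".
type Browser = { kind: "remote" | "local" };

async function getBrowser(
  connectRemote: () => Promise<Browser>, // stands in for puppeteer.connect()
  launchLocal: () => Promise<Browser>,   // stands in for puppeteer.launch()
): Promise<Browser> {
  try {
    return await connectRemote(); // 1. try the CNB Browserless endpoint
  } catch {
    return await launchLocal();   // 2. graceful degradation to local
  }
}
```

Because both paths return the same browser interface, the rest of the scraping code is unchanged, which is the "minimal code" claim in practice.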

5. Architectural Philosophy Summary

  1. Ask "what is the problem?" before "how do we solve it?": Confirming this was a resource-isolation problem rather than a scalability problem saved us from building a huge distributed system.
  2. Prefer the simplest solution that works: Sync call > Async queue > Auto-scaling.
  3. Leverage existing infrastructure: CNB's development environment itself is a high-quality container runtime; we don't necessarily need Pipeline Triggers.
  4. Graceful degradation is the system's airbag: No matter how unstable external services are, as long as there is a local fallback, the system is robust.

Next Steps