Crawling

Crawl mode lets you systematically browse and extract content from multiple pages of a website starting from a single URL.

Options

You can customize the extraction process with the following options:

AI Refinement: Uses a generative AI model to clean up the extracted Markdown. This can help fix formatting issues, remove redundant whitespace, and improve overall readability.
No Cache Mode: Forces the scraper to re-fetch the content from the website, ignoring any previously cached versions.
Custom Body Selector: Specify a CSS selector (e.g., main, #content, .article-body) to target a specific part of the page for extraction. This is useful for noisy pages where you only want the main content.
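
To make No Cache Mode concrete, here is a minimal sketch of how a cache bypass could behave. It is not the tool's actual implementation: the function `fetch_with_cache`, its `no_cache` flag, and the in-memory dictionary cache are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict


def fetch_with_cache(
    url: str,
    fetch: Callable[[str], str],
    cache: Dict[str, str],
    no_cache: bool = False,
) -> str:
    """Return the content for url, reusing a cached copy unless no_cache is set.

    Hypothetical helper: `fetch` stands in for whatever downloads the page.
    """
    # Normal mode: serve the cached version if one exists.
    if not no_cache and url in cache:
        return cache[url]
    # No-cache mode (or a cache miss): re-fetch and refresh the stored copy.
    content = fetch(url)
    cache[url] = content
    return content


if __name__ == "__main__":
    cache: Dict[str, str] = {}

    def fake_fetch(url: str) -> str:
        # Placeholder fetcher so the sketch runs without network access.
        return f"content of {url}"

    fetch_with_cache("https://example.com", fake_fetch, cache)
    # The second call is answered from the cache; the third forces a re-fetch.
    fetch_with_cache("https://example.com", fake_fetch, cache)
    fetch_with_cache("https://example.com", fake_fetch, cache, no_cache=True)
```

In this sketch a forced re-fetch also overwrites the cached entry, so later cached reads see the fresh content. Whether the real scraper does the same is an assumption.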

Crawl Max Pages: Limits the total number of pages the crawler will process.
Crawl Depth: Sets how many links ("clicks") away from the starting page the crawler is allowed to go. At a depth of 1, the crawler only follows links found on the start page itself and goes no further.
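
How the two limits interact can be sketched with a breadth-first traversal. This is an illustrative model under stated assumptions, not the crawler's real code: the `crawl` function, its parameter names, and the in-memory `links` map (standing in for real page fetching and link extraction) are hypothetical, and it assumes the start page itself counts toward the page limit.

```python
from collections import deque
from typing import Dict, List


def crawl(start: str, links: Dict[str, List[str]], max_pages: int, max_depth: int) -> List[str]:
    """Return the pages processed, in breadth-first order.

    `links` maps each URL to the URLs it links to (a stand-in for fetching).
    Assumption: the start page is at depth 0 and counts toward max_pages.
    """
    processed: List[str] = []
    seen = {start}
    queue = deque([(start, 0)])

    while queue and len(processed) < max_pages:
        url, depth = queue.popleft()
        processed.append(url)
        # Do not follow links from pages already at the depth limit.
        if depth >= max_depth:
            continue
        for target in links.get(url, []):
            if target not in seen:  # skip pages already queued or processed
                seen.add(target)
                queue.append((target, depth + 1))
    return processed


if __name__ == "__main__":
    # Hypothetical site structure used only for this example.
    site = {
        "/": ["/a", "/b"],
        "/a": ["/a/1"],
        "/b": ["/"],
        "/a/1": [],
    }
    print(crawl("/", site, max_pages=10, max_depth=1))  # ['/', '/a', '/b']
    print(crawl("/", site, max_pages=2, max_depth=5))   # ['/', '/a']
```

With a depth of 1 the page "/a/1" is never reached because it is two clicks from the start, while a page limit of 2 stops the crawl early regardless of depth.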

After an extraction job is complete, the results are displayed in a tabbed interface.

URL List: Shows a list of all processed URLs and their status. You can click on any URL to view its content.
Preview: Renders the extracted Markdown as formatted text, giving you a clean reading experience.
Code: Shows the raw Markdown source, ready to copy.

You can use the Copy and Download buttons to save the content of the currently viewed result.