- Manages scraping complexity: proxies, caching, rate limits, and JS-blocked content
- Handles dynamic content: JS-rendered sites, PDFs, and images
- Outputs clean markdown, structured data, screenshots, or HTML
Try it in the Playground
Test scraping in the interactive playground — no code required.
Scraping a URL with Firecrawl
/scrape endpoint
Used to scrape a URL and get its content.

Installation
Usage
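A minimal sketch of what a scrape request looks like, assuming the v1 REST endpoint at `api.firecrawl.dev` and an API key in the `FIRECRAWL_API_KEY` environment variable (the SDKs wrap this same request; the exact path may differ by API version):

```python
# Minimal sketch of a /scrape request body; the endpoint path and auth header
# shape are assumptions based on the v1 REST API.
import json
import os

payload = {
    "url": "https://example.com",
    "formats": ["markdown"],
}

headers = {
    "Authorization": f"Bearer {os.environ.get('FIRECRAWL_API_KEY', '')}",
    "Content-Type": "application/json",
}

# The actual call would be an authenticated POST, e.g.:
# requests.post("https://api.firecrawl.dev/v1/scrape", headers=headers, json=payload)
print(json.dumps(payload, indent=2))
```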
Each scrape consumes 1 credit. Additional credits apply for certain options: JSON mode costs 4 additional credits per page, enhanced proxy costs 4 additional credits per page, and PDF parsing costs 1 credit per PDF page.
Response
SDKs return the data object directly, while cURL returns the full response payload.

Scrape Formats
You can choose which formats you want your output in, and you can specify multiple formats per scrape. Supported formats are:

- Markdown (`markdown`)
- Summary (`summary`)
- HTML (`html`) - a cleaned version of the page’s HTML
- Raw HTML (`rawHtml`) - unmodified HTML as received from the page
- Screenshot (`screenshot`, with options like `fullPage`, `quality`, and `viewport`) - screenshot URLs expire after 24 hours
- Links (`links`)
- JSON (`json`) - structured output
- Images (`images`) - extract all image URLs from the page
- Branding (`branding`) - extract brand identity and design system
- Audio (`audio`) - extract MP3 audio from supported video URLs, e.g. YouTube (returns a signed GCS URL that expires after 1 hour)
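Several formats can be requested in one call. Here is a sketch of such a request body; the object form for passing screenshot options is an assumption about the API’s shape:

```python
# Sketch of a multi-format request. The string format names come from the
# list above; the {"type": "screenshot", ...} object form for options is an
# assumption, not confirmed API shape.
payload = {
    "url": "https://example.com",
    "formats": [
        "markdown",
        "links",
        {"type": "screenshot", "fullPage": True, "quality": 80},
    ],
}

# Separate the plain string formats from option-carrying ones:
string_formats = [f for f in payload["formats"] if isinstance(f, str)]
```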
Extract structured data
/scrape (with json) endpoint
Used to extract structured data from scraped pages.
Extracting without schema
You can extract without a schema by passing only a `prompt` to the endpoint. The LLM then chooses the structure of the data.
JSON format options
When using the `json` format, pass an object inside `formats` with the following parameters:

- `schema`: JSON Schema for the structured output.
- `prompt`: optional prompt to help guide extraction when a schema is present or when you prefer light guidance.
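A sketch of the `json` format with both parameters set. The product schema below is a hypothetical example, and the exact nesting of the options inside `formats` is an assumption about the request shape:

```python
# Hedged sketch of a structured-extraction request: schema fields and the
# nesting of json options inside "formats" are illustrative assumptions.
payload = {
    "url": "https://example.com",
    "formats": [
        {
            "type": "json",
            "prompt": "Extract the product name and price.",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "price": {"type": "number"},
                },
                "required": ["name", "price"],
            },
        }
    ],
}

json_format = payload["formats"][0]
```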
Extract brand identity
/scrape (with branding) endpoint
The branding format extracts comprehensive brand identity information from a webpage, including colors, fonts, typography, spacing, UI components, and more. This is useful for design system analysis, brand monitoring, or building tools that need to understand a website’s visual identity.

Response
The branding format returns a comprehensive `BrandingProfile` object with the following structure:
Branding Profile Structure
The `branding` object contains the following properties:

- `colorScheme`: the detected color scheme (`"light"` or `"dark"`)
- `logo`: URL of the primary logo
- `colors`: object containing brand colors:
  - `primary`, `secondary`, `accent`: main brand colors
  - `background`, `textPrimary`, `textSecondary`: UI colors
  - `link`, `success`, `warning`, `error`: semantic colors
- `fonts`: array of font families used on the page
- `typography`: detailed typography information:
  - `fontFamilies`: primary, heading, and code font families
  - `fontSizes`: size definitions for headings and body text
  - `fontWeights`: weight definitions (light, regular, medium, bold)
  - `lineHeights`: line height values for different text types
- `spacing`: spacing and layout information:
  - `baseUnit`: base spacing unit in pixels
  - `borderRadius`: default border radius
  - `padding`, `margins`: spacing values
- `components`: UI component styles:
  - `buttonPrimary`, `buttonSecondary`: button styles
  - `input`: input field styles
- `icons`: icon style information
- `images`: brand images (logo, favicon, og:image)
- `animations`: animation and transition settings
- `layout`: layout configuration (grid, header/footer heights)
- `personality`: brand personality traits (tone, energy, target audience)
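To make the shape concrete, here is a hypothetical parsed `branding` object using a subset of the fields listed above; all values are made up, not real API output:

```python
# Hypothetical branding payload matching the documented BrandingProfile shape;
# every value here is illustrative, not real API output.
branding = {
    "colorScheme": "light",
    "logo": "https://example.com/logo.svg",
    "colors": {"primary": "#ff6b35", "background": "#ffffff", "error": "#dc2626"},
    "fonts": ["Inter", "JetBrains Mono"],
    "spacing": {"baseUnit": 4, "borderRadius": "8px"},
    "personality": {"tone": "friendly"},
}

# Pull a few values the way downstream code might:
primary_color = branding["colors"]["primary"]
base_unit_px = branding["spacing"]["baseUnit"]
```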
Combining with other formats
You can combine the branding format with other formats to get comprehensive page data.

Audio extraction
The `audio` format extracts audio from supported websites (e.g. YouTube) as MP3 files and returns a signed Google Cloud Storage URL. This is useful for building audio processing pipelines, transcription services, or podcast tools.
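A sketch of an audio-format request; the video URL is illustrative, and since the signed URL in the response expires after one hour, it should be downloaded promptly:

```python
# Sketch of an audio extraction request. The video URL is illustrative; the
# response is assumed to carry a signed GCS URL valid for one hour.
payload = {
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "formats": ["audio"],
}

SIGNED_URL_TTL_SECONDS = 60 * 60  # per the docs, signed audio URLs last 1 hour
```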
Interacting with the page with Actions
Firecrawl allows you to perform various actions on a web page before scraping its content. This is particularly useful for interacting with dynamic content, navigating through pages, or accessing content that requires user interaction. Here is an example flow: navigate to google.com, search for Firecrawl, click on the first result, and take a screenshot. In almost all cases, use the `wait` action before/after executing other actions to give the page enough time to load.
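The flow above might be encoded as an `actions` array like the one below. The action type names mirror Firecrawl’s documented actions, but the specific selectors and wait timings are illustrative guesses:

```python
# Sketch of the described action sequence: wait, type a query, press enter,
# click the first result, screenshot. Selectors and timings are guesses.
payload = {
    "url": "https://www.google.com",
    "formats": ["markdown"],
    "actions": [
        {"type": "wait", "milliseconds": 2000},
        {"type": "click", "selector": "textarea[name='q']"},
        {"type": "write", "text": "firecrawl"},
        {"type": "press", "key": "ENTER"},
        {"type": "wait", "milliseconds": 3000},
        {"type": "click", "selector": "h3"},
        {"type": "wait", "milliseconds": 3000},
        {"type": "screenshot"},
    ],
}

wait_steps = [a for a in payload["actions"] if a["type"] == "wait"]
```

Note how `wait` actions bracket the navigation steps, per the advice above.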
Location and Language
Specify country and preferred languages to get relevant content based on your target location and language preferences.

How it works
When you specify the location settings, Firecrawl will use an appropriate proxy if available and emulate the corresponding language and timezone settings. By default, the location is set to ‘US’ if not specified.

Usage
To use the location and language settings, include the `location` object in your request body with the following properties:

- `country`: ISO 3166-1 alpha-2 country code (e.g., ‘US’, ‘AU’, ‘DE’, ‘JP’). Defaults to ‘US’.
- `languages`: an array of preferred languages and locales for the request, in order of priority. Defaults to the language of the specified location.
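For example, a request targeting German users might set the `location` object like this (a sketch; the target URL is illustrative):

```python
# Sketch of a location-aware request: ISO 3166-1 alpha-2 country code and
# languages in priority order, per the properties above.
payload = {
    "url": "https://example.com",
    "formats": ["markdown"],
    "location": {"country": "DE", "languages": ["de", "en"]},
}
```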
Caching and maxAge
To make requests faster, Firecrawl serves results from cache by default when a recent copy is available.

- Default freshness window: `maxAge = 172800000` ms (2 days). If a cached page is newer than this, it’s returned instantly; otherwise, the page is scraped and then cached.
- Performance: this can speed up scrapes by up to 5x when data doesn’t need to be ultra-fresh.
- Always fetch fresh: set `maxAge` to `0`. Note that this bypasses the cache entirely, so every request goes through the full scraping pipeline, meaning that the request will take longer to complete and is more likely to fail. Use a non-zero `maxAge` if freshness on every request is not critical.
- Avoid storing: set `storeInCache` to `false` if you don’t want Firecrawl to cache/store results for this request.
- Cache-only lookup: set `minAge` to perform a cache-only lookup without triggering a fresh scrape. The value is in milliseconds and specifies the minimum age the cached data must be. If no cached data is found, a `404` with error code `SCRAPE_NO_CACHED_DATA` is returned. Set `minAge` to `1` to accept any cached data regardless of age.
- Change tracking: requests that include `changeTracking` bypass the cache, so `maxAge` is ignored.
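The caching knobs above can be sketched as request bodies (target URLs are illustrative); note how the 2-day default works out to 172800000 ms:

```python
# The documented maxAge default spelled out: 2 days in milliseconds.
TWO_DAYS_MS = 2 * 24 * 60 * 60 * 1000  # 172800000

# Accept cached results up to 10 minutes old, otherwise scrape fresh:
payload_fresh_enough = {"url": "https://example.com", "maxAge": 10 * 60 * 1000}

# Cache-only lookup: accept any cached copy, never trigger a fresh scrape
# (a 404 with SCRAPE_NO_CACHED_DATA comes back if nothing is cached):
payload_cache_only = {"url": "https://example.com", "minAge": 1}
```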
Batch scraping multiple URLs
You can batch scrape multiple URLs at the same time. The call takes the starting URLs and optional parameters as arguments; the params argument allows you to specify additional options for the batch scrape job, such as the output formats.

How it works
It is very similar to how the `/crawl` endpoint works. It submits a batch scrape job and returns a job ID to check the status of the batch scrape.
The SDKs provide two methods, synchronous and asynchronous. The synchronous method returns the results of the batch scrape job, while the asynchronous method returns a job ID that you can use to check the status of the batch scrape.
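A sketch of the batch request body, assuming a `/batch/scrape` endpoint that accepts a `urls` array plus the same options as `/scrape` (the URLs are illustrative):

```python
# Sketch of a batch scrape request: a urls array plus shared scrape options.
# The endpoint shape is an assumption based on the surrounding docs.
payload = {
    "urls": [
        "https://example.com",
        "https://example.org",
    ],
    "formats": ["markdown"],
}
```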
Response
If you’re using the sync methods from the SDKs, they return the results of the batch scrape job directly. Otherwise, you get a job ID that you can use to check the status of the batch scrape.

Synchronous
Asynchronous
You can then use the job ID to check the status of the batch scrape by calling the `/batch/scrape/{id}` endpoint. This endpoint is meant to be used while the job is still running or right after it has completed, since batch scrape jobs expire after 24 hours.
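A polling sketch for the status endpoint; the job ID below is a placeholder, and the status value `"completed"` is an assumption about the response:

```python
# Sketch of polling a batch scrape job. The job ID is a placeholder standing
# in for the ID returned by the submit call; status values are assumptions.
job_id = "00000000-0000-0000-0000-000000000000"  # placeholder
status_url = f"https://api.firecrawl.dev/v1/batch/scrape/{job_id}"

# Polling loop (network call omitted in this sketch):
# while True:
#     status = requests.get(status_url, headers=headers).json()
#     if status.get("status") == "completed":
#         break
#     time.sleep(2)
```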
Enhanced Mode
For complex websites, Firecrawl offers enhanced mode that provides better success rates while maintaining privacy. Learn more about Enhanced Mode.

Zero Data Retention (ZDR)
Firecrawl supports Zero Data Retention (ZDR) for teams with strict data handling requirements. When enabled, Firecrawl will not persist any page content or extracted data beyond the lifetime of the request. To enable ZDR, set `zeroDataRetention: true` in your request:
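A ZDR request sketch, including a client-side guard for the restriction (noted below) that screenshots cannot be combined with ZDR; the helper function is hypothetical, not part of any SDK:

```python
# Sketch of a ZDR request with a hypothetical client-side guard for the
# documented screenshot/ZDR incompatibility.
payload = {
    "url": "https://example.com",
    "formats": ["markdown"],
    "zeroDataRetention": True,
}

def uses_screenshot(formats):
    """True if any requested format is a screenshot (string or object form)."""
    return any(
        f == "screenshot" or (isinstance(f, dict) and f.get("type") == "screenshot")
        for f in formats
    )

# Screenshots require persistent storage, so they can't be combined with ZDR:
assert not (payload["zeroDataRetention"] and uses_screenshot(payload["formats"]))
```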
Screenshots are not available in ZDR mode. Because screenshots require uploading to persistent storage, they are incompatible with the ZDR guarantee. Requests that include both `zeroDataRetention: true` and a screenshot format will return an error.

Are you an AI agent that needs a Firecrawl API key? See firecrawl.dev/agent-onboarding/SKILL.md for automated onboarding instructions.

