Firecrawl · Excel Guide

Map All Site URLs Into Excel for a Content Audit

2026-05-14

The Scenario

Your agency's SEO lead just handed you a project: audit the content coverage of a 500-page marketing site before recommending a content strategy. The brief says to identify thin-content pages, categorize what's there, and find the gaps.

Nobody has a current sitemap. The XML sitemap file is two years out of date — it's missing an entire product section that launched last spring.

The bad version:

  • Crawl the site manually by clicking through the navigation, copying each URL into the workbook as you discover it
  • Miss the pages that aren't in the navigation — the orphan blog posts, the old landing pages, the A/B test variants still indexed by Google
  • Get to page 120 and realize you have no idea how many pages are left or whether you've found all the sections

A content audit built on an incomplete URL list isn't an audit. It's a guess with extra steps.

The Easy Way: One Prompt in SheetXAI

SheetXAI is an AI agent that lives inside your Excel workbook. It reads your data, and through its built-in Firecrawl integration it can map the discoverable URLs on a website by following internal links from the homepage outward, then write each URL into your workbook as a starting point for the audit.

Map all URLs on https://www.mycompany.com using Firecrawl and write each URL into column A of my "URL Inventory" worksheet — one URL per row. Then classify each URL as "Blog", "Landing Page", "Product Page", "Case Study", or "Other" based on the URL path and write the classification into column B.

What You Get

  • Column A with every URL Firecrawl discovered, including pages not linked from the navigation
  • Column B with a page type classification based on the URL slug — blog posts, product pages, case studies, and landing pages separated out
  • The full list in one pass, ready to be sorted, filtered, and enriched with additional columns for word count, last-updated dates, or conversion data
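
For readers who want to see what a path-based classification amounts to, here is a minimal Python sketch. It is not how SheetXAI classifies pages internally; the prefixes in RULES (/blog/, /products/, /case-studies/, /lp/) are hypothetical and would need to match the site's real URL structure.

```python
from urllib.parse import urlparse

# Hypothetical path-prefix rules; replace with the site's real sections.
RULES = [
    ("/blog/", "Blog"),
    ("/products/", "Product Page"),
    ("/case-studies/", "Case Study"),
    ("/lp/", "Landing Page"),
]


def classify_url(url: str) -> str:
    """Return a page-type label based only on the URL path."""
    path = urlparse(url).path.lower()
    # Normalize so "/products" and "/products/" are treated the same way.
    if not path.endswith("/"):
        path += "/"
    for prefix, label in RULES:
        if path.startswith(prefix):
            return label
    return "Other"
```

A slug-only classifier is a rough first pass: a page at /resources/acme-story would land in "Other" even if it is really a case study, so the "Other" bucket is worth a manual review.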

What If the Data Is Not Quite Ready

The site has a /staging/ subdirectory that I don't want in the audit

Map all URLs on https://www.mycompany.com with Firecrawl, but exclude any URL whose path contains /staging/ or /preview/. Write results into column A with page type in column B.
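
If you want to double-check the exclusion on an exported URL list, the same rule can be sketched in a few lines of Python. This is an illustrative check, not the integration's own filter. It matches whole path segments, so it also drops a bare /staging with no trailing slash, which is slightly broader than the wording of the prompt.

```python
from urllib.parse import urlparse

# Path segments to leave out of the audit, taken from the prompt above.
EXCLUDED_SEGMENTS = {"staging", "preview"}


def keep_url(url: str) -> bool:
    """True if no path segment is exactly 'staging' or 'preview'."""
    segments = urlparse(url).path.lower().split("/")
    return not any(segment in EXCLUDED_SEGMENTS for segment in segments)


def filter_urls(urls):
    """Return only the URLs that belong in the audit, order preserved."""
    return [url for url in urls if keep_url(url)]
```

Matching whole segments rather than substrings keeps a page such as /blog/staging-environments-explained in the inventory.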

I need a rough signal for which pages in the inventory are likely indexed by Google, not just crawlable

In the "URL Inventory" worksheet, I need a column C that flags each URL as "likely indexed" or "may be excluded" based on whether the URL slug follows canonical patterns (no query strings, no session tokens, no trailing numeric IDs that suggest pagination). Apply the classification to all rows in column A.
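
The pattern check described in that prompt can be approximated as follows. This is a hedged sketch of the heuristic only: it looks at the URL string and cannot confirm whether Google has actually indexed a page. The session-token names checked (jsessionid, phpsessid) are assumptions chosen as common examples.

```python
import re
from urllib.parse import urlparse

# A path ending in a bare number, e.g. /blog/page/3, often indicates pagination.
TRAILING_NUMBER = re.compile(r"/\d+/?$")

# Assumed examples of session-token names; extend for the site in question.
SESSION_MARKERS = ("jsessionid", "phpsessid")


def index_flag(url: str) -> str:
    """Label a URL from its shape alone; this does not query Google."""
    parts = urlparse(url)
    if parts.query or parts.params:
        return "may be excluded"
    if any(marker in url.lower() for marker in SESSION_MARKERS):
        return "may be excluded"
    if TRAILING_NUMBER.search(parts.path):
        return "may be excluded"
    return "likely indexed"
```

Treat the output as a triage column. Confirming index status requires a source that reports on Google's index, such as Search Console.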

The site has both English and French versions under /en/ and /fr/ — I only want the English pages

Map all URLs on https://www.mycompany.com, exclude any URL whose path starts with /fr/, and write the English URLs into column A. Classify each as "Blog", "Landing Page", "Product Page", "Case Study", or "Other" in column B.
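
One detail worth checking in the results is that the language filter tests for the /fr/ directory and not just the letters "fr". A minimal sketch of the distinction, using a hypothetical helper name:

```python
from urllib.parse import urlparse


def is_french(url: str) -> bool:
    """True only when the first path segment is exactly 'fr'."""
    path = urlparse(url).path.lower()
    return path == "/fr" or path.startswith("/fr/")
```

A looser test such as path.startswith("/fr") would also discard English pages like /free-trial or /frameworks, so spot-check a few rows that begin with those letters.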

The full audit pipeline: discover, classify, flag thin-content candidates, and sort by priority

Map all discoverable URLs on https://www.mycompany.com with Firecrawl. Write URLs into column A and page-type classification into column B. Then scrape the content of each URL and write the approximate word count into column C. Flag any page with fewer than 300 words as "thin content" in column D. Sort the final list by page type in column B, then by word count ascending in column C, so the thinnest pages in each category appear first.

One pass: discovery, classification, content-depth scoring, and prioritization.
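
The flag-and-sort step at the end of that pipeline is simple enough to sketch in Python. This assumes the page text has already been scraped; the PageRow type and helper names are hypothetical, and counting whitespace-separated tokens is only one approximation of "word count".

```python
from dataclasses import dataclass

THIN_THRESHOLD = 300  # words; the cutoff used in the prompt above


@dataclass
class PageRow:
    url: str
    page_type: str
    word_count: int

    @property
    def flag(self) -> str:
        """Column D: 'thin content' below the threshold, otherwise blank."""
        return "thin content" if self.word_count < THIN_THRESHOLD else ""


def count_words(text: str) -> int:
    """Approximate word count: whitespace-separated tokens."""
    return len(text.split())


def prioritize(rows):
    """Sort by page type, then by word count ascending within each type."""
    return sorted(rows, key=lambda row: (row.page_type, row.word_count))
```

Sorting on the tuple (page type, word count) is what puts the thinnest pages at the top of each category. The 300-word cutoff is a starting point: a contact page or pricing table can be short without being a content problem.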

Try It

Get the 7-day free trial of SheetXAI, open a blank workbook in Excel, and ask it to map every URL on your site and classify the pages by type so you can start your audit with a complete inventory. For setup, see the hub guide: How to Connect Firecrawl to Excel. Also see: Extract Products From an E-Commerce Category Page Into an Excel Workbook.
