Crawl a Competitor Website and Export URLs to a Google Sheet

The Scenario

Your head of content just asked you to audit a competitor's site. Not a quick glance — a real content inventory: every URL, every page title, what topics they are covering, where the gaps are relative to your own content plan. You have the competitor's homepage URL. Everything else needs to be discovered.

The bad version:

Start manually clicking through the competitor site, copying URLs into a sheet one by one, guessing at what else is linked from each page you visit
Try a free online crawler, hit a 50-page cap, get a CSV with inconsistent formatting, and spend 40 minutes cleaning it before it is usable in your sheet
Give up on completeness and run the audit on a subset you collected manually, knowing the analysis will have blind spots

A content gap analysis built on incomplete crawl data produces incomplete conclusions. Your content calendar for next quarter is going to be built on this audit.

The Easy Way: One Prompt in SheetXAI

SheetXAI is an AI agent that lives inside your Google Sheet. Put the competitor's URL in cell A1, and SheetXAI uses Scrapfly's crawler to discover the full site — returning every internal URL and page title — and writes the results into a new tab without you touching a command line or a third-party crawler interface.

Type this prompt

Create a Scrapfly crawler for the website in cell A1 with a limit of 500 pages, retrieve all crawled contents, and write each page's URL and title into columns A and B of a new sheet called Crawl Results

What You Get

A new sheet called Crawl Results with one row per discovered page
Column A contains the full URL of each crawled page
Column B contains the text of each page's title tag, exactly as found on that page
Pages that returned errors during crawling are included with an error note rather than silently skipped
The crawl respects robots.txt and rate limits through Scrapfly's built-in settings — no manual throttling needed
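
If it helps to picture the shape of that output, here is a minimal Python sketch of how crawl records could map to two-column rows. The record fields (url, status, title) and the wording of the error note are assumptions made for illustration; they are not Scrapfly's or SheetXAI's actual format.

```python
# Hypothetical crawl records; the field names are illustrative assumptions.
pages = [
    {"url": "https://example.com/", "status": 200, "title": "Home"},
    {"url": "https://example.com/old-page", "status": 404, "title": None},
]

def to_rows(pages):
    """Build one (URL, title) row per page, keeping failed pages visible."""
    rows = []
    for page in pages:
        if page["status"] >= 400:
            # Assumed note format: the failure is recorded instead of dropped.
            rows.append((page["url"], f"ERROR: HTTP {page['status']}"))
        else:
            rows.append((page["url"], page["title"] or ""))
    return rows

rows = to_rows(pages)
```

Keeping failed pages in the output means a missing row always signals an undiscovered page, never a hidden error.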

What If the Data Is Not Quite Ready

You only want pages from a specific section of the site

The full 500-page crawl may include legal pages, tag archives, and author profiles you do not care about. Scope it before writing:

Type this prompt

Create a Scrapfly crawler for the site in cell A1, limit to 500 pages, then write only the URLs that contain /blog/ or /resources/ in the path into columns A and B of a new sheet called Crawl Results
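
For readers who want to see what that path filter amounts to, the following is a rough Python sketch using only the standard library. The sample rows and the in_scope helper are hypothetical, and this is not necessarily how SheetXAI applies the filter internally.

```python
from urllib.parse import urlparse

# Hypothetical crawl output as (url, title) pairs, made up for illustration.
crawled = [
    ("https://example.com/blog/seo-basics", "SEO Basics"),
    ("https://example.com/legal/privacy", "Privacy Policy"),
    ("https://example.com/resources/templates", "Templates"),
    ("https://example.com/about?ref=/blog/", "About"),
]

KEEP_SEGMENTS = ("/blog/", "/resources/")

def in_scope(url: str) -> bool:
    """True when the URL path (not the query string) contains a kept segment."""
    path = urlparse(url).path
    # Add a trailing slash so a bare "/blog" still matches "/blog/".
    if not path.endswith("/"):
        path += "/"
    return any(segment in path for segment in KEEP_SEGMENTS)

scoped = [(url, title) for url, title in crawled if in_scope(url)]
```

Matching on the parsed path rather than the raw string avoids false positives such as the /about page above, where /blog/ only appears in a query parameter.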

You want to capture the H1 heading in addition to the page title

Title tags and H1s sometimes differ significantly. If you want both for content analysis:

Type this prompt

Crawl the site in cell A1 using Scrapfly with a 500-page limit, extract the page title and the H1 heading from each discovered page, and write URL, title, and H1 into columns A, B, and C of a new sheet called Crawl Results
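
To make the title versus H1 distinction concrete, here is a small Python sketch that pulls both from a page's HTML with the standard library parser. It assumes you already have the HTML as a string; the TitleH1Parser class and the sample markup are illustrative, not part of SheetXAI or Scrapfly.

```python
from html.parser import HTMLParser

class TitleH1Parser(HTMLParser):
    """Collect the text of the first <title> and the first <h1> in a document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = ""
        self._current = None   # tag currently being captured, if any
        self._seen = set()     # tags already captured once

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1") and tag not in self._seen:
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._seen.add(tag)
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data

def extract_title_h1(html: str):
    parser = TitleH1Parser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the values fit cleanly in a cell.
    return " ".join(parser.title.split()), " ".join(parser.h1.split())

sample = (
    "<html><head><title>Pricing | Acme</title></head>"
    "<body><h1>Simple <em>pricing</em> plans</h1><h1>Second</h1></body></html>"
)
title, h1 = extract_title_h1(sample)
```

In this sample the title tag carries the brand suffix while the H1 carries the on-page headline, which is the kind of difference worth capturing in separate columns.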

You want to join the crawl results against your own content inventory

Your existing content lives in a tab called Our Content with URLs in column A. After the crawl, you need to see what the competitor covers that you do not:

Type this prompt

Crawl the site in cell A1 with Scrapfly, write all URLs and titles into Crawl Results, then in column C note whether each URL's topic appears to overlap with the topic of any URL in the Our Content tab
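
Topic overlap is a judgment call, and an AI agent may weigh it differently than any fixed rule. As a rough point of comparison, here is a deliberately crude Python heuristic that compares the words in URL slugs. The stopword list, the 0.5 threshold, and the note wording are all assumptions chosen for illustration.

```python
import re
from urllib.parse import urlparse

# Words that say little about topic; an assumed, illustrative list.
STOPWORDS = {"the", "a", "an", "to", "for", "of", "and", "in", "how", "blog", "resources"}

def slug_tokens(url: str) -> set:
    """Lowercase words found in the URL path, minus stopwords."""
    path = urlparse(url).path.lower()
    return {word for word in re.findall(r"[a-z0-9]+", path) if word not in STOPWORDS}

def overlap_note(competitor_url: str, our_urls, threshold: float = 0.5) -> str:
    """Return a column C note based on the share of slug words in common."""
    theirs = slug_tokens(competitor_url)
    if not theirs:
        return "No overlap"
    for ours in our_urls:
        shared = theirs & slug_tokens(ours)
        if len(shared) / len(theirs) >= threshold:
            return f"Overlaps with {ours}"
    return "No overlap"

our_content = ["https://ours.example/blog/email-marketing-guide"]
note_a = overlap_note("https://rival.example/resources/email-marketing-tips", our_content)
note_b = overlap_note("https://rival.example/blog/podcast-equipment", our_content)
```

A slug-word comparison like this misses synonyms and rephrasings, so treat it as a sanity check on the agent's notes rather than a replacement for them.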

You want the crawl data cleaned, categorized, and ready for a content gap report in one shot

Type this prompt

Crawl the site in cell A1 using Scrapfly with a 500-page limit, write all URLs and titles into a sheet called Crawl Results, remove any URLs containing /tag/, /author/, or /page/, then in column C categorize each remaining URL as Blog, Guide, Case Study, or Other based on the URL structure and title

One ask, one structured output — the crawl, the filter, and the categorization run together.
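
If you want to spot-check the result, the filter and categorization can be approximated with a few explicit rules. The Python sketch below is one possible rule set; the path patterns and their priority order are assumptions, and the agent's own categorization may differ because it also reads the titles in context.

```python
from urllib.parse import urlparse

EXCLUDED_SEGMENTS = ("/tag/", "/author/", "/page/")

def categorize(url: str, title: str) -> str:
    """Assign Blog, Guide, Case Study, or Other from the URL path and title."""
    path = urlparse(url).path.lower()
    text = title.lower()
    # Most specific patterns first, so a case study under /blog/ is not labeled Blog.
    if "/case-stud" in path or "case study" in text:
        return "Case Study"
    if "/guide" in path or "guide" in text:
        return "Guide"
    if "/blog/" in path:
        return "Blog"
    return "Other"

def clean_and_categorize(rows):
    """Drop archive-style URLs, then append a category to each remaining row."""
    result = []
    for url, title in rows:
        path = urlparse(url).path
        if any(segment in path for segment in EXCLUDED_SEGMENTS):
            continue
        result.append((url, title, categorize(url, title)))
    return result

# Hypothetical crawl rows, made up for illustration.
rows = [
    ("https://example.com/blog/launch-notes", "Launch Notes"),
    ("https://example.com/tag/seo/", "SEO"),
    ("https://example.com/guides/onboarding", "Onboarding Guide"),
    ("https://example.com/customers/case-studies/acme", "How Acme Grew"),
    ("https://example.com/blog/page/2/", "Blog - Page 2"),
    ("https://example.com/pricing", "Pricing"),
]
cleaned = clean_and_categorize(rows)
```

Running the same rules over a sample of the sheet is a quick way to find rows where the agent's category and the URL structure disagree.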

Try It

Get the 7-day free trial of SheetXAI, put a competitor domain in cell A1 of any Google Sheet, and ask it to run a Scrapfly crawl and populate a new tab with all discovered URLs and titles. If you also need to audit HTTP status codes from the crawl, see the companion guide on exporting crawled URLs with status codes.

Stop memorizing formulas.
Tell your spreadsheet what to do.