The Scenario
It's the third time this month an SEO manager has been asked the same question by the content team: "Does our competitor have a page on [topic]?" Every time, the answer requires someone to manually browse through the competitor's site, check a few top-level categories, and guess.
There's a sheet called "Competitor Content Audit" with a seed URL in cell A1 and a note that says "we need every page on this site." The site has somewhere between 150 and 300 pages. Nobody knows the actual number because nobody has ever crawled it.
The bad version:
- Start at the homepage, click every visible link, manually copy each URL into the sheet
- Realize after 40 minutes that the blog has pagination and you haven't touched it yet
- Get to 120 URLs, accept that you probably missed half the site, and submit an incomplete audit because the content calendar meeting is in two hours
The content gap analysis is supposed to inform next quarter's editorial plan. An incomplete crawl produces an incomplete gap analysis, which produces a wrong editorial plan. The problem compounds.
The Easy Way: One Prompt in SheetXAI
SheetXAI is an AI agent built into your Google Sheet. It reads what's in the sheet and uses Hyperbrowser's site crawl capability to systematically walk every page of a site starting from a seed URL — then writes the full inventory back into your sheet.
Start a Hyperbrowser web crawl from the URL in cell A1 of the "Competitor Content Audit" tab. When the crawl is complete, populate the sheet starting from row 2 with each discovered page URL in column A, the page title in column B, the meta description in column C, and the HTTP status code in column D. Sort by status code, 200s first.
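For readers curious what "walking every page from a seed URL" involves, here is a minimal sketch of a breadth-first, same-host crawl. It is not Hyperbrowser's implementation, which this article does not describe; the `crawl` function, the `get_links` callback, and the `max_pages` cap are all hypothetical names chosen for illustration, and a small in-memory `site` dictionary stands in for real HTTP fetching.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def crawl(seed_url, get_links, max_pages=500):
    """Breadth-first walk of pages on the seed URL's host.

    get_links(url) is a caller-supplied (hypothetical) function that
    returns the href values found on that page.
    """
    host = urlparse(seed_url).netloc
    seen = {seed_url}
    queue = deque([seed_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in get_links(url):
            # Resolve relative links and drop "#fragment" suffixes
            absolute, _fragment = urldefrag(urljoin(url, href))
            # Stay on the same host and never queue a URL twice
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Illustrative stand-in for a real site: page URL -> links on that page
site = {
    "https://example.com/": ["/blog/", "/about", "https://other.com/x"],
    "https://example.com/blog/": ["/blog/page/2", "/"],
    "https://example.com/blog/page/2": ["/blog/post-a#comments"],
}
pages = crawl("https://example.com/", lambda url: site.get(url, []))
```

Because every discovered link is queued, paginated sections such as `/blog/page/2` are reached without anyone having to remember they exist, which is the step the manual approach tends to miss.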
What You Get
- Column A: every discovered page URL on the site
- Column B: the page title as it appears in the HTML <title> tag
- Column C: the meta description if one is set, blank if not
- Column D: HTTP status code — lets you immediately spot 301s, 404s, and anything returning an error
- Rows sorted 200s first so the valid content is at the top
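The "200s first" ordering can be pictured as a two-part sort key. The sketch below is only an illustration of that ordering, assuming the four columns map to a hypothetical `PageRow` tuple and that status codes are stored as integers; it is not how SheetXAI sorts internally.

```python
from typing import NamedTuple


class PageRow(NamedTuple):
    url: str          # column A
    title: str        # column B
    description: str  # column C
    status: int       # column D


def sort_200s_first(rows):
    # False sorts before True, so status-200 rows come first;
    # everything else follows in ascending status-code order.
    return sorted(rows, key=lambda row: (row.status != 200, row.status))


rows = [
    PageRow("https://example.com/old", "", "", 404),
    PageRow("https://example.com/", "Home", "Welcome", 200),
    PageRow("https://example.com/moved", "", "", 301),
    PageRow("https://example.com/blog/", "Blog", "", 200),
]
sorted_rows = sort_200s_first(rows)
```

Python's sort is stable, so pages with the same status code keep the order in which the crawl found them.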
What If the Data Is Not Quite Ready
The crawl found 280 URLs but the team only cares about blog posts — the rest is noise
In the "Competitor Content Audit" tab, filter column A to rows where the URL contains "/blog/" and copy those rows to a new tab called "Blog Inventory." Leave the full crawl on the original tab.
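As a rough picture of what that filter does, the sketch below selects blog rows while leaving the full list untouched. The function names are hypothetical, and one choice here is an assumption that goes slightly beyond the prompt: it matches "/blog/" against the URL path only, so a query string that happens to contain "/blog/" is not counted.

```python
from urllib.parse import urlparse


def is_blog_url(url):
    # Check the path only, not the query string or fragment
    return "/blog/" in urlparse(url).path


def blog_inventory(rows):
    # rows: lists of [url, title, description, status]; the input is not modified
    return [row for row in rows if is_blog_url(row[0])]


crawl_rows = [
    ["https://example.com/", "Home", "", 200],
    ["https://example.com/blog/seo-basics", "SEO Basics", "", 200],
    ["https://example.com/search?from=/blog/", "Search", "", 200],
    ["https://example.com/blog/page/2", "Blog, page 2", "", 200],
]
blog_rows = blog_inventory(crawl_rows)
```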
Some pages in column B have empty titles — they weren't set in the HTML
For each row in "Competitor Content Audit" where column B is blank, note "no title tag" in column E and flag the row in column F so the team can investigate.
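The check behind this prompt is simple enough to sketch. In the illustration below, the `flag_missing_titles` name and the "FLAG" marker written to column F are assumptions; the prompt only says to flag the row, so the actual marker may differ.

```python
def flag_missing_titles(rows):
    """Return rows extended with column E (note) and column F (flag)."""
    flagged = []
    for url, title, description, status in rows:
        # Treat whitespace-only titles as blank too
        missing = not title.strip()
        note = "no title tag" if missing else ""
        flag = "FLAG" if missing else ""  # marker text is an assumption
        flagged.append([url, title, description, status, note, flag])
    return flagged


audit_rows = [
    ["https://example.com/", "Home", "Welcome", 200],
    ["https://example.com/landing", "", "", 200],
    ["https://example.com/promo", "   ", "Spring sale", 200],
]
flagged_rows = flag_missing_titles(audit_rows)
```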
The team wants to know which discovered URLs don't have a corresponding page in our own site structure
Compare the page URL slugs in column A of "Competitor Content Audit" against the slug list in column A of the "Our Pages" tab. For any competitor slug with no match in our list, write "content gap" in column G.
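One way to think about the slug comparison is as a set lookup. The sketch below assumes a "slug" means the last path segment of a URL, lowercased, and that the "Our Pages" tab holds slugs in the same form; both are assumptions, and the `slug` and `content_gaps` helpers are hypothetical names for illustration only.

```python
from urllib.parse import urlparse


def slug(url_or_path):
    # Last non-empty path segment, lowercased; "" for a bare homepage
    segments = [part for part in urlparse(url_or_path).path.split("/") if part]
    return segments[-1].lower() if segments else ""


def content_gaps(competitor_urls, our_slugs):
    """Return the column G value for each competitor URL."""
    ours = {slug(entry) for entry in our_slugs}
    labels = []
    for url in competitor_urls:
        candidate = slug(url)
        # The homepage has no slug, so it is never reported as a gap
        is_gap = bool(candidate) and candidate not in ours
        labels.append("content gap" if is_gap else "")
    return labels


competitor_urls = [
    "https://example.com/blog/seo-basics/",
    "https://example.com/guides/link-building",
    "https://example.com/",
]
our_slugs = ["seo-basics", "/about/"]
gap_labels = content_gaps(competitor_urls, our_slugs)
```

Exact slug matching is strict: a competitor page at `link-building` and one of ours at `link-building-guide` would still count as a gap, so the flagged rows are a starting list to review rather than a final verdict.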
The content director wants the crawl results filtered to only informational pages, with a topic category assigned to each one based on the URL and title
In "Competitor Content Audit," for each row where column D is 200 and the URL does not contain "/product/" or "/pricing/" or "/login/", assign a topic category to column H based on the URL path and page title. Use categories like: "How-To," "Case Study," "Comparison," "News," "Glossary," or "Other."
Asking for the crawl, the filter, and the categorization in one prompt means the sheet goes from seed URL to editorial intelligence in a single operation.
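To make the filter and categorization step concrete, here is a deliberately simple keyword-based sketch. It is a stand-in, not how SheetXAI assigns categories: the agent reads the URL and title and uses its own judgment, whereas the keyword lists below are invented for illustration and would need tuning for any real site.

```python
from urllib.parse import urlparse

EXCLUDED_SEGMENTS = ("/product/", "/pricing/", "/login/")

# Checked in order; the first category with a matching keyword wins.
# These keyword lists are illustrative assumptions.
CATEGORY_KEYWORDS = [
    ("How-To", ("how-to", "how to", "guide", "tutorial")),
    ("Case Study", ("case-study", "case study", "customer-story")),
    ("Comparison", ("-vs-", " vs ", "alternative", "compare")),
    ("News", ("/news/", "announce", "press")),
    ("Glossary", ("glossary", "what-is", "what is")),
]


def is_informational(url, status):
    # Mirrors the prompt: status 200 and none of the excluded path segments
    return status == 200 and not any(seg in url for seg in EXCLUDED_SEGMENTS)


def categorize(url, title):
    text = f"{urlparse(url).path} {title}".lower()
    for category, keywords in CATEGORY_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"


def column_h(url, title, status):
    # Blank for rows the prompt says to skip
    return categorize(url, title) if is_informational(url, status) else ""
```

For example, `column_h("https://example.com/blog/acme-vs-rival", "Acme vs Rival", 200)` yields "Comparison" under these rules, while any URL under `/pricing/` is left blank.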
Try It
Get the 7-day free trial of SheetXAI and open a sheet with a competitor's homepage URL in cell A1, then ask it to crawl the full site using Hyperbrowser and populate a content inventory. Also see running batch web searches from your sheet or the Hyperbrowser overview.
