Exa · Google Sheets Guide

Find Semantically Similar Pages for a URL List in a Google Sheet Using Exa

2026-05-14

The Scenario

You manage SEO for a B2B software company and this week's project is link-building outreach. You have a Google Sheet with 20 of your top-performing content URLs — articles that rank well and attract backlinks — and the task is to find 5 semantically similar competitor pages for each one. Those similar pages become your outreach targets.

You've done this manually before. It took most of a Thursday.

The bad version:

  • Take URL 1, paste it into a search engine with some semantic framing ("pages like this article on..."), sift through the results to find 5 that are genuinely similar in topic and format rather than just keyword-adjacent
  • Open each of the 5 candidates to verify they're real content pages and not category pages or homepages
  • Copy the URL and title for each, paste them into the sheet across columns B through K
  • Do this 19 more times, varying your search framing each time because the same query pattern stops returning useful results after the third or fourth iteration

By the time you've worked through all 20 seed URLs, you've spent more time on discovery than on actual outreach. And "semantically similar" doesn't map cleanly onto standard search queries: keyword search tends to return pages that share terms with your article rather than pages that cover the same topic in the same way.

The Easy Way: One Prompt in SheetXAI

SheetXAI is an AI agent that lives inside your Google Sheet. It reads the sheet and understands your data layout. Through its built-in Exa integration, it runs a semantic similarity search for each URL in your list and writes the discovered pages into the adjacent columns.

For each URL in column A of my sheet, find 5 semantically similar web pages via Exa and write the similar URLs and their page titles into columns B through K

What You Get

  • Columns B, D, F, H, and J receive the 5 discovered similar URLs for each seed
  • Columns C, E, G, I, and K receive the page title for each discovered URL
  • Results are genuinely semantically similar — not just keyword-matched — because Exa uses neural embeddings rather than keyword search
  • Rows where Exa returned fewer than 5 similar pages are noted so you can adjust your seed URL or broaden the query
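
For readers who want to see the column layout spelled out, here is a minimal sketch of how five (URL, title) pairs map onto columns B through K. The function name `layout_row` and the "short" flag are hypothetical and only illustrate the layout described above. This is not SheetXAI's actual implementation.

```python
from typing import List, Tuple

SLOTS = 5  # five similar pages per seed, two cells each (URL, title)


def layout_row(results: List[Tuple[str, str]], slots: int = SLOTS) -> Tuple[List[str], bool]:
    """Interleave (url, title) pairs into the cells for columns B..K.

    Returns the cell values plus a flag that is True when fewer than
    `slots` results were available for this seed URL.
    """
    cells: List[str] = []
    for url, title in results[:slots]:
        cells.extend([url, title])  # URL goes in B/D/F/H/J, title in C/E/G/I/K
    short = len(results) < slots
    cells.extend([""] * (2 * slots - len(cells)))  # pad unused slots with blanks
    return cells, short
```

The flag gives you one place to decide how a short row should be noted, for example in a tracking column of your choosing.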

What If the Data Is Not Quite Ready

Some seed URLs in column A are thin pages or redirects

A few of the 20 URLs are older articles that got redirected to category pages, so Exa has nothing useful to match against.

For each URL in column A, check whether Exa returns any similarity results — if it does, write the top 5 similar URLs and titles to columns B through K; if it returns zero results or an error, write "No results" in column B and leave the rest of the row blank
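
The fallback rule in that prompt can be sketched as follows. The `fetch_similar` callable is a hypothetical stand-in for whatever performs the Exa lookup, and it is assumed to return a list of (URL, title) pairs. Treating errors the same as empty results mirrors the prompt, not any documented SheetXAI behavior.

```python
from typing import Callable, List, Tuple

Result = Tuple[str, str]  # (url, title)


def row_for_seed(seed_url: str, fetch_similar: Callable[[str], List[Result]]) -> List[str]:
    """Return the 10 cells for columns B..K, or "No results" in B on failure."""
    no_results_row = ["No results"] + [""] * 9
    try:
        results = fetch_similar(seed_url)
    except Exception:
        # Any lookup failure is treated the same as an empty result set.
        return no_results_row
    if not results:
        return no_results_row
    cells = [value for pair in results[:5] for value in pair]
    return cells + [""] * (10 - len(cells))
```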

You want to deduplicate discovered pages across seed URLs

The same competitor article might show up as a similar page for multiple seed URLs. You want each discovered URL to appear only once across your outreach list.

For each URL in column A, find 5 semantically similar pages via Exa and write the results to columns B through K, then check for any URL that appears in multiple rows and mark duplicates with "Dup" in a new tracking column
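
One way to read "mark duplicates" is to flag a row when any of its discovered URLs already appeared in an earlier row. The sketch below takes that reading. The `mark_duplicates` name is hypothetical, and the normalization (trimming whitespace, dropping a trailing slash, lowercasing) is a simplifying assumption that can conflate URLs whose paths differ only by case.

```python
from typing import List, Set


def mark_duplicates(rows: List[List[str]]) -> List[str]:
    """Return one tracking-column value per row: "Dup" or an empty string.

    `rows` holds the discovered URLs for each seed row, in sheet order.
    A row is flagged when it repeats a URL found in any earlier row.
    """
    seen: Set[str] = set()
    flags: List[str] = []
    for urls in rows:
        keys = {u.strip().rstrip("/").lower() for u in urls if u.strip()}
        flags.append("Dup" if keys & seen else "")
        seen |= keys
    return flags
```

The first row that surfaces a URL stays unflagged, so filtering out rows marked "Dup" leaves each discovered page in the list once.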

You want to filter out results from your own domain

Exa might return pages from your own site as similar results. Those aren't outreach targets.

For each URL in column A, find 5 semantically similar pages via Exa, filter out any result from our domain (yourdomain.com), and write the remaining similar URLs and titles to columns B through K
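
The own-domain check is worth getting right, because a plain substring match on "yourdomain.com" would also drop an unrelated site such as notyourdomain.com. A minimal sketch using Python's standard `urllib.parse` is shown below. The function names are hypothetical, and `yourdomain.com` is the placeholder from the prompt.

```python
from typing import List, Tuple
from urllib.parse import urlsplit


def is_own_domain(url: str, own_domain: str = "yourdomain.com") -> bool:
    """True when the URL's host is the domain itself or one of its subdomains."""
    # URLs without a scheme have no parsed hostname and are treated as external.
    host = (urlsplit(url).hostname or "").lower()
    own = own_domain.lower()
    return host == own or host.endswith("." + own)


def drop_own_domain(results: List[Tuple[str, str]], own_domain: str = "yourdomain.com") -> List[Tuple[str, str]]:
    """Keep only (url, title) pairs that are not hosted on our own domain."""
    return [(url, title) for url, title in results if not is_own_domain(url, own_domain)]
```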

Full pipeline: fetch, deduplicate, filter own domain, score by similarity

For each URL in column A, find semantically similar pages via Exa, exclude any from yourdomain.com, remove any URL that appears in a previous row's results, and write the top 5 remaining URLs and titles to columns B through K along with Exa's similarity score in column L
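
The order of operations in that prompt matters: filtering and deduplicating both remove candidates, so the lookup needs to return more than 5 candidates if 5 are to survive. The sketch below shows one possible ordering under several assumptions: `fetch_similar` is a hypothetical callable returning (URL, title, score) tuples sorted best first, and column L is filled with the kept scores as a comma-separated string, which is only one reading of the prompt.

```python
from typing import Callable, List, Set, Tuple
from urllib.parse import urlsplit

Hit = Tuple[str, str, float]  # (url, title, similarity score)


def build_rows(
    seeds: List[str],
    fetch_similar: Callable[[str], List[Hit]],
    own_domain: str = "yourdomain.com",
    keep: int = 5,
) -> List[List[str]]:
    """Return one 11-cell row (columns B..L) per seed URL."""
    seen: Set[str] = set()  # URLs already written for an earlier seed
    rows: List[List[str]] = []
    for seed in seeds:
        kept: List[Hit] = []
        for url, title, score in fetch_similar(seed):
            host = (urlsplit(url).hostname or "").lower()
            key = url.rstrip("/").lower()
            if host == own_domain or host.endswith("." + own_domain):
                continue  # our own site is not an outreach target
            if key in seen:
                continue  # already listed for this or a previous seed
            seen.add(key)
            kept.append((url, title, score))
            if len(kept) == keep:
                break
        cells = [value for url, title, _ in kept for value in (url, title)]
        cells += [""] * (2 * keep - len(cells))  # pad B..K when fewer survive
        scores = ", ".join(f"{score:.2f}" for _, _, score in kept)
        rows.append(cells + [scores])  # scores land in column L
    return rows
```

Because earlier rows claim a URL first, the order of the seed URLs in column A decides which row keeps a shared competitor page.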

One prompt handles the entire discovery pipeline.

Try It

Get the 7-day free trial of SheetXAI and open any Google Sheet with your top content URLs, then ask it to find semantically similar competitor pages for each one via Exa. For related work, see how to extract full page content for analysis or pull a news digest for your niche.
