Firecrawl · Google Sheets Guide

Batch Scrape a Prospect Directory Into Google Sheets

2026-05-14
5 min read

The Scenario

You've been handed 120 company profile URLs from a niche B2B directory — a list of potential partners your VP wants enriched before next week's outreach push. Column A has the URLs. Columns B through E are supposed to have company name, employee count, founding year, and a contact email pulled from each page.

The directory doesn't export data. The profiles are HTML pages, not an API.

The bad version:

  • Open each company profile, read the page, type the employee count into column C, look for a contact email in the footer
  • Hit profile 15 and find the employee count listed as a range ("50-200") instead of a number — decide how to handle that and write a note to yourself
  • Finish 30 profiles on Monday, realize you have 90 left and a Tuesday deadline, and start wondering if any of this is worth your time

Your job title is business development. This is not business development. This is data entry at scale, and it is eating the week you needed for actual outreach.

The Easy Way: One Prompt in SheetXAI

SheetXAI is an AI agent that lives inside your Google Sheet. It reads your data and, through its built-in Firecrawl integration, can batch-scrape all 120 URLs and write the fields you need into the designated columns, with no manual page reading, tab switching, or copy-paste. You specify the fields; it handles the extraction.

Batch-scrape all 120 company profile URLs in column A of my "Prospect List" sheet. Extract company name into column B, employee count into column C, founding year into column D, and contact email into column E. For employee counts listed as a range, write the midpoint as an integer. Flag any URL that returned an error or had no contact email in column F.

What You Get

  • Column B with the company name as it appears on the profile
  • Column C with employee count as an integer — ranges resolved to their midpoint, so the column stays sortable
  • Column D with founding year as a four-digit number
  • Column E with a contact email if one appeared on the page — pulled from footer, "Contact" sections, or visible mailto links
  • Column F flagging any profile that returned an error or had no contactable email, so the outreach team knows which rows need manual follow-up
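If you want to spot-check column C by hand, the midpoint rule is easy to reproduce. Here is a minimal Python sketch of that rule, not SheetXAI's actual implementation: the function name `employee_midpoint` is hypothetical, and rounding the midpoint down is an assumption the prompt does not spell out.

```python
import re
from typing import Optional


def employee_midpoint(raw: str) -> Optional[int]:
    """Turn text like '120', '1,200', or '50-200' into one integer headcount."""
    # Drop thousands separators, then collect every run of digits.
    numbers = [int(n) for n in re.findall(r"\d+", raw.replace(",", ""))]
    if not numbers:
        return None  # nothing numeric found; leave the cell for manual review
    if len(numbers) == 1:
        return numbers[0]  # a plain count, not a range
    low, high = numbers[0], numbers[1]
    return (low + high) // 2  # assumption: round the midpoint down


print(employee_midpoint("50-200"))  # 125
```

Because every result is a plain integer (or empty), the column sorts numerically instead of alphabetically.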

What If the Data Is Not Quite Ready

Some profiles list multiple emails — a generic info@ address and a personal contact

For any profile where column E contains multiple email addresses, keep only the one that appears most likely to be a personal contact (not info@, hello@, or support@) and write it into column E. Write the generic address into column G as a fallback.
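The rule in that prompt can be sketched in a few lines of Python, which is handy if you want to check a handful of rows yourself. This is an illustration under stated assumptions, not how SheetXAI decides: `split_emails` and `GENERIC_PREFIXES` are hypothetical names, the example addresses are made up, and keeping a generic address in column E when it is the only one found is an assumption.

```python
# Mailbox names treated as generic, taken from the prompt above.
GENERIC_PREFIXES = {"info", "hello", "support"}


def split_emails(emails):
    """Return (primary, fallback) for columns E and G from a list of addresses."""
    def is_generic(address):
        # Compare only the part before the @, case-insensitively.
        return address.split("@")[0].lower() in GENERIC_PREFIXES

    personal = [e for e in emails if not is_generic(e)]
    generic = [e for e in emails if is_generic(e)]

    if personal:
        # A personal contact wins; the first generic address becomes the fallback.
        return personal[0], (generic[0] if generic else None)
    # Assumption: with no personal contact, keep the generic address as primary.
    return (generic[0] if generic else None), None


primary, fallback = split_emails(["info@acme.example", "jane.doe@acme.example"])
print(primary, fallback)  # jane.doe@acme.example info@acme.example
```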

The founding year is sometimes listed as "Founded in 2008" and sometimes as just "2008"

After scraping, normalize the founding year in column D so that all values are four-digit numbers with no surrounding text. If the year could not be extracted from the page, write "unknown" in column D.
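For reference, the normalization that prompt describes amounts to pulling the first plausible four-digit year out of the text. A minimal Python sketch, assuming years between 1800 and 2099 are the only valid ones (the function name `normalize_year` and that range are illustrative choices, not part of SheetXAI):

```python
import re


def normalize_year(raw: str) -> str:
    """Return a four-digit year found in the text, or 'unknown' if none is present."""
    # Match a standalone year from 1800 to 2099 (assumed plausible range).
    match = re.search(r"\b(1[89]\d{2}|20\d{2})\b", raw)
    return match.group(1) if match else "unknown"


print(normalize_year("Founded in 2008"))  # 2008
print(normalize_year("2008"))             # 2008
print(normalize_year("Est. '08"))         # unknown
```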

The prospect list has two tabs — "Tier 1" with 40 URLs and "Tier 2" with 80 URLs — both need the same fields extracted

Scrape all URLs in column A of the "Tier 1" sheet and write company name, employee count, founding year, and contact email into columns B through E. Then do the same for column A of the "Tier 2" sheet. Keep the results separate — don't merge the tabs.

The full enrichment pipeline: scrape, score by fit, flag missing emails, and sort by priority

Scrape all 120 URLs in column A. Extract company name into B, employee count into C, founding year into D, and contact email into E. Flag any row with no contact email in column F. Then score each company for outreach priority in column G: score 3 if employee count is between 50 and 500 and founding year is 2010 or later, score 2 if only one condition is met, score 1 otherwise. Sort the entire sheet by column G descending so the highest-priority prospects appear first.

One pass: scrape, enrich, score, and prioritize.
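The scoring rule in that prompt is simple enough to verify on a few rows. Here is a hedged Python sketch of it: `priority_score` is a hypothetical name, the sample rows are invented, "between 50 and 500" is read as inclusive, and a missing value is treated as not meeting its condition. Both of those readings are assumptions.

```python
def priority_score(employee_count, founding_year):
    """Score 3 if both conditions hold, 2 if exactly one holds, 1 otherwise."""
    # Assumption: the 50-500 range is inclusive, and a missing value fails the check.
    size_ok = employee_count is not None and 50 <= employee_count <= 500
    year_ok = founding_year is not None and founding_year >= 2010
    return 1 + int(size_ok) + int(year_ok)


# Invented sample rows standing in for scraped results.
rows = [
    {"company": "Example C", "employees": 900, "founded": 1998},
    {"company": "Example B", "employees": 30, "founded": 2012},
    {"company": "Example A", "employees": 125, "founded": 2015},
]

for row in rows:
    row["score"] = priority_score(row["employees"], row["founded"])

# Highest-priority prospects first, matching the sort on column G descending.
rows.sort(key=lambda r: r["score"], reverse=True)
print([(r["company"], r["score"]) for r in rows])
```

Python's sort is stable, so rows with the same score keep their original order, which is worth knowing if the sheet was already ordered by something meaningful.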

Try It

Get the 7-day free trial of SheetXAI and open your prospect list with the 120 directory URLs, then ask it to extract company name, employee count, founding year, and contact email for every row. For setup, see How to Connect Firecrawl to Google Sheets. Also see: Extract Job Listing Fields From URLs Into a Google Sheet.
