The Scenario
You are a growth analyst on a SaaS pricing intelligence project. The research brief landed in your inbox on Wednesday: pull the pricing page content from 60 competitor SaaS products by Friday for a pricing strategy deck going to the executive team Monday morning.
The problem is that most SaaS pricing pages are JavaScript-rendered single-page applications. A standard HTTP request returns an empty HTML shell. The actual pricing tiers, feature limits, and call-to-action text only appear after the JavaScript executes client-side. Any scraping tool that does not execute that JavaScript returns only the shell, which is useless for pricing analysis.
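One way to see the problem is to measure how much visible text a response contains once scripts and styles are ignored. The sketch below is a rough heuristic, not a definitive test: the helper names and the 200-character threshold are assumptions chosen for illustration, and it uses only the Python standard library.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes, ignoring anything inside script/style/noscript."""

    SKIPPED = ("script", "style", "noscript")

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the human-visible text of an HTML document as one string."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join(collector.chunks)


def looks_like_empty_shell(html, min_chars=200):
    """Heuristic: very little visible text suggests an unrendered SPA shell.

    min_chars is an arbitrary, illustrative threshold; tune it per site.
    """
    return len(visible_text(html)) < min_chars
```

A response that is almost entirely script tags plus an empty root element would be flagged by this check, while a fully rendered pricing page with plan names and prices in the markup would not.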
The bad version:
- Attempt to scrape each URL with a basic request, get back empty pages
- Realize you need headless browser rendering and set up a Puppeteer script locally
- Run the script on URL 1 and it works. Run it on URLs 5 through 10 and three of them return Cloudflare blocks
- Spend Thursday afternoon debugging anti-bot detection instead of analyzing pricing tiers
Friday arrives. You have 22 URLs with usable content.
The Easy Way: One Prompt in SheetXAI
SheetXAI is an AI agent that lives inside your Google Sheet. It reads the URL list and, through Scrape.do's headless browser mode with residential proxy rotation, fetches the fully client-side-rendered HTML for each page — including content that only appears after JavaScript execution — and writes the results back into the sheet.
For each URL in column A, scrape the fully rendered page using Scrape.do with render=true and write the rendered HTML into column B.
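For readers curious what a rendered request looks like outside the sheet, here is a minimal Python sketch of the URL such a call might use. It assumes Scrape.do's query-string API at api.scrape.do with token, url, and render parameters, and that super=true selects residential proxies; confirm both against the Scrape.do documentation before relying on them. The function name build_scrape_url and the placeholder token are illustrative, and this is not a description of how SheetXAI makes the call internally.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the Scrape.do documentation.
SCRAPE_DO_ENDPOINT = "https://api.scrape.do/"


def build_scrape_url(target_url, token, render=True, residential=False):
    """Build the request URL for one target page (hypothetical helper)."""
    params = {"token": token, "url": target_url}
    if render:
        # Ask for headless-browser rendering so client-side content is included.
        params["render"] = "true"
    if residential:
        # Assumption: "super" is the flag that enables residential proxies.
        params["super"] = "true"
    # urlencode percent-encodes the target URL so its own query string survives.
    return SCRAPE_DO_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_scrape_url("https://example.com/pricing", "YOUR_TOKEN"))
```

The resulting URL can be fetched with any HTTP client. The response body would then be the rendered HTML that the prompt above writes into column B.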
What You Get
- Column B contains the fully rendered HTML for each URL, including content loaded by JavaScript
- Pages blocked by bot detection are handled by Scrape.do's proxy rotation — you see the actual page content, not a challenge screen
- Rows where rendering fails or times out write a status label into column B so you know which to retry
- Empty rows in column A are skipped
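The row-level behavior in the list above can be sketched in a few lines of Python. This is an illustration under assumptions, not SheetXAI's implementation: fill_column_b is a hypothetical helper, fetch stands in for whatever function performs the rendered scrape, and the labels "TIMEOUT" and "RENDER_FAILED" are invented examples of a status label.

```python
def fill_column_b(column_a, fetch):
    """Return one column-B value per column-A cell.

    column_a: list of cell values (strings or None).
    fetch: callable taking a URL and returning rendered HTML; it is expected
           to raise TimeoutError on timeouts and some other exception on
           any other failure.
    """
    results = []
    for cell in column_a:
        url = (cell or "").strip()
        if not url:
            # Empty rows are skipped: nothing is fetched, nothing is written.
            results.append("")
            continue
        try:
            results.append(fetch(url))
        except TimeoutError:
            results.append("TIMEOUT")  # hypothetical status label
        except Exception:
            results.append("RENDER_FAILED")  # hypothetical status label
    return results
```

Keeping one output per input row, including blanks, means the results line up with column A and the labeled rows are easy to filter for a retry pass.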
What If the Data Is Not Quite Ready
Some URLs need rendering and some do not — you have a flag in column C
For each URL in column A, check column C. If column C says "render", scrape using Scrape.do with render=true. Otherwise, scrape without rendering. Write the HTML response into column B for all rows.
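The per-row decision in that prompt amounts to a small lookup. A minimal Python sketch follows, assuming the flag comparison should ignore case and surrounding whitespace (the prompt itself only says the cell contains "render"); wants_render and plan_requests are hypothetical names.

```python
def wants_render(flag_cell):
    """True when the column-C cell asks for rendering.

    Assumption: matching is case-insensitive and ignores padding.
    """
    return (flag_cell or "").strip().lower() == "render"


def plan_requests(rows):
    """Turn (url, flag) rows into (url, render) pairs, skipping blank URLs."""
    plan = []
    for url, flag in rows:
        url = (url or "").strip()
        if url:
            plan.append((url, wants_render(flag)))
    return plan
```

Rendering only the flagged rows can matter because rendered requests are typically slower than plain ones, so static pages gain nothing from the extra step.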
You want to extract specific content rather than raw HTML
For each URL in column A, scrape the fully rendered page using Scrape.do with render=true. Extract the pricing plan names, monthly prices, and the first listed feature for each tier. Write the extracted data into columns B, C, and D.
URLs have duplicates that should not be scraped twice
Deduplicate column A before scraping — keep the first occurrence of each URL. Then for each unique URL, scrape the fully rendered page using Scrape.do with render=true and write the HTML into column B.
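Keeping the first occurrence of each URL is an order-preserving dedupe. The Python sketch below shows one way to do it, assuming two URLs count as duplicates only when they match exactly after trimming whitespace; treating a trailing slash or different letter case as the same URL would need extra normalization that is not shown. dedupe_keep_first is a hypothetical name.

```python
def dedupe_keep_first(urls):
    """Return unique, non-empty URLs in their original order."""
    seen = set()
    unique = []
    for raw in urls:
        url = (raw or "").strip()
        if url and url not in seen:
            seen.add(url)
            unique.append(url)
    return unique
```

For example, a list containing the same pricing URL on rows 2 and 9 would keep only the row 2 entry, so the page is scraped once.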
Full pricing intelligence pipeline in one prompt
Deduplicate and clean column A. For each unique URL, scrape the fully rendered page using Scrape.do with render=true. Extract pricing plan names and their monthly prices. Write extracted plan names into column B and monthly prices into column C. In column D, write the number of distinct pricing tiers found on that page. Flag any pages where no pricing data was detected by writing "No pricing found" in column B.
Dedup, render-mode scraping, structured extraction, and anomaly flagging — one ask.
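The last two steps of that pipeline, counting distinct tiers and flagging pages with no pricing, can be sketched as a small Python function. It assumes extraction has already produced (plan name, monthly price) pairs for a page, that tiers are distinct when their names differ ignoring case, and that multiple values are joined with commas inside one cell; all three are illustrative choices, and summarize_pricing is a hypothetical name.

```python
def summarize_pricing(plans):
    """Build the [column B, column C, column D] values for one page.

    plans: list of (plan_name, monthly_price) string pairs.
    """
    tiers = {}
    for name, price in plans:
        key = name.strip().lower()
        # Keep the first price seen for each distinct plan name.
        if key and key not in tiers:
            tiers[key] = (name.strip(), price.strip())
    if not tiers:
        # Anomaly flag from the prompt: no pricing data detected.
        return ["No pricing found", "", 0]
    names = ", ".join(name for name, _ in tiers.values())
    prices = ", ".join(price for _, price in tiers.values())
    return [names, prices, len(tiers)]
```

Filtering column B for "No pricing found" then gives the list of pages to review by hand.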
Try It
Get the 7-day free trial of SheetXAI, open the Google Sheet with your SaaS competitor URL list, and ask it to scrape the fully rendered pages and populate the content column. See also the guide on scraping geo-targeted pages, or the overview of all Scrape.do workflows.
