The Scenario
You're a market research analyst. Your team is building a pricing comparison matrix for fifty SaaS tools, and someone handed you an Excel workbook with fifty pricing page URLs in column A. You need the text content from each page — specifically the pricing tier names and the key feature bullets — extracted into a structured format so you can compare them side by side.
The catch: half these pages render their pricing content via JavaScript. A standard HTTP request returns an empty div. You need rendered HTML.
The bad version:
- Open each pricing page in Chrome, open DevTools, switch to the Elements panel, expand the pricing section, manually copy the relevant text.
- Switch back to the workbook, paste the text into column B, format it enough to be readable.
- On page fourteen, the component uses lazy loading and the DevTools panel shows the unrendered placeholder. You wait, refresh, and try again.
Fifty pages. Variable rendering behavior. No consistent HTML structure across different SaaS sites. You could write a Puppeteer script — or you could have done the analysis by now.
The Easy Way: One Prompt in SheetXAI
SheetXAI is an AI agent inside your Excel workbook. It reads your URL list, calls OpenGraph.io with JavaScript rendering enabled for each page, and writes the extracted content back into your workbook — without you configuring a headless browser.
For each URL in column A, use OpenGraph.io to scrape the full HTML with JavaScript rendering enabled and write the complete page text content (stripped of HTML tags) into column B
What You Get
- Column B fills with the rendered page text for each URL — all the visible copy, stripped of HTML tags so it's readable in the cell.
- Content that was gated behind JavaScript rendering (dynamic pricing tiers, feature lists loaded by React or Vue) is included, not blank.
- Any page that fails to render or returns an error can be flagged in a separate column so you know which URLs need a second look.
What If the Data Is Not Quite Ready
I only want the h1 and h2 tag text, not the full page content
Scrape the HTML of all URLs in column A using OpenGraph.io with JavaScript rendering enabled and extract the text inside all h1 and h2 tags into column B — put the full stripped page text into column C
Some of the pricing pages redirect to a login wall — I want to detect those and skip them
For each URL in column A, use OpenGraph.io to scrape with JavaScript rendering and write the page text into column B; if the returned content contains "sign in," "log in," or "create account" as the primary call to action, flag "LOGIN WALL" in column C instead
The workbook has a "Freemium" worksheet and a "Paid-Only" worksheet with different URL lists — I need content from both
For each URL in column A of the "Freemium" worksheet and column A of the "Paid-Only" worksheet, use OpenGraph.io to scrape with JavaScript rendering and write the stripped page text into column B of each respective worksheet
Clean up the URLs, scrape the rendered content, extract h1 and h2 text, and flag pages where the word "pricing" does not appear in any heading
For each URL in column A, normalize to https:// if needed, then scrape with OpenGraph.io with JavaScript rendering — extract the text from all h1 and h2 tags into column B and the full stripped content into column C; in column D, flag "NO PRICING HEADING" if the word "pricing" does not appear in any h1 or h2
One prompt handles URL normalization, rendering, content extraction, and the conditional check in a single pass.
Try It
Get the 7-day free trial of SheetXAI and open any Excel workbook with dynamic page URLs in column A, then ask it to scrape the rendered content for every row in one operation. You can also capture screenshots of those same pages to pair visual evidence with the extracted text in your comparison matrix.
