The Scenario
You've been building a content gap analysis for three weeks. The backbone is an 80-row Excel workbook: competitor blog post URLs in column A, collected from five different sources, waiting for the actual article text so you can run a topic comparison. The URLs have been sitting there since Tuesday.
Your original plan was to open each one, read it, and paste the article text in by hand. You got through six before the afternoon disappeared into other things.
The bad version:
- Open each URL one by one, read the page, copy the main article body, switch back to the workbook, paste into column B, repeat.
- Skip any paywalled or JavaScript-heavy pages that don't load properly, leaving gaps you'll have to track separately.
- Spend 40 minutes formatting the pasted text: removing nav copy, footer links, and author bios that came along for the ride.
The analysis is supposed to go to the content lead by end of week. You have 80 URLs and approximately none of the time it would take to do this manually.
The Easy Way: One Prompt in SheetXAI
SheetXAI is an AI agent inside your Excel workbook. It reads your data and, through its built-in Apify MCP integration, can trigger web scraping runs and write the results back into your columns, all without leaving the workbook.
Scrape each URL in column A using Apify's RAG web browser and paste the extracted plain text into column B. Skip any URLs that return an error and mark them "failed" in column C.
What You Get
- Column B filled with the clean plain-text body of each article — navigation, footer, and sidebar copy excluded.
- Column C populated with "failed" for any URL that returned an error, a 404, or a blank page.
- The run processes all 80 URLs as a single batch rather than one at a time, so the total run time is typically much shorter than scraping the pages one after another.
- No downloaded CSV. No re-import step. The data lands in the workbook directly.
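If you want a mental model of what that prompt asks for, the column logic can be sketched in a few lines of Python. This is an illustration, not SheetXAI's or Apify's actual implementation: `fill_columns` and the `scrape` callable are hypothetical names, and `scrape` stands in for whatever actually fetches the article text.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def fill_columns(urls: List[str], scrape: Callable[[str], str],
                 max_workers: int = 8) -> List[Tuple[str, str, str]]:
    """Return one (column A, column B, column C) row per URL, in input order.

    `scrape` is a hypothetical stand-in for the real scraping call.
    """

    def scrape_one(url: str) -> Tuple[str, str, str]:
        try:
            text = scrape(url).strip()
        except Exception:
            # Any error (timeout, 404, blocked page) marks the row as failed.
            return (url, "", "failed")
        if not text:
            # A blank page is treated as a failure too.
            return (url, "", "failed")
        return (url, text, "")

    # Scrape concurrently instead of one URL at a time;
    # pool.map() returns results in the same order as the input URLs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scrape_one, urls))
```

Each returned tuple maps to one workbook row: the URL, the extracted text for column B, and either an empty string or "failed" for column C. A failed URL never stops the rest of the batch.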
What If the Data Is Not Quite Ready
Some URLs redirect before serving content
Scrape each URL in column A with Apify's RAG web browser. Follow redirects automatically. Write the final URL that was actually scraped into column C and the extracted text into column B.
Some pages require JavaScript to render the article body
For each URL in column A, use Apify's JavaScript-rendering scraper rather than the RAG browser. Paste the rendered page body text into column B and log the HTTP status code in column D.
You want title and body separately, not merged
Scrape each URL in column A using Apify. Write the page title into column B and the main article body text into column C. Treat any URL that returns an empty body as failed and note it in column D.
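To see what "title and body separately" means in practice, here is a minimal sketch using only Python's standard library. It assumes the article text lives inside an `<article>` element, which is common but not universal, and it is not how Apify extracts content. `TitleBodyParser` and `split_title_body` are hypothetical names used for illustration.

```python
from html.parser import HTMLParser
from typing import Tuple


class TitleBodyParser(HTMLParser):
    """Collect <title> text and the visible text inside <article>."""

    # Elements whose text should not count as article body.
    SKIP = {"script", "style", "nav", "footer", "aside"}

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.body_parts = []
        self._in_title = False
        self._article_depth = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "article":
            self._article_depth += 1
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "article" and self._article_depth:
            self._article_depth -= 1
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif self._article_depth and not self._skip_depth:
            self.body_parts.append(data)


def split_title_body(html: str) -> Tuple[str, str, str]:
    """Return (title, body, status); status is "failed" when the body is empty."""
    parser = TitleBodyParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so each value fits cleanly in one cell.
    title = " ".join("".join(parser.title_parts).split())
    body = " ".join(" ".join(parser.body_parts).split())
    return title, body, "" if body else "failed"
```

The three return values correspond to columns B, C, and D in the prompt above. Pages that keep their content outside `<article>` would come back with an empty body and be marked "failed", which is exactly the kind of row worth checking by hand.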
Full cleanup plus extraction in one shot
For each URL in column A: scrape with Apify's RAG browser, extract only the article body (skip nav, footer, byline, and related post links), trim the result to 1,500 words max, write the trimmed text to column B, write the character count to column C, and mark any URL that returned under 200 characters as "too short" in column D. Run all 80 at once.
When you need the content to arrive pre-processed, include the cleanup instructions in the same prompt. SheetXAI handles the extraction and the formatting in one pass.
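The cleanup rules in that prompt are simple enough to sketch directly. The Python below is a minimal illustration under two assumptions the prompt leaves open: the character count is taken after trimming, and the 200-character check applies to that trimmed text. `postprocess`, `MAX_WORDS`, and `MIN_CHARS` are hypothetical names.

```python
from typing import Tuple

MAX_WORDS = 1500  # word cap from the prompt
MIN_CHARS = 200   # anything shorter is flagged


def postprocess(text: str) -> Tuple[str, int, str]:
    """Return (column B text, column C character count, column D flag)."""
    # split() with no arguments also collapses newlines and repeated spaces,
    # so paragraph breaks are lost in this simplified version.
    words = text.split()
    trimmed = " ".join(words[:MAX_WORDS])
    char_count = len(trimmed)
    flag = "too short" if char_count < MIN_CHARS else ""
    return trimmed, char_count, flag
```

Calling `postprocess("short text")` returns `("short text", 10, "too short")`, so a page that yields only a cookie banner or a paywall teaser gets flagged instead of quietly polluting the topic comparison.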
Try It
Get the 7-day free trial of SheetXAI and open any Excel workbook with a column of URLs you've been meaning to scrape. Then ask it to extract the article text with Apify and fill the results into the adjacent column. From there, read the companion guide on running a specific Apify Actor and importing its full output dataset, or browse the full Apify MCP overview to see all the workflows covered.
