The Scenario
A web analyst exported 1,000 raw user-agent strings from nginx logs into a Google Sheet — every request that hit the marketing landing pages over the past 30 days. The strings are sitting in column A, unprocessed, looking like walls of version numbers and platform identifiers.
The traffic composition report is due to the product team on Wednesday. It needs device type, OS name, browser, and a bot flag for each row — four columns that currently do not exist in the sheet.
The analyst has three years of experience with this kind of reporting but has never had to parse user-agent strings at scale. The last time something like this came up, a developer wrote a one-off script. That developer left the company in January.
The bad version:
- Find a user-agent parsing library for Python, install it, write a script that reads the sheet as a CSV.
- Run the parser, hit encoding errors on rows with unusual characters in the UA string, fix the encoding, re-run.
- Discover the library classifies Googlebot as a "browser" rather than a bot because it doesn't recognize the crawler string — manually review and reclassify 40 rows.
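For a sense of what the script route involves, here is a minimal sketch of two of those steps (the encoding fix and the bot reclassification) using only the Python standard library. The function names `read_user_agents` and `looks_like_bot` and the short token list are assumptions for illustration; they are not part of any particular parsing library, and a real bot list would be much longer.

```python
import csv
import io
import re

# Assumed, non-exhaustive list of crawler tokens (illustrative only).
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)


def read_user_agents(raw_bytes):
    """Decode an exported CSV, replacing undecodable bytes instead of failing."""
    text = raw_bytes.decode("utf-8", errors="replace")
    reader = csv.reader(io.StringIO(text))
    # Column A is the first field of each row; blank rows are dropped.
    return [row[0] for row in reader if row and row[0].strip()]


def looks_like_bot(user_agent):
    """Fallback check for strings a parser might label as a browser."""
    return bool(BOT_PATTERN.search(user_agent))


# Two sample rows: a crawler string and a string ending in an invalid byte.
sample = (
    b'"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"\n'
    b'"Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) Safari/604.1\xff"\n'
)
agents = read_user_agents(sample)
flags = [looks_like_bot(ua) for ua in agents]
```

Even this fragment covers only decoding and a crude bot check; the device, OS, and browser fields would still need a real parser.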
It's Tuesday. The report is due tomorrow morning and the analyst is the only one in the office today.
The Easy Way: One Prompt in SheetXAI
SheetXAI is an AI agent inside your Google Sheet that reads your data and calls BigDataCloud's user-agent parser API for each row. You describe the columns you need — it handles the parsing and the writeback.
Open SheetXAI from the Extensions menu and type:
Parse every user-agent string in column A using BigDataCloud and write device type, OS name, browser, and bot flag into columns B through E
What You Get
- Column B: device type — "Mobile", "Desktop", "Tablet", or "Bot"
- Column C: OS name (e.g., "iOS", "Windows", "Android", "macOS")
- Column D: browser name (e.g., "Chrome", "Safari", "Firefox") — blank for bots
- Column E: bot flag — true or false, with known crawler names surfaced where BigDataCloud identifies them
- Rows where BigDataCloud cannot parse the string show "Unknown" in the device type column, with the original string left untouched in column A
What If the Data Is Not Quite Ready
Some user-agent strings got truncated at 255 characters during the log export
Parse the user-agent strings in column A using BigDataCloud — some entries are truncated (cut off mid-string), parse what's available and note any rows where truncation likely affected classification by adding "truncated" to column F
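If you want to spot-check which rows the "truncated" note should land on, a rough heuristic can be sketched in Python. The function name `likely_truncated` and the unclosed-parenthesis check are assumptions made for illustration; only the 255-character limit comes from the scenario above.

```python
TRUNCATION_LENGTH = 255  # export limit described in the scenario


def likely_truncated(user_agent, limit=TRUNCATION_LENGTH):
    """Flag strings that reach the length limit or leave a parenthesis open."""
    at_limit = len(user_agent) >= limit
    # A cut-off string often ends inside the "(platform; details" section.
    unbalanced = user_agent.count("(") > user_agent.count(")")
    return at_limit or unbalanced
```

A string that passes both checks can still be truncated, so treat the result as a hint rather than proof.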
The sheet mixes user-agent strings with empty rows from requests that had no User-Agent header
Parse every non-blank user-agent string in column A using BigDataCloud into device type, OS, browser, and bot flag in columns B through E. Where column A is empty, write "no user-agent" in column B and leave columns C through E blank
You need the output classified into four buckets, not the raw parsed fields
Parse all user-agent strings in column A using BigDataCloud into device type, OS, browser, and bot flag in columns B through E — then add a column F that classifies each row as "Mobile", "Desktop", "Tablet", or "Bot" based on device type and the bot flag
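The bucketing rule in that prompt is simple enough to express as a short Python sketch, which can help when reviewing column F. The function name `traffic_bucket` is hypothetical, and the "Unknown" fallback for unrecognized device types is an assumption rather than documented SheetXAI behavior.

```python
def traffic_bucket(device_type, is_bot):
    """Map a parsed device type and bot flag to one reporting bucket."""
    # The bot flag wins, whatever device type the parser reported.
    if is_bot:
        return "Bot"
    buckets = {"mobile": "Mobile", "tablet": "Tablet", "desktop": "Desktop"}
    # Anything unrecognized falls back to "Unknown" (assumed behavior).
    return buckets.get(device_type.strip().lower(), "Unknown")
```

For example, `traffic_bucket("Desktop", True)` returns "Bot", because the bot flag takes priority over the device type.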
Full traffic composition analysis in one pass
Parse all 1,000 user-agent strings in column A using BigDataCloud into device type, OS, browser, and bot flag in columns B through E — then on a new sheet called "Traffic Summary" write a count of rows per device type, per OS, and per browser (top 10 only), and note the total bot percentage at the top of that sheet
One prompt delivers the parsing, the writeback, and the summary pivot.
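To sanity-check the figures on the "Traffic Summary" sheet, the same counts can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function name `summarize` and the dictionary keys `device`, `os`, `browser`, and `is_bot` are hypothetical stand-ins for columns B through E.

```python
from collections import Counter


def summarize(rows, top_n=10):
    """Count rows per device, OS, and browser, plus the bot percentage.

    rows: list of dicts with 'device', 'os', 'browser', 'is_bot' keys
    (assumed names standing in for columns B through E).
    """
    total = len(rows)
    bots = sum(1 for row in rows if row["is_bot"])
    return {
        "bot_percentage": round(100 * bots / total, 1) if total else 0.0,
        "by_device": Counter(row["device"] for row in rows).most_common(),
        "by_os": Counter(row["os"] for row in rows).most_common(),
        # Bots have a blank browser, so they are left out of this count.
        "by_browser": Counter(
            row["browser"] for row in rows if row["browser"]
        ).most_common(top_n),
    }
```

Only the browser list is capped at `top_n`, matching the "top 10 only" wording in the prompt.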
Try It
Get the 7-day free trial of SheetXAI, open any Google Sheet with a column of raw user-agent strings, and paste one of the prompts above to have your traffic composition data ready for the product team by morning. See the IP geolocation guide if you also need a geographic breakdown from the same log export.
