The Scenario
You inherited the data dictionary. It lives in a Google Sheet called "Schema Docs" and the last person who touched it was updating it manually — opening Kibana, browsing to the mapping for each index, and copying field names and types into rows one at a time.
The customer-events index has had three schema changes since the last update. A stakeholder just asked for a fresh export of the field list for a data governance review, and what you're looking at in the sheet is already wrong.
The bad version:
- Run GET customer-events/_mapping in the dev console and get back several hundred lines of nested JSON
- Start expanding the properties object, locate each field name and its type value, note whether it has "index": false
- Transcribe each field name, type, and index setting into a row in the sheet. For a complex index this is 80+ rows of JSON-to-spreadsheet translation
Nobody hired you to manually transcribe JSON into a spreadsheet. The governance review doesn't care how you got the data there — it cares that the data is accurate.
The Easy Way: One Prompt in SheetXAI
SheetXAI is an AI agent that lives inside your Google Sheet. It connects to your Elasticsearch cluster, fetches the index mapping, and writes the field list directly into your sheet — field name, type, and index setting, one row per field.
Fetch the schema mapping for my 'customer-events' Elasticsearch index and write each field name, data type, and indexing setting into my "Schema Docs" sheet as one field per row, starting at row 2
What You Get
- Column A fills with each field name from the mapping, one per row
- Column B shows the field's data type: keyword, text, date, long, etc.
- Column C shows the indexing setting: true, false, or blank where the mapping doesn't specify
- Nested fields are written with dot notation (e.g. user.id, user.email) so the hierarchy is preserved in a flat table
- The run covers the full mapping in one pass without pagination
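For readers who want to see what that flattening involves, here is a minimal Python sketch. It is not SheetXAI's implementation. It assumes a response shaped like the output of GET customer-events/_mapping, and the sample fields and the function name flatten_mapping are made up for illustration.

```python
def flatten_mapping(properties, prefix=""):
    """Return (field_path, type, index_setting) tuples, one per leaf field.

    Object fields that carry their own "properties" are walked recursively
    and their children are named with dot notation. Multi-fields ("fields")
    are ignored in this sketch.
    """
    rows = []
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object or nested field: descend instead of emitting a row.
            rows.extend(flatten_mapping(spec["properties"], path + "."))
        else:
            # Blank index cell when the mapping does not state the setting.
            rows.append((path, spec.get("type", "object"), spec.get("index", "")))
    return rows


# Hypothetical sample, shaped like a GET <index>/_mapping response.
sample_mapping = {
    "customer-events": {
        "mappings": {
            "properties": {
                "event_type": {"type": "keyword"},
                "timestamp": {"type": "date"},
                "payload": {"type": "text", "index": False},
                "user": {
                    "properties": {
                        "id": {"type": "keyword"},
                        "email": {"type": "keyword"},
                    }
                },
            }
        }
    }
}

rows = flatten_mapping(sample_mapping["customer-events"]["mappings"]["properties"])
for row in rows:
    print(row)
```

Each tuple corresponds to one sheet row: columns A, B, and C in that order.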
What If the Data Is Not Quite Ready
You need to filter out nested system fields
The mapping response can carry metadata settings such as _source and _meta alongside the field properties, and the index may contain underscore-prefixed internal fields that you don't want in the dictionary.
Fetch the mapping for my 'customer-events' Elasticsearch index and write each non-system field name, type, and indexing setting into "Schema Docs" starting at row 2 — skip any fields beginning with underscore
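The underscore rule in that prompt can be sketched in Python as below. This is an illustration under assumptions, not the tool's actual logic: it treats a field as a system field when any segment of its dotted path starts with an underscore, and the rows and the helper name drop_system_fields are hypothetical.

```python
def drop_system_fields(rows):
    """Keep only rows whose dotted field path has no underscore-prefixed segment."""
    return [
        row
        for row in rows
        if not any(part.startswith("_") for part in row[0].split("."))
    ]


# Hypothetical flattened rows: (field_path, type, index_setting).
rows = [
    ("event_type", "keyword", ""),
    ("_ingest_ts", "date", ""),
    ("user.id", "keyword", ""),
    ("user._internal_flag", "boolean", False),
]

kept = drop_system_fields(rows)
print(kept)
```

If only top-level underscore fields should be skipped, checking row[0].startswith("_") alone would be the narrower variant.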
The sheet already has an outdated schema from the last pull
You don't want to manually delete the stale rows before running the new pull.
Clear all rows from row 2 downward in my "Schema Docs" sheet, then fetch the full mapping from the 'customer-events' Elasticsearch index and write field name, data type, and index setting into columns A, B, and C
You need to compare the fetched schema against a known field list in a second tab
You have a "Required Fields" tab listing the fields that the data contract mandates. After pulling the mapping, you want to flag which required fields are missing.
Fetch the 'customer-events' index mapping and write field names into column A of "Schema Docs" — then check each field name in the "Required Fields" tab's column A and write "present" or "missing" into column B of "Required Fields"
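The cross-tab check amounts to a set membership test. The Python sketch below shows the idea, assuming both columns have already been read into plain lists. The field names and the helper check_required are invented for the example.

```python
def check_required(required_fields, schema_fields):
    """Pair each required field with "present" or "missing"."""
    available = set(schema_fields)  # set lookup keeps the check fast on large mappings
    return [
        (name, "present" if name in available else "missing")
        for name in required_fields
    ]


# Hypothetical column A of "Schema Docs" and column A of "Required Fields".
schema_fields = ["event_type", "timestamp", "user.id"]
required_fields = ["event_type", "user.id", "user.email"]

status = check_required(required_fields, schema_fields)
for name, state in status:
    print(name, state)
```

The comparison is exact and case-sensitive, so a required field listed as User.Email would be reported as missing even if user.email exists.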
Wipe the stale data, pull the full mapping, flag non-indexed fields, and note missing required fields in one prompt
Clear rows 2 and below in "Schema Docs", fetch the complete mapping for 'customer-events' from Elasticsearch, write field name, type, and index setting into columns A, B, and C — in column D, write "not indexed" for any field where index setting is false, and "indexed" otherwise — then in the "Required Fields" tab, mark each field in column A as "present" or "missing" based on whether it appears in column A of "Schema Docs"
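The column D rule in that prompt is simple enough to state as code. The following Python sketch is one reading of it, assuming the index setting may arrive either as a boolean from the mapping JSON or as the text "false" from a sheet cell. The function name index_label is hypothetical.

```python
def index_label(index_setting):
    """Return "not indexed" only when the setting is explicitly false."""
    if index_setting is False or str(index_setting).strip().lower() == "false":
        return "not indexed"
    # True, blank, or unspecified all count as indexed under the prompt's rule.
    return "indexed"


# Hypothetical rows: (field_path, type, index_setting).
rows = [
    ("event_type", "keyword", ""),
    ("payload", "text", False),
    ("user.id", "keyword", True),
]

annotated = [row + (index_label(row[2]),) for row in rows]
for row in annotated:
    print(row)
```

Treating a blank setting as indexed follows the prompt's wording ("indexed" otherwise).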
The pattern: describe the full chain — cleanup, fetch, annotation, cross-tab check — in one prompt. SheetXAI runs it without breaking it into separate steps.
Try It
Get the 7-day free trial of SheetXAI and open the Google Sheet where your Elasticsearch schema documentation lives — paste the index name into your prompt and ask it to rebuild the field list from the live mapping. You can also use the same approach to run batch queries against your indices or export a full cluster index inventory.
