The Scenario
You are a platform engineer. Your company's cloud bill just came in 18% over forecast, and your VP of Infrastructure wants a cost review meeting on Friday. She wants every Databricks cluster in the workspace listed with its node type, autoscale range, and who created it, so the team can find the over-provisioned instances.
You manage forty clusters. They are not documented anywhere outside Databricks itself.
The bad version of Thursday:
- You open the Databricks UI, navigate to Compute
- You click into each cluster to see the node type and autoscale settings
- You copy the details into an Excel workbook by hand, one row per cluster
- You make a mistake on cluster 23 and have to go back
- You finish two hours later with a workbook the VP will tear apart for inconsistent formatting
- You never get to the job definitions she also asked for
The fast version is two prompts.
The Easy Way: One Prompt in SheetXAI
SheetXAI is an AI agent inside your Excel workbook that calls the Databricks Clusters API directly, so you do not have to click through forty UI screens.
Open the SheetXAI sidebar and type:
List all Databricks clusters in my workspace. Write cluster name, cluster ID, state, node type, min workers, max workers, and creator username into the Clusters tab with headers in row 1. Sort by node type so clusters on the same instance type group together.
SheetXAI calls the clusters API, pages through all forty clusters, and populates the Clusters tab. Then you ask the follow-up:
Now list all Databricks job definitions and write job name, job ID, creator, schedule, and last run status into the Jobs tab with headers in row 1.
Two prompts. Two tabs. The cost review workbook is ready.
What You Get
An Excel workbook with two populated tabs:
- Clusters tab — cluster name, cluster ID, state, node type, min workers, max workers, creator username
- Jobs tab — job name, job ID, creator, schedule, last run status
The Clusters tab is sorted by node type, so the VP can immediately see who is running the largest instances and whether the autoscale ranges are justified.
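If you are curious how the Clusters tab columns relate to the raw API data, here is a minimal Python sketch. It assumes you already have the cluster objects returned by the Databricks Clusters API as a list of dictionaries, with the fields `cluster_name`, `cluster_id`, `state`, `node_type_id`, `autoscale`, `num_workers`, and `creator_user_name`. The function names and header labels are hypothetical, and this is not a description of how SheetXAI is implemented.

```python
from typing import Any

# Header labels are illustrative; they mirror the columns requested in the prompt.
HEADERS = ["Cluster Name", "Cluster ID", "State", "Node Type",
           "Min Workers", "Max Workers", "Creator"]


def cluster_to_row(cluster: dict[str, Any]) -> list[Any]:
    """Flatten one cluster object into a spreadsheet row."""
    autoscale = cluster.get("autoscale")
    if autoscale:
        min_workers = autoscale.get("min_workers")
        max_workers = autoscale.get("max_workers")
    else:
        # Assumption: a fixed-size cluster reports num_workers instead of
        # an autoscale range, so use it for both ends.
        min_workers = max_workers = cluster.get("num_workers")
    return [
        cluster.get("cluster_name"),
        cluster.get("cluster_id"),
        cluster.get("state"),
        cluster.get("node_type_id"),
        min_workers,
        max_workers,
        cluster.get("creator_user_name"),
    ]


def build_clusters_tab(clusters: list[dict[str, Any]]) -> list[list[Any]]:
    """Return the header row followed by cluster rows sorted by node type."""
    rows = [cluster_to_row(c) for c in clusters]
    rows.sort(key=lambda row: str(row[3] or ""))
    return [HEADERS] + rows
```

Note that an alphabetical sort on node type groups identical instance types together; it does not rank them by size.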
What If the Data Is Not Quite Ready
Infrastructure audits always surface messier questions. SheetXAI handles them in the same prompt.
When cluster names do not follow the naming convention
Half the clusters have names like "test-cluster-david" or "temp-2025" — no team, no environment, no purpose.
List all Databricks clusters with cluster name, node type, min workers, max workers, and creator username. Add a column F called "Name Issue" — write "YES" if the cluster name does not start with prod-, staging-, data-, or analytics-. Leave it blank otherwise. Sort by "Name Issue" descending so flagged clusters appear first.
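The naming check in that prompt is simple enough to verify by hand. A rough Python equivalent, assuming each row is a list with the cluster name in the first position (the function names are hypothetical):

```python
# Prefixes taken from the prompt above.
ALLOWED_PREFIXES = ("prod-", "staging-", "data-", "analytics-")


def name_issue(cluster_name: str) -> str:
    """Return "YES" when the name lacks an approved prefix, else blank."""
    # str.startswith accepts a tuple of prefixes; the check is case-sensitive.
    return "" if cluster_name.startswith(ALLOWED_PREFIXES) else "YES"


def sort_flagged_first(rows: list[list]) -> list[list]:
    """Append a Name Issue value to each row and put flagged rows first."""
    flagged = [row + [name_issue(row[0])] for row in rows]
    # sorted() is stable, so rows keep their original order within each group.
    return sorted(flagged, key=lambda row: row[-1] != "YES")
```

Because the comparison is case-sensitive, a name like "Prod-etl" would be flagged. Say so in the prompt if you want mixed case to pass.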
When the VP wants an estimated cost impact per cluster
The team budgets $0.40 per DBU. She wants a rough estimate per cluster based on max workers and node type.
List all clusters in my Databricks workspace with cluster name, node type, and max workers in columns A to C. In column D called "DBU per Worker," apply this mapping: Standard_DS3_v2 = 2 DBUs per worker, Standard_DS4_v2 = 4 DBUs per worker, Standard_DS5_v2 = 8 DBUs per worker. Write "UNKNOWN NODE TYPE" for any node type not in the mapping. In column E called "Max DBU/hr," calculate max workers × DBU per worker. In column F called "Est. Cost/hr," multiply Max DBU/hr by $0.40.
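To sanity-check the numbers that come back, the arithmetic can be sketched in a few lines of Python. The DBU mapping and the $0.40 rate are the figures from this example, not official Databricks rates, and the estimate counts workers only (no driver node). The function name is hypothetical.

```python
# Mapping and price come from the scenario above; treat both as placeholders.
DBU_PER_WORKER = {
    "Standard_DS3_v2": 2,
    "Standard_DS4_v2": 4,
    "Standard_DS5_v2": 8,
}
PRICE_PER_DBU = 0.40  # dollars


def estimate(node_type: str, max_workers: int):
    """Return (DBUs per worker, max DBU/hr, estimated max cost per hour)."""
    per_worker = DBU_PER_WORKER.get(node_type)
    if per_worker is None:
        # Mirror the prompt: label unmapped node types instead of guessing.
        return ("UNKNOWN NODE TYPE", None, None)
    max_dbu_per_hour = per_worker * max_workers
    cost_per_hour = round(max_dbu_per_hour * PRICE_PER_DBU, 2)
    return (per_worker, max_dbu_per_hour, cost_per_hour)
```

For example, a Standard_DS4_v2 cluster that can scale to 8 workers tops out at 4 × 8 = 32 DBU/hr, or about $12.80 per hour at the team's rate.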
When you only want clusters that are currently running
The cost review should focus on active spend, not terminated clusters.
List all Databricks clusters where state is RUNNING. Write cluster name, node type, min workers, max workers, creator username, and cluster ID into the Clusters tab. Sort by max workers descending so the largest running clusters appear first.
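The same filter is easy to express in code if you ever need to cross-check the tab. This sketch assumes raw cluster dictionaries with a `state` field and either an `autoscale` range or a fixed `num_workers` count; the function name is hypothetical.

```python
def running_clusters_by_size(clusters: list[dict]) -> list[dict]:
    """Keep RUNNING clusters and order them by max workers, largest first."""

    def max_workers(cluster: dict) -> int:
        autoscale = cluster.get("autoscale") or {}
        # Assumption: fixed-size clusters expose num_workers instead of autoscale.
        return autoscale.get("max_workers", cluster.get("num_workers", 0)) or 0

    running = [c for c in clusters if c.get("state") == "RUNNING"]
    return sorted(running, key=max_workers, reverse=True)
```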
When you want the cluster audit plus recent job failures in one workbook
The VP also wants to know which jobs failed in the last 7 days — not just which ones exist, but which ones are causing expensive re-runs.
List all Databricks clusters with cluster name, node type, and creator in the Clusters tab. Then list all job runs from the last 7 days where the result state is FAILED in the FailedRuns tab, with columns: job name, job ID, run ID, start time, and error message. Write the total count of failed runs from the last 7 days in cell E1 of the Clusters tab, so it does not overwrite the headers.
The pattern: what the VP wanted as a manual audit becomes a two-tab workbook in the time it takes to write two prompts.
Try It
Get the 7-day free trial of SheetXAI and open any Excel workbook, then ask it to pull your Databricks cluster and job inventory for the cost review. The Databricks integration is included in every plan. For related workflows, see how to run a SQL query and land results in Excel or the Databricks in Excel overview.
