Skip to main content

Monitor costs using system tables

This article explains how you can use the system.billing.usage table on its own or joined with other system tables to get a picture of your account’s Databricks usage. The following feature-specific articles are also available:

How to read the usage table

Users with permissions to access system table data can view and query their account’s billing logs, located at system.billing.usage. Every billing record includes columns that attribute the usage amount to the specific resources, identities, and products involved.

  • The usage_metadata column includes a struct with information about the resources or objects involved in the usage.
  • The identity_metadata column includes information about the user or service principal that incurred the usage.
  • The custom_tags column includes tags applied to the compute resource associated with the usage. This also includes tags added by budget policies so you can attribute serverless usage.
  • The billing_origin_product and product_features columns give you information about the exact product and features used.

For a complete reference of the usage table, see Billable usage system table reference.

Operationalize billing data

Databricks recommends using AI/BI dashboards to create cost monitoring dashboards using system table billing data. You can create a new dashboard, or account admins can import pre-built, customizable cost monitoring dashboard. See Usage dashboards.

You can also add alerts to your queries to help you stay informed about usage data. See Create an alert.

Sample queries

The following queries provide examples of how you can use the system.billing.usage table data to gain insights into your account’s usage.

How many DBUs of each product have been used throughout this month?

SQL
SELECT
billing_origin_product,
usage_date,
sum(usage_quantity) as usage_quantity
FROM system.billing.usage
WHERE
month(usage_date) = month(NOW())
AND year(usage_date) = year(NOW())
GROUP BY billing_origin_product, usage_date

Which jobs consumed the most DBUs?

SQL
SELECT
usage_metadata.job_id as `Job ID`, sum(usage_quantity) as `Usage`
FROM
system.billing.usage
WHERE
usage_metadata.job_id IS NOT NULL
GROUP BY
`Job ID`
ORDER BY
`Usage` DESC

How much usage can be attributed to resources with a specific tag?

You can break down costs in various ways. This example shows you how to break down costs by a custom tag. Be sure to replace the custom tag’s key and value in the query.

SQL
SELECT
sku_name, usage_unit, SUM(usage_quantity) as `Usage`
FROM
system.billing.usage
WHERE
custom_tags [:key] = :value
GROUP BY 1, 2

Show me the products where usage is growing

SQL
SELECT
after.billing_origin_product, before_dbus, after_dbus, ((after_dbus - before_dbus)/before_dbus * 100) AS growth_rate
FROM
(SELECT
billing_origin_product, sum(usage_quantity) as before_dbus
FROM
system.billing.usage
WHERE
usage_date BETWEEN "2024-04-01" and "2024-04-30"
GROUP BY
billing_origin_product
) as before
JOIN
(SELECT
billing_origin_product, sum(usage_quantity) as after_dbus
FROM
system.billing.usage
WHERE
usage_date
BETWEEN
"2024-05-01" and "2024-05-30"
GROUP BY
billing_origin_product
) as after
WHERE
before.billing_origin_product = after.billing_origin_product
SORT BY
growth_rate DESC

What is the usage trend of All Purpose Compute (Photon)?

SQL
SELECT
sku_name,
usage_date,
sum(usage_quantity) as `DBUs consumed`
FROM
system.billing.usage
WHERE
year(usage_date) = year(CURRENT_DATE)
AND
sku_name = "ENTERPRISE_ALL_PURPOSE_COMPUTE_(PHOTON)"
AND
usage_date > "2024-04-15"
GROUP BY
sku_name, usage_date

What is the DBU consumption of a materialized view or streaming table?

To get the DBU usage and SKU for a specific materialized view or streaming table, submit a query to the billable usage system table for records where usage_metadata.dlt_pipeline_id is set to the ID of the pipeline associated with the materialized view or streaming table. You can find the pipeline ID in the Details tab in Catalog Explorer when viewing the materialized view or streaming table. To limit the consumption by date, specify a start date, end date, or a date range. The following query retrieves the DBU usage for the pipeline with ID 00732f83-cd59-4c76-ac0d-57958532ab5b and a usage start date of 2024-05-30:

SQL
SELECT
sku_name,
usage_date,
SUM(usage_quantity) AS `DBUs`
FROM
system.billing.usage
WHERE
usage_metadata.dlt_pipeline_id = :dlt_pipeline_id
AND usage_start_time > :usage_start_time
GROUP BY
ALL

What is the DBU consumption of a serverless DLT pipeline?

To get the DBU usage and SKU for a serverless DLT pipeline, submit a query to the billable usage system table for records where usage_metadata.dlt_pipeline_id is set to the ID of the pipeline. You can find the pipeline ID on the Pipeline Details tab when viewing a pipeline in the Delta Live Tables UI. To limit the consumption by date, specify a start date, end date, or a date range. The following query retrieves the DBU usage from December 2024 for the pipeline with ID 00732f83-cd59-4c76-ac0d-57958532ab5b.

SQL
SELECT
sku_name,
usage_date,
SUM(usage_quantity) AS `DBUs`
FROM
system.billing.usage
WHERE
usage_metadata.dlt_pipeline_id = :dlt_pipeline_id
AND usage_start_time >= :usage_start_time
AND usage_end_time < :usage_end_time
GROUP BY
ALL

What is the daily trend in DBU consumption?

SQL
SELECT
usage_date as `Date`, sum(usage_quantity) as `DBUs Consumed`
FROM
system.billing.usage
WHERE
sku_name = "STANDARD_ALL_PURPOSE_COMPUTE"
GROUP BY
usage_date
ORDER BY
usage_date ASC

Examples of usage table joins

The following examples show you ways you can join the usage table with other system tables to gain additional insights.

Attribute costs to the compute owner

If you are looking to reduce compute costs, you can use this query to find out which cluster owners in your account are using the most DBUs.

SQL
SELECT
u.record_id record_id,
c.cluster_id cluster_id,
max_by(c.owned_by, c.change_time) owned_by,
max(c.change_time) change_time,
any_value(u.usage_start_time) usage_start_time,
any_value(u.usage_quantity) usage_quantity
FROM
system.billing.usage u
JOIN system.compute.clusters c
WHERE
u.usage_metadata.cluster_id is not null
and u.usage_start_time >= '2023-01-01'
and u.usage_metadata.cluster_id = c.cluster_id
and c.change_time <= u.usage_start_time
GROUP BY 1, 2
ORDER BY cluster_id, usage_start_time desc;

Enrich usage with a job name

SQL
with jobs as (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY workspace_id, job_id ORDER BY change_time DESC) as rn
FROM system.lakeflow.jobs QUALIFY rn=1
)
SELECT
usage.*,
coalesce(usage_metadata.job_name, jobs.name) as job_name
FROM system.billing.usage
LEFT JOIN jobs ON usage.workspace_id=jobs.workspace_id AND usage.usage_metadata.job_id=jobs.job_id
WHERE
billing_origin_product="JOBS"

Join the pricing with usage tables

The list_prices table includes list prices over time for each available SKU. You can join the usage table to view the listing cost of certain usage.

For example, the following query returns the total cost attributed to a particular tag during a month:

SQL
SELECT
SUM(usage.usage_quantity * list_prices.pricing.effective_list.default)
as `Total Dollar Cost`
FROM system.billing.usage
JOIN system.billing.list_prices ON list_prices.sku_name = usage.sku_name
WHERE usage.custom_tags [:key] = :value
AND usage.usage_end_time >= list_prices.price_start_time
AND (list_prices.price_end_time IS NULL OR usage.usage_end_time < list_prices.price_end_time)
AND usage.usage_date BETWEEN "2024-05-01" AND "2024-05-31"

Estimate the add-on costs for usage in the previous calendar month

This query applies a simple percentage to all usage in the period. Note that this might differ slightly from the actual monetization due to how entitlements for some add-ons are managed. Replace the add-on rate with your account’s rate.

SQL
SELECT SUM(usage.usage_quantity * list_prices.pricing.effective_list.default) * :add_on_rate as `Total Add-On Dollar Cost`
FROM system.billing.usage
JOIN system.billing.list_prices ON list_prices.sku_name = usage.sku_name
WHERE usage.usage_end_time >= list_prices.price_start_time
AND (list_prices.price_end_time IS NULL OR usage.usage_end_time < list_prices.price_end_time)
AND usage.usage_date BETWEEN "2024-02-01" AND "2024-02-29"