CACHE TABLE
Applies to: Databricks Runtime
Caches the contents of a table or the output of a query with the given storage level in the Apache Spark cache. If a query is cached, a temporary view is created for that query. This reduces scanning of the original files in future queries.
Syntax
CACHE [ LAZY ] TABLE table_name
  [ OPTIONS ( 'storageLevel' [ = ] value ) ] [ [ AS ] query ]
See Disk cache vs. Spark cache for the differences between disk caching and the Apache Spark cache.
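As a minimal illustration of the syntax, the statement below caches an entire table at the default storage level (MEMORY_AND_DISK); the table name events is a placeholder:

> CACHE TABLE events;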
Parameters
- LAZY
  Only cache the table when it is first used, instead of immediately, as shown in the examples below.
- table_name
  Identifies the Delta table or view to cache. The name must not include a temporal specification or options specification. If the table cannot be found, Databricks raises a TABLE_OR_VIEW_NOT_FOUND error.
- OPTIONS ( 'storageLevel' [ = ] value )
  An OPTIONS clause with a storageLevel key and value pair. A warning is issued when a key other than storageLevel is used. The valid options for storageLevel are:
  - NONE
  - DISK_ONLY
  - DISK_ONLY_2
  - MEMORY_ONLY
  - MEMORY_ONLY_2
  - MEMORY_ONLY_SER
  - MEMORY_ONLY_SER_2
  - MEMORY_AND_DISK
  - MEMORY_AND_DISK_2
  - MEMORY_AND_DISK_SER
  - MEMORY_AND_DISK_SER_2
  - OFF_HEAP
  An exception is thrown when an invalid value is set for storageLevel. If storageLevel is not explicitly set using the OPTIONS clause, the default storageLevel is MEMORY_AND_DISK.
- query
  A query that produces the rows to be cached. It can be in one of the following formats:
  - A SELECT statement
  - A TABLE statement
  - A FROM statement
Examples
> CACHE TABLE testCache OPTIONS ('storageLevel' 'DISK_ONLY') SELECT * FROM testData;
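The next sketch assumes the same testData table and shows lazy caching, where the data is only materialized in the cache when the table is first used:

> CACHE LAZY TABLE testData;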
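Following the syntax above, storageLevel can also be written with an explicit = and the query introduced with the optional AS keyword; the cache name memCache is a placeholder:

> CACHE TABLE memCache OPTIONS ('storageLevel' = 'MEMORY_ONLY') AS SELECT * FROM testData;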