SELECT clause
Applies to: Databricks SQL
Databricks Runtime
Collects the columns to be returned from the subquery, including the execution of expressions, aggregations, and deduplication.
Syntax
SELECT [ hints ] [ ALL | DISTINCT ] { named_expression | star_clause } [, ...]
Parameters
-
Hints help the Databricks optimizer make better planning decisions. Databricks supports hints that influence selection of join strategies and repartitioning of the data.
-
ALL
Select all matching rows from the table references. Enabled by default.
-
DISTINCT
Select all matching rows from the table references after removing duplicates in results.
-
named_expression
An expression with an optional assigned name.
-
A combination of one or more values, operators, and SQL functions that evaluates to a value.
-
An optional column identifier naming the expression result. If no
column_alias
is provided Databricks SQL derives one.
-
-
A shorthand to name all the referencable columns in the
FROM
clause or a specific table reference’s columns or fields in theFROM
clause.
Examples
-- select all referencable columns from all tables
> SELECT * FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
1 2 3 4
-- select all referencable columns from one table
> SELECT t2.* FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
3 4
-- select all referencable columns from all tables except t2.c4
> SELECT * EXCEPT(c4) FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
1 2 3
-- select all referencable columns from a table, except a nested field.
> SELECT * EXCEPT(c2.b) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
1 { "a" : 2 }
-- Removing all fields results in an empty struct
> SELECT * EXCEPT(c2.b, c2.a) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
1 { }
-- Overlapping names result in an error
> SELECT * EXCEPT(c2, c2.a) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
Error: EXCEPT_OVERLAPPING_COLUMNS