SELECT statement
Returns data from FeatureBase tables based on specified rows, columns and clauses.
Table of contents
BNF diagrams
TOP clause
SELECT list
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
ORDER BY clause
Syntax
SELECT
[DISTINCT]
[<top_clause>]
{<select_list>}
{<from_clause>}
[<where_clause>]
[<group_by_clause>]
[<having_clause>]
[<order_by_clause>];
Arguments
Argument | Description | Required | Additional information |
---|---|---|---|
DISTINCT | Keyword that specifies only unique rows exist in the output | Optional | * DISTINCT additional * SELECT…flatten hint |
top_clause | Specify a limit to apply to the number of rows returned in the output. | Optional | Requires integer literal |
select_list | A series of expressions separated by commas that contains the items selected to form the output result set. | Yes | SELECT list and GROUP BY clause |
from_clause | A list of table or subquery expressions that specify which relations to select data from. | Yes | FROM table or subquery expression |
from…with | A list of table query hints | Optional for table queries | Query hints |
where_clause | An expression that defines a filter condition for the rows returned by the query. | Optional | * <expr> filter conditions* SELECT…flatten hint |
group_by_clause | Separates the results into groups of rows allowing aggregates to be performed on each group. | Optional | SELECT…flatten hint |
column_expr | Specify a column or non-aggregate calculation on a column which is not required to appear in the select_list | Must exist in the from_clause | |
having_clause | Pass aggregates to filter on based on conditions. | Optional | <expr> filter conditions |
order_by_clause | Comma-separated column name, column alias or column position in the SELECT list used to specify the order data is returned. | Optional | Ordering results |
Additional information
DISTINCT additional
DISTINCT
can return two kinds of results:
DISTINCT | Description | Examples |
---|---|---|
Values | A query that returns DISTINCT values from a table | SELECT from csv-target |
Sets | A query that returns a specific array of values from SET or SETQ data type columns | Flatten hint |
<expr>
filter
<expr>
can be used in the
- TOP
- WHERE
- HAVING
<expr>
can include:
Filter | Description | Additional information |
---|---|---|
Subquery | A SELECT query that is run to obtain specific results required for the main query | |
Constant | A literal or scalar value | Can be joined by a subquery or SQL operators or subquery |
Function | FeatureBase SQL functions | Can be joined by a subquery or SQL operators |
SELECT list and GROUP BY clause
SELECT LIST Columns referenced in non-aggregate expressions must also appear in the GROUP BY clause.
GROUP BY issues on SET and SETQ data types
SELECT…LIKE wildcards
Wildcards are used with the LIKE
clause.
Wildcard | Description | Additional information |
---|---|---|
* | All columns | |
Select List column alias | ||
expr | Filter used to refine the query | <expr> filter |
<qualifier>.* | limit the results to all columns based on the specified qualifier |
SELECT with TUPLE() function
When the TUPLE()
function is used in a select_list
, the following values are returned:
Source data type | Returns |
---|---|
IDSETQ | TIMESTAMP and associated integer value |
STRINGSETQ | TIMESTAMP and associated string value |
FROM table or subquery
The table_or_subquery expression can be:
- a table_name or table_valued_function
- a parenthesized
SELECT
statement
Both expressions can be aliased with a <table_alias>
Examples
SELECT
examples
These examples demonstrate SELECT statements that use different clauses, expressions, functions and operators.
Statement | Target table | SELECT clauses | Expressions | Functions | Operators |
---|---|---|---|---|---|
SELECT FROM cosvec-target | cosvec-target | COSINE DISTANCE | |||
SELECT FROM csv-target | csv-target | * COUNT() * DISTINCT | |||
SELECT FROM ndjson-target | parquet-target | * WHERE * LIKE * ORDER BY | |||
SELECT FROM orc-target | orc-target | ||||
SELECT FROM parquet-target | parquet-target | * TOP() * WHERE | BETWEEN | ||
SELECT FROM tan-target | tan-target | TANIMOTO_COEFFICIENT() | |||
SELECT FROM tuple-set-target | parquet-target | DATETIMENAME() |
SELECT int function
SELECT MIN(fld) FROM tbl
SELECT MAX(fld) FROM tbl
SELECT SUM(fld) FROM tbl
SELECT AVG(fld) FROM tbl
SELECT MIN(fld) FROM tbl WHERE fld = 1
SELECT MAX(fld) FROM tbl WHERE fld = 1
SELECT SUM(fld) FROM tbl WHERE fld = 1
SELECT AVG(fld) FROM tbl WHERE fld = 1
GROUP BY
SELECT fld, COUNT(*) FROM tbl GROUP BY fld
SELECT fld1, fld2, COUNT(*) FROM tbl GROUP BY fld1, fld2
SELECT fld1, fld2, COUNT(*) FROM tbl GROUP BY fld1, fld2 LIMIT 1
SELECT fld1, fld2, COUNT(*) FROM tbl WHERE fld1 = 1 GROUP BY fld1, fld2
SELECT fld1, COUNT(*) FROM tbl GROUP BY fld1 having COUNT(*) > 1
SELECT fld1, fld2, SUM(fld3) FROM tbl WHERE fld1 = 1 GROUP BY fld1, fld2
SELECT fld1, fld2, SUM(fld3) FROM tbl WHERE fld1 = 1 GROUP BY fld1, fld2 HAVING count(*) > 1
SELECT fld, COUNT(fld) FROM tbl GROUP BY fld
SELECT fld1, COUNT(fld1) FROM tbl WHERE fld2=1 GROUP BY fld1
GROUP BY with STRINGSET
This query that counts combinations of values from the segments
table.
This query can also be run using the FLATTEN() hint
create table segments
(_id id, segment stringset);
insert into segments(_id, segment)
values (1, ['RED', 'BLUE', 'GREEN']),
(2, ['GREEN']),
(3, ['RED', 'BLUE', 'GREEN']);
SELECT COUNT(*) as howmany, segment FROM segments
GROUP BY segment;
howmany | segment
-----+--------------------------
2 | ['RED', 'BLUE', 'GREEN']
1 | ['GREEN']
This query can also be performed using the flatten
hint
SELECT statement with wildcard
SELECT * FROM services WHERE servicelist LIKE '%free%';
SELECT with cosine_distance
SELECT with two RANGEQ() timestamps
SELECT _id, segment FROM segments WHERE RANGEQ(segment, 1889763885,2000000000);
+-----+---------------+
| _id | segment |
+-----+---------------+
| 1 | NULL |
+-----+---------------+
| 2 | GREEN |
+-----+---------------+
| 3 | NULL |
+-----+---------------+
SELECT with one RANGEQ() timestamp
select _id, segment from segments where rangeq(segment, null ,2000000000)
+-----+---------------+
| _id | segment |
+-----+---------------+
| 1 | GREEN,YELLOW |
+-----+---------------+
| 2 | GREEN |
+-----+---------------+
| 3 | RED |
+-----+---------------+
Source table tuple-demo
CREATE TABLE IF NOT EXISTS tuple-demo (
_id STRING,
time_col TIMESTAMP,
stringset_col STRINGSET,
idset_col IDSET,
stringsetq_col STRINGSETQ timequantum 'YMD',
idesetq_col IDSETQ timequantum 'Y'
);
SELECT statement with TUPLE()
SELECT _id, TUPLE(time_col, stringset_col, idset_col)
AS tup
FROM tuple-demo;
_id | tup
----+-------------------------------------
A | (2013-07-15T01:18:46Z,stringset1, 1)
B | (2014-07-15T01:18:46Z,stringset2, 2)