Skip to main content Link Menu Expand (external link) Document Search Copy Copied

SELECT statement

Returns data from FeatureBase tables based on specified rows, columns and clauses.

Table of contents

BNF diagrams

expr

TOP clause

expr

SELECT list

expr expr

FROM clause

expr expr expr

WHERE clause

expr

GROUP BY clause

expr

HAVING clause

expr

ORDER BY clause

expr expr

Syntax

SELECT
  [DISTINCT]
  [<top_clause>]
  {<select_list>}
  {<from_clause>}
  [<where_clause>]
  [<group_by_clause>]
  [<having_clause>]
  [<order_by_clause>];

Arguments

Argument Description Required Additional information
DISTINCT Keyword that specifies only unique rows exist in the output Optional * DISTINCT additional
* SELECT…flatten hint
top_clause Specify a limit to apply to the number of rows returned in the output. Optional Requires integer literal
select_list A series of expressions separated by commas that contains the items selected to form the output result set. Yes SELECT list and GROUP BY clause
from_clause A list of table or subquery expressions that specify which relations to select data from. Yes FROM table or subquery expression
from…with A list of table query hints Optional for table queries Query hints
where_clause An expression that defines a filter condition for the rows returned by the query. Optional * <expr> filter conditions
* SELECT…flatten hint
group_by_clause Separates the results into groups of rows allowing aggregates to be performed on each group. Optional SELECT…flatten hint
column_expr Specify a column or non-aggregate calculation on a column which is not required to appear in the select_list Must exist in the from_clause  
having_clause Pass aggregates to filter on based on conditions. Optional <expr> filter conditions
order_by_clause Comma-separated column name, column alias or column position in the SELECT list used to specify the order data is returned. Optional Ordering results

Additional information

DISTINCT additional

DISTINCT can return two kinds of results:

DISTINCT Description Examples
Values A query that returns DISTINCT values from a table SELECT from csv-target
Sets A query that returns a specific array of values from SET or SETQ data type columns Flatten hint

<expr> filter

<expr> can be used in the and clauses:

  • TOP
  • WHERE
  • HAVING

<expr> can include:

Filter Description Additional information
Subquery A SELECT query that is run to obtain specific results required for the main query  
Constant A literal or scalar value Can be joined by a subquery or SQL operators or subquery
Function FeatureBase SQL functions Can be joined by a subquery or SQL operators

SELECT list and GROUP BY clause

SELECT LIST Columns referenced in non-aggregate expressions must also appear in the GROUP BY clause.

GROUP BY issues on SET and SETQ data types

SELECT…LIKE wildcards

Wildcards are used with the LIKE clause.

Wildcard Description Additional information
* All columns  
Select List column alias  
expr Filter used to refine the query <expr> filter
<qualifier>.* limit the results to all columns based on the specified qualifier  

SELECT with TUPLE() function

When the TUPLE() function is used in a select_list, the following values are returned:

Source data type Returns
IDSETQ TIMESTAMP and associated integer value
STRINGSETQ TIMESTAMP and associated string value

FROM table or subquery

The table_or_subquery expression can be:

  • a table_name or table_valued_function
  • a parenthesized SELECT statement

Both expressions can be aliased with a <table_alias>

Examples

SELECT examples

These examples demonstrate SELECT statements that use different clauses, expressions, functions and operators.

Statement Target table SELECT clauses Expressions Functions Operators
SELECT FROM cosvec-target cosvec-target     COSINE DISTANCE  
SELECT FROM csv-target csv-target * COUNT()
* DISTINCT
     
SELECT FROM ndjson-target parquet-target * WHERE
* LIKE
* ORDER BY
     
SELECT FROM orc-target orc-target        
SELECT FROM parquet-target parquet-target * TOP()
* WHERE
    BETWEEN
SELECT FROM tan-target tan-target     TANIMOTO_COEFFICIENT()  
SELECT FROM tuple-set-target parquet-target     DATETIMENAME()  

SELECT int function

SELECT MIN(fld) FROM tbl
SELECT MAX(fld) FROM tbl
SELECT SUM(fld) FROM tbl
SELECT AVG(fld) FROM tbl
SELECT MIN(fld) FROM tbl WHERE fld = 1
SELECT MAX(fld) FROM tbl WHERE fld = 1
SELECT SUM(fld) FROM tbl WHERE fld = 1
SELECT AVG(fld) FROM tbl WHERE fld = 1

GROUP BY

SELECT fld, COUNT(*) FROM tbl GROUP BY fld

SELECT fld1, fld2, COUNT(*) FROM tbl GROUP BY fld1, fld2

SELECT fld1, fld2, COUNT(*) FROM tbl GROUP BY fld1, fld2 LIMIT 1

SELECT fld1, fld2, COUNT(*) FROM tbl WHERE fld1 = 1 GROUP BY fld1, fld2

SELECT fld1, COUNT(*) FROM tbl GROUP BY fld1 having COUNT(*) > 1

SELECT fld1, fld2, SUM(fld3) FROM tbl WHERE fld1 = 1 GROUP BY fld1, fld2

SELECT fld1, fld2, SUM(fld3) FROM tbl WHERE fld1 = 1 GROUP BY fld1, fld2 HAVING count(*) > 1

SELECT fld, COUNT(fld) FROM tbl GROUP BY fld

SELECT fld1, COUNT(fld1) FROM tbl WHERE fld2=1 GROUP BY fld1

GROUP BY with STRINGSET

This query that counts combinations of values from the segments table.

This query can also be run using the FLATTEN() hint

create table segments  
    (_id id, segment stringset);  

insert into segments(_id, segment)  
    values (1, ['RED', 'BLUE', 'GREEN']),
      (2, ['GREEN']),
      (3, ['RED', 'BLUE', 'GREEN']);
SELECT COUNT(*) as howmany, segment FROM segments
GROUP BY segment;

 howmany | segment
-----+--------------------------
   2 | ['RED', 'BLUE', 'GREEN']
   1 | ['GREEN']

This query can also be performed using the flatten hint

SELECT statement with wildcard

SELECT * FROM services WHERE servicelist LIKE '%free%';

SELECT with cosine_distance

SELECT with two RANGEQ() timestamps

SELECT _id, segment FROM segments WHERE RANGEQ(segment, 1889763885,2000000000);

+-----+---------------+
| _id | segment       |
+-----+---------------+
|   1 | NULL          |
+-----+---------------+
|   2 | GREEN         |
+-----+---------------+
|   3 | NULL          |
+-----+---------------+

SELECT with one RANGEQ() timestamp

select _id, segment from segments where rangeq(segment, null ,2000000000)

+-----+---------------+
| _id | segment       |
+-----+---------------+
|   1 | GREEN,YELLOW  |
+-----+---------------+
|   2 | GREEN         |
+-----+---------------+
|   3 | RED           |
+-----+---------------+

Source table tuple-demo

CREATE TABLE IF NOT EXISTS tuple-demo (
    _id STRING,
    time_col TIMESTAMP,
    stringset_col STRINGSET,
    idset_col IDSET,
    stringsetq_col STRINGSETQ timequantum 'YMD',
    idesetq_col IDSETQ timequantum 'Y'
);

SELECT statement with TUPLE()

SELECT _id, TUPLE(time_col, stringset_col, idset_col)
  AS tup
  FROM tuple-demo;

_id | tup
----+-------------------------------------
  A | (2013-07-15T01:18:46Z,stringset1, 1)
  B | (2014-07-15T01:18:46Z,stringset2, 2)

Further information