Skip to main content Link Menu Expand (external link) Document Search Copy Copied

How do I evaluate FeatureBase sample data?

The sample database is provided for users and organizations evaluating FeatureBase at scale. It contains one billion rows with attributes commonly used for segmentation queries.

Learn how fast the FeatureBase database is by running queries against the cseg (customer segmentation) and skills tables

Before you begin

Table statements

Run the SHOW CREATE TABLE statement to output CREATE TABLE statements for both tables.

SHOW CREATE TABLE cseg;
SHOW CREATE TABLE skills;

What are IDSET and STRINGSET data types?

The cseg table has multiple columns assigned to IDSET and STRINGSET data types. These data types enable FeatureBase to store low-cardinality data (1:many relationships) in a single column without needing to rely on traditional data models such as the star schema.

SQL queries

The following queries demonstrate sub-second latency in the FeatureBase database

Queries

Data Exploration 

Prove the sample database contains 1000025000 rows, the majority contained in cseg.

SELECT COUNT(*) FROM cseg;
SELECT COUNT(*) FROM skills;

Output the top 10 rows in each table:

SELECT TOP(10) * FROM cseg;
SELECT TOP(10) * FROM skills;

Complex Segmentation 

SELECT SUM(income) FROM cseg
WHERE income > 5000 AND age = 45 AND (SETCONTAINSANY(skills,['Ms Office','Excel']));

Aggregations

SELECT SUM(income) FROM cseg;
SELECT SUM(income) FROM cseg where income > 5000;
SELECT AVG(income) FROM cseg;

Grouping with Complex Conditions

SELECT hobbies, COUNT(*) as cnt
FROM cseg
WITH (flatten(hobbies))
GROUP BY hobbies
HAVING COUNT(*) > 200000000
ORDER BY cnt DESC;
SELECT education, SUM(income)
FROM cseg
WITH (flatten(education))
WHERE age=18
GROUP BY education;

JOINS

[cseg]Count(Intersect(
Row(hobbies="Teaching"),
Distinct(Row(bools='available_for_hire'), field= id, index=skills)))

Top K 

[cseg]TopK(hobbies, k=5)
[cseg]TopK(hobbies, k=10, filter=Intersect(Row(sex=Female),Row(hobbies='Scuba Diving')))

Next step