
fbsql loader TOML configuration

The fbsql loader command relies on an appropriately formatted TOML configuration file that contains:

  • The FeatureBase target table to insert data into
  • Connection settings for an Apache Impala, Apache Kafka, or PostgreSQL data source
  • An optional series of key/value pairs that correspond to target table columns

Before you begin

TOML configuration syntax

# Kafka keys
hosts = ["<address:port>",...]
group = "<kafka-confluent-group>"
topics = "<kafka-confluent-topics>"

# Impala and PostgreSQL connection keys
driver = "<datasource-type>"
connection-string = "<datasource-type>://<datasource-connection-string>"

# Data keys
table = "<target-table>"
query = "<select-from-impala-or-postgresql-data-source>"

# Ingest batching keys
batch-size = <integer-value>
batch-max-staleness = "<integer-value><time-unit>"
timeout = "<integer-value><time-unit>"

# Optional target table keys
[[fields]]
name = "<target-table-column>"
source-type = "<target-table-column-data-type>"
source-column = "<target-table-column>"
[primary-key = "true"]
[source-path = ["<kafka-json-parent-key>", "<json-child-key>"]]
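
For reference, a complete configuration for a hypothetical Kafka source might look like the following sketch. The host address, group, topic, table, and column names are all placeholders.

# Kafka source (placeholder values)
hosts = ["localhost:9092"]
group = "fbsql-loader-group"
topics = "segments-topic"

# Target table and batching
table = "segments"
batch-size = 10000
timeout = "1s"

# Column mappings
[[fields]]
name = "_id"
source-type = "string"
source-column = "id"
primary-key = "true"

[[fields]]
name = "size"
source-type = "string"
source-column = "size"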

Kafka keys

| Key | Description | Required | Additional information |
|-----|-------------|----------|-------------------------|
| hosts | One or more Kafka Confluent consumer hosts. Use [] for multiple hosts. | | Apache Kafka Confluent Hosts documentation |
| group | Kafka consumer group | | Apache Kafka Confluent Hosts documentation |
| topics | One or more Kafka topics | Yes | Apache Kafka Confluent Hosts documentation |

Impala and PostgreSQL connection keys

| Key | Description | Required | Additional information |
|-----|-------------|----------|-------------------------|
| driver | Driver required for data source | Impala or PostgreSQL | |
| connection-string | Quoted connection string that includes the data source type | Impala or PostgreSQL | Data source connection strings |
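
For example, connection keys for a PostgreSQL data source might look like the sketch below; the driver value, host, port, database, and credentials are placeholders, and the supported formats are described under Data source connection strings. An Impala source is configured the same way with an Impala driver and connection string.

# PostgreSQL connection (placeholder values)
driver = "postgres"
connection-string = "postgres://user:password@localhost:5432/mydb"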

Data keys

| Key | Description | Required | Additional information |
|-----|-------------|----------|-------------------------|
| table | Double-quoted target table to insert data into | Yes | CREATE TABLE statement |
| query | Valid SQL query to SELECT data from the data source for insertion into the target table | Impala or PostgreSQL | |
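
Continuing the PostgreSQL example, the data keys might be declared as follows; the table and column names are placeholders.

table = "segments"
query = "SELECT id, segment, size FROM public.segments WHERE active = true"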

Ingest batching keys

Data is collected into batches before it is imported to FeatureBase. Default values are used if batching keys are not supplied.

| Key | Description | Required | Default | Additional information |
|-----|-------------|----------|---------|-------------------------|
| batch-size | Integer value representing the maximum size of a batch of data to import | Yes | 1 | Batch keys |
| batch-max-staleness | Maximum length of time the oldest record in a batch can exist before the batch is flushed | Kafka | | Batch keys |
| timeout | Time to wait before a batch is flushed | Kafka | "1s" | Batch keys |
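
As a sketch, the following values (illustrative only) flush a batch when it reaches the configured batch size, when its oldest record is 30 seconds old, or when the timeout elapses:

# maximum size of a batch before it is imported
batch-size = 50000
# flush when the oldest record in the batch is this old
batch-max-staleness = "30s"
# flush after waiting this long
timeout = "5s"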

Optional target table keys

Run SHOW CREATE TABLE <tablename> to output the column names and data types required for [[fields]] key/values.

If [[fields]] key/values are not supplied, FeatureBase derives them from the columns of the table specified by the table key.

| Key | Description | Required | Additional information |
|-----|-------------|----------|-------------------------|
| name | Target column name | Yes | |
| source-type | Target column data type | Yes | FeatureBase data types, Configuring Record Time for Time Quantum fields |
| source-path | Parent and child keys of a nested JSON object | Kafka | Defaults to the name value when not supplied |
| source-column | Target column name | Optional | When omitted, the order of [[fields]] key/values is correlated to the columns in <target-table> |
| primary-key | Set to "true" for the FeatureBase _id column | Only for the _id column | Omit for other columns |
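
As an example, the [[fields]] entries below map a hypothetical Kafka message such as {"id": "abc123", "attrs": {"color": "red"}} to _id and color columns; the names and types are placeholders.

[[fields]]
name = "_id"
source-type = "string"
source-column = "id"
primary-key = "true"

[[fields]]
name = "color"
source-type = "string"
source-column = "color"
# parent key, then child key of the nested JSON object
source-path = ["attrs", "color"]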

source-type

Specifies the FeatureBase column type the incoming data will be formatted as. For example, if a Kafka message contains "foo":"6", the configuration for foo should contain source-type = "string", even if the foo column in FeatureBase is an INT type. If a source-type is not provided, it defaults to the FeatureBase field's type.
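
For instance, a field for the "foo":"6" message above might be declared like this, with the column name taken from that example:

[[fields]]
name = "foo"
# the incoming value "6" arrives as a JSON string, even though the foo column in FeatureBase is an INT
source-type = "string"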

Configuring Record Time for Time Quantum fields

To load data into set type columns with the TIMEQUANTUM option, the loader configuration should define the mapped fields with source-type set to StringSetQ or IDSetQ. An optional timestamp field can also be defined in the loader configuration to specify a record time for these time quantum set fields; this special timestamp field must have source-type set to recordTime. The loader automatically applies the timestamp value from the record time field to all time quantum set fields found in the record. If the optional record time field is not configured, or data for that field is not available, the time quantum fields are loaded without a record time and are visible to all time ranges without restriction.
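
A minimal sketch, assuming a set column named events created with a TIMEQUANTUM option and a source timestamp value named event_ts (both names are placeholders):

[[fields]]
name = "events"
# time quantum set column
source-type = "StringSetQ"

[[fields]]
name = "event_ts"
# record time applied to all time quantum set fields in the record
source-type = "recordTime"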

Additional information

Impala and PostgreSQL connection strings

Batch keys

  • The batch-size value directly affects import speed and resource usage: larger batches import faster but use more memory.
  • batch-max-staleness values may result in timeouts while waiting for a data source.
  • timeout can be set to "0s" to disable it.

Batch key time-unit

Batch keys that require <integer-value><time-unit> can combine one or more of the following time units, written in descending order.

| Time unit | Declaration | Example |
|-----------|-------------|---------|
| hour | h | 24h30m |
| minute | m | 30m45s |
| second | s | 45s10ms |
| millisecond | ms | 10ms22us |
| microsecond | us | 22us28ns |
| nanosecond | ns | 28ns |
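
For example, combining units in descending order, a staleness limit of one minute thirty seconds and a timeout of a quarter of a second could be written as:

batch-max-staleness = "1m30s"
timeout = "250ms"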

Further information