Define a datasource with fbsql loaders
The fbsql-loader command can be run from the CLI to:
- read data from a specified Impala, Kafka or Postgres data source
- BULK INSERT this data to an existing FeatureBase database and table.
Before you begin
- Learn about SQL BULK INSERT
- Learn about “docopt” notation standards used in this guide
- Learn about fbsql
- Create a FeatureBase database
Syntax
fbsql
<db-connection-string> \
(--loader-(impala|kafka|postgres)) filename.toml
Arguments
Argument | Description | Additional information |
---|---|---|
<db-connection-string> | fbsql connection string to FeatureBase database | * fbsql connect to FeatureBase Cloud * fbsql connect to FeatureBase Community |
--loader-impala | Designate a configuration file containing credentials for the Impala database that FeatureBase will read from. | Load Impala Data With fbsql |
--loader-kafka | Designate a configuration file containing Kafka Avro JSON files | Load Kafka Data With fbsql |
--loader-postgres | Run fbsql in non-interactive mode to load data from PostgreSQL. | Load PostgreSQL Data With fbsql |
Source Independent Configuration Options
The configuration file must be in TOML format.
General
The table below holds the key/value pairs supported in the TOML file independent of the source you want to connect to:
Key | Description | Example Value | Default |
---|---|---|---|
table | The name of the FeatureBase table into which data, consumed by fbsql, will be written. The table must exist prior to running fbsql. | "tablename" | |
batch-size | The size of the BULK INSERT batches sent to FeatureBase. The ideal value depends on the data model, available resources, and target load rates. In general, larger values increase the rate at which data is loaded but use more resources. | 100000 | 1 |
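As a minimal sketch, the two source-independent keys above could appear at the top of the configuration file as follows. The values are the example values from the table; any source-specific keys (connection details for Impala, Kafka or PostgreSQL) are omitted here.

```toml
# Target FeatureBase table; it must already exist before fbsql runs.
table = "tablename"

# Number of rows per BULK INSERT batch (the default is 1).
batch-size = 100000
```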
Fields
Providing field configuration in the TOML configuration file is optional. If no fields are provided, fbsql will try to map each source data field to a FeatureBase column in the table specified in the configuration file.
Fields are specified as a TOML array of tables. Each source data field needs its own entry in the file.
The table below holds the key/value pairs supported for each field entry, independent of the source you want to connect to:
Key | Description | Example Value | Default |
---|---|---|---|
name | Specifies the name of the FeatureBase column into which data will be written. | col_name | |
source-type | Specifies the FeatureBase column type the incoming data will be formatted as. For example, if a Kafka message contains "foo":"6", the configuration for foo should contain source-type = "string" even if the foo column in FeatureBase is an Int type. If a source-type is not provided, it defaults to the FeatureBase field’s type. | "idset" | FeatureBase Column Type |
primary-key | Exactly one field of the source data should be set as the primary key. The name of the field designated the primary key does not need to map to a column in FeatureBase. | true | false |
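The field keys above might be combined as in the sketch below. The array name `fields` and the field names `user_id` and `foo` are assumptions for illustration; only the keys `name`, `source-type` and `primary-key` come from the table.

```toml
# Each [[fields]] block is one entry in a TOML array of tables.
# The array name "fields" is assumed here, not confirmed by this guide.

[[fields]]
name = "user_id"        # hypothetical name; the primary key need not map to a FeatureBase column
source-type = "string"
primary-key = true      # exactly one field should be set as the primary key

[[fields]]
name = "foo"            # FeatureBase column that receives the data
source-type = "string"  # the source value arrives as a string, e.g. "foo":"6"

[[fields]]
name = "col_name"
source-type = "idset"   # primary-key is omitted, so it defaults to false
```

Leaving out `source-type` on an entry would fall back to the type of the matching FeatureBase column, as described in the table.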
Possible source-type values are: