Number of records to read before indexing them as a batch. A larger value gives better throughput at the cost of higher memory usage. Recommended: 1,048,576
1
--concurrency
int
Number of concurrent sources and indexing routines to launch. Not supported with SQL ingestion or --auto-generate.
1
When ingesting multiple CSV files
--featurebase-hosts
string
Comma-separated list of host:port pairs for the FeatureBase default bind points.
[localhost:10101]
--index
string
Name of target FeatureBase index.
Yes
--string-array-separator
string
Character used to delimit values in a string array.
,
--use-shard-transactional-endpoint
Use the alternate import endpoint, which ingests data for all fields in a shard in a single atomic request. It has no negative performance impact and gives better consistency.
Recommended.
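As an illustration of how the flags above combine, here is a minimal Python sketch that assembles them into an argument list. The executable name `csv-ingester` and the index name `example_index` are hypothetical placeholders that do not come from this reference. Only the flag names and the default host are taken from the table.

```python
def build_common_args(index, hosts=("localhost:10101",), concurrency=1,
                      shard_transactional=True):
    """Assemble the common ingest flags as an argv-style list."""
    if not index:
        raise ValueError("--index is required")
    if concurrency < 1:
        raise ValueError("--concurrency must be at least 1")
    args = [
        "--index", index,
        # host:port pairs are joined with commas, as the table describes
        "--featurebase-hosts", ",".join(hosts),
        "--concurrency", str(concurrency),
    ]
    if shard_transactional:
        # boolean flag: present or absent, no value
        args.append("--use-shard-transactional-endpoint")
    return args


INGESTER = "csv-ingester"  # hypothetical executable name (assumption)

command = [INGESTER] + build_common_args(
    "example_index",  # hypothetical index name
    hosts=("localhost:10101", "localhost:10102"),
    concurrency=2,
)
print(" ".join(command))
```

Building the list first, rather than a single shell string, keeps each value as its own argument and avoids quoting problems if the list is later passed to `subprocess.run`.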
CSV ingest flags
Flag
Data type
Description
Default
Required
--files
string
List of files, URLs, or directories to ingest. CSV files can be gzipped.
[]
Yes
--header
string
Column names and data types, defined as {source_column_name}[__data_type[_constraint-value...]],...
[]
If data_type and constraint-value are not defined in the data file.
--ignore-header
Ignore the header in the file and use the --header flag to define column names and data types.
When using --header flag
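The `--header` grammar above can be sketched as a small formatter. This is an illustration only: the column names and the data type names used here (`ID`, `String`, `Decimal`) and the constraint value `2` are assumptions chosen for the example, so check them against the data types your FeatureBase version supports.

```python
def header_column(name, data_type=None, constraints=()):
    """Format one column as {source_column_name}[__data_type[_constraint-value...]]."""
    if not name:
        raise ValueError("source column name is required")
    if constraints and data_type is None:
        raise ValueError("constraint values require a data type")
    spec = name
    if data_type is not None:
        spec += "__" + data_type          # double underscore before the data type
        for value in constraints:
            spec += "_" + str(value)      # single underscore before each constraint
    return spec


def header_value(columns):
    """Join column specs with commas to form the --header value."""
    return ",".join(header_column(*col) for col in columns)


# Hypothetical columns; the type names are illustrative assumptions.
columns = [("id", "ID"), ("name", "String"), ("price", "Decimal", (2,))]
print(header_value(columns))
```

With these inputs the sketch prints `id__ID,name__String,price__Decimal_2`, which would then be passed as the value of `--header`, together with `--ignore-header` if the file already has its own header row.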
Generate ID flags
Flag
Data type
Description
Default
Required
--auto-generate
Automatically generate IDs. Used for testing purposes. Cannot be used with --concurrency.
When neither --id-field nor --primary-key-fields is defined
--external-generate
Allocate _id using the FeatureBase ID allocator. Supports --offset-mode. Requires --auto-generate.
--id-alloc-key-prefix
string
Prefix for ID allocator keys when using --external-generate. Each concurrent ingester requires a different value.
ingest
--id-field
string
A sequence of positive integers that uniquely identifies each record. Use instead of --primary-key-fields.
If neither --auto-generate nor --primary-key-fields is defined
--primary-key-fields
string
Convert records to strings for use as the unique _id. Single records are not added to the target as records. Multiple records are concatenated using / and added to the target as records. Use instead of --id-field.
[]
If neither --auto-generate nor --id-field is defined
--offset-mode
Use offset-mode-based autogenerated IDs. Requires --auto-generate and --external-generate.
When ingesting from an offset-based data source
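The dependencies between the ID flags are easy to get wrong, so a pre-flight check can help. The sketch below encodes only the rules stated in the table. The function name and messages are hypothetical, and it assumes that "cannot be used with --concurrency" means a concurrency value greater than 1.

```python
def check_id_flags(auto_generate=False, external_generate=False,
                   offset_mode=False, id_field=None,
                   primary_key_fields=(), concurrency=1):
    """Return a list of problems with the chosen ID flags (empty if none)."""
    problems = []
    # One ID source is needed: generated IDs, an ID field, or primary keys.
    if not (auto_generate or id_field or primary_key_fields):
        problems.append(
            "set one of --auto-generate, --id-field, --primary-key-fields")
    # The table says to use one of these instead of the other.
    if id_field and primary_key_fields:
        problems.append("use --id-field or --primary-key-fields, not both")
    if external_generate and not auto_generate:
        problems.append("--external-generate requires --auto-generate")
    if offset_mode and not (auto_generate and external_generate):
        problems.append(
            "--offset-mode requires --auto-generate and --external-generate")
    # Assumption: the restriction applies when concurrency is above 1.
    if auto_generate and concurrency > 1:
        problems.append("--auto-generate cannot be used with --concurrency")
    return problems


for problem in check_id_flags(auto_generate=True, offset_mode=True, concurrency=4):
    print(problem)
```

Returning every problem at once, instead of raising on the first one, lets a wrapper script report all flag conflicts in a single run.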
Error handling flags
Flag
Data type
Description
Default
Required
--allow-decimal-out-of-range
Allow ingest to continue when it encounters out-of-range decimals in decimal fields.
false
--allow-int-out-of-range
Allow ingest to continue when it encounters out-of-range integers in int fields.
false
--allow-timestamp-out-of-range
Allow ingest to continue when it encounters out-of-range timestamps in timestamp fields.
false
--batch-max-staleness
duration
Maximum length of time that the oldest record in a batch can exist before the batch is flushed. This may result in timeouts while waiting for the source.
--commit-timeout
duration
A commit is the process of informing the data source that the current batch of records has been ingested. --commit-timeout is the maximum time before the commit process is cancelled. May not function for the CSV ingest process.
--skip-bad-rows
int
Fail the ingest process if n rows are not processed.