Resizing Your FeatureBase Cluster
The cluster backup & restore tooling provides a simple way to extract data from a cluster and restore that data to a new cluster—even if the new cluster is a different size. We can use this process to migrate from a smaller cluster to a larger one in a relatively short amount of time.
Data in FeatureBase is typically historical data fed in via bulk ingest or streamed in via Kafka or another data pipeline. As such, the latency requirements are looser than if FeatureBase were a system of record. Migration can be performed in several steps:
- Start a new FeatureBase cluster.
- Stop ingestion via a bulk ingester.
- Back up data from the original FeatureBase cluster.
- Restore data to the new FeatureBase cluster.
- Redirect traffic to the new FeatureBase cluster.
- Restart ingestion against the new FeatureBase cluster.
- Shut down the original FeatureBase cluster.
First, create a new cluster of the desired size. The configuration from the old cluster can be reused by changing:
- etcd.listen-client-address to the node’s new network address
- etcd.listen-peer-address to the node’s new network address
- etcd.initial-cluster to use the new network addresses and add additional nodes
Additionally, if the replication factor needs to be changed, the cluster.replicas setting should be updated now.
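When the new cluster has many nodes, the etcd.initial-cluster value can be generated rather than typed by hand. The following is a minimal sketch, not the official procedure. It assumes the value is a comma-separated list of name=peer-URL pairs, as etcd itself uses. The helper build_initial_cluster, the node names (fb1, fb2, ...), the IP addresses, and the peer port 10301 are all illustrative assumptions to be checked against your own configuration.

```shell
#!/bin/sh
# Sketch: build a candidate value for etcd.initial-cluster from a list of hosts.
# ASSUMPTIONS (not from the original text): the name=http://host:port format,
# the fbN node names, and the peer port are placeholders for illustration.
set -eu

PEER_PORT="${PEER_PORT:-10301}"

# build_initial_cluster HOST...
# Prints one comma-separated line with an entry per host.
build_initial_cluster() {
    i=1
    result=""
    for host in "$@"; do
        entry="fb${i}=http://${host}:${PEER_PORT}"
        if [ -z "$result" ]; then
            result="$entry"
        else
            result="${result},${entry}"
        fi
        i=$((i + 1))
    done
    printf '%s\n' "$result"
}

# Example with three hypothetical node addresses.
build_initial_cluster 10.0.0.11 10.0.0.12 10.0.0.13
```

Every node in the new cluster should receive the same generated value, while the two listen address settings differ per node.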
If data is written during/after a backup, it will not end up on the new cluster.
In order to ensure that all data are preserved, shut down the writing processes for the remainder of the migration.
If using an ingest consumer (e.g. molecula-consumer-kafka), this should be accomplished by completely shutting down the process. Do not attempt to stall the consumer by creating an exclusive transaction.
The featurebase command line tool contains a backup subcommand for executing a backup against a cluster:
featurebase backup --host featurebase:10101 -o /path/to/backup/ --concurrency 4 # and TLS config
The --concurrency flag is not required, but setting it to an appropriate number will improve backup speed.
It is possible to speed the process up more by disabling sync of the backup data:
featurebase backup --host featurebase:10101 -o /path/to/backup/ --no-sync
This will allow the backup to complete without waiting for the operating system to move the data to persistent storage. If the machine running the backup loses power, you may lose some (or all) of the backup data.
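Because an interrupted backup can leave some or all of the data missing, it may be worth running a basic sanity check before moving on. The sketch below only confirms that the output directory exists and contains at least one entry. It does not validate the backup's contents. The function name backup_looks_populated is hypothetical, and the path is the placeholder used in the commands above.

```shell
#!/bin/sh
# Sketch: refuse to continue if the backup directory is missing or empty.
# This is a coarse check only; it cannot detect a partially written backup.
set -eu

# backup_looks_populated DIR
# Succeeds when DIR is a directory with at least one entry in it.
backup_looks_populated() {
    dir="$1"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example: the same placeholder path as the backup command above.
if backup_looks_populated /path/to/backup/; then
    echo "backup directory has content; continuing"
else
    echo "backup directory is missing or empty; rerun the backup" >&2
fi
```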
The featurebase command line tool has an accompanying restore subcommand that restores a directory created by the backup subcommand:
featurebase restore --host newfeaturebase:10101 -s /path/to/backup/ --concurrency 4
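The backup and restore commands above can be chained in a small wrapper so that the restore runs only if the backup succeeded. This is a sketch under stated assumptions: the helpers run and migrate are hypothetical, the host names and path are the placeholders from the examples above, and TLS options are omitted. DRY_RUN defaults to 1, so the script only prints the commands until it is explicitly set to 0.

```shell
#!/bin/sh
# Sketch: back up the old cluster, then restore into the new one.
# Only the featurebase flags (--host, -o, -s, --concurrency) come from the
# text above; the wrapper functions are illustrative.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# run CMD ARGS...
# Prints the command in dry-run mode, otherwise executes it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# migrate OLD_HOST NEW_HOST BACKUP_DIR [CONCURRENCY]
# With set -e, a failed backup aborts the script before the restore starts.
migrate() {
    old_host="$1"
    new_host="$2"
    backup_dir="$3"
    concurrency="${4:-4}"

    run featurebase backup --host "$old_host" -o "$backup_dir" --concurrency "$concurrency"
    run featurebase restore --host "$new_host" -s "$backup_dir" --concurrency "$concurrency"
}

migrate featurebase:10101 newfeaturebase:10101 /path/to/backup/ 4
```

Run it once in the default dry-run mode to review the commands, then rerun with DRY_RUN=0 after ingestion has been stopped.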
Once the data have been moved over to the new cluster, it is safe to redirect query traffic. Change the targets of any load balancers, and update configurations for services which may be pointed manually to a node.
Once the new system is up and running, the ingester configurations can be updated to point to the new cluster. They can then be started back up to import new data.
Once the new cluster is running and appears to be functioning properly, the old cluster may be shut down and deleted.
© 2022 Molecula Corp. (DBA FeatureBase). All rights reserved.