Backup & Recovery¶

FoundationDB provides robust backup, restore, and disaster recovery capabilities. This guide covers continuous backup to blob storage, point-in-time recovery, and cross-datacenter replication.

Overview¶

FoundationDB's backup system offers:

Feature	Description
Continuous Backup	Stream changes to backup destination with minimal overhead
Point-in-Time Recovery	Restore to any version within the backup window
Disaster Recovery	Real-time replication to a standby cluster
Backup Tags	Run multiple independent backups simultaneously
Disk Snapshot Backup	Block-level point-in-time backup using filesystem/EBS snapshots; high-throughput restore, no continuous backup

Components

fdbbackup - CLI for managing backups
fdbrestore - CLI for restoring from backups
backup_agent - Background process that performs backup operations
fdbdr - CLI for disaster recovery management
dr_agent - Background process for DR replication
fdbcli snapshot - Block-level disk snapshot backup orchestrator (covered below)

Backup System

The backup system streams mutations from transaction logs to your backup destination with minimal overhead.

Backup V3 (In Development)

Backup V3 extends Backup V2 by partitioning the mutation log along two dimensions instead of one: by log-router tag (already in V2) and additionally by user key range (new in V3). At restore time this lets the restore job process distinct key ranges in parallel, dramatically reducing restore time for large datasets.

A new CLI option, --mutation-log-type (with values DEFAULT and PARTITIONED_LOG), selects the log format when starting a backup (PR #13127).

Backup V3 is gated under the 8.0 protocol version and is not available in 7.3 or 7.4 (PR #13225). It will only ship once a release-8.0 branch is cut and tagged.

Note: An earlier "parallel restore" feature was a separate, prior attempt at fast restore that was removed from the codebase in PR #12903. Backup V3 is a ground-up replacement, not a continuation of that work.

If you need fast restore today and can tolerate non-continuous backup, see Disk Snapshot Backup below — it is used in production by some large operators.

Backup Architecture¶

Backup agents run as separate processes that read mutation logs from the database and write them to a backup destination. Multiple agents can run for redundancy and performance.

graph LR
    subgraph "FoundationDB Cluster"
        TLog[Transaction Logs<br/>Mutations]
        SS[Storage Servers]
    end

    subgraph "Backup System"
        BA1[Backup Agent 1]
        BA2[Backup Agent 2]
    end

    subgraph "Backup Storage"
        S3[(S3 / Blob Store)]
        FS[(Filesystem)]
    end

    TLog --> BA1
    TLog --> BA2
    BA1 --> S3
    BA2 --> S3
    BA1 -.-> FS
    BA2 -.-> FS

    style S3 fill:#ff9800,color:#000
    style FS fill:#4caf50,color:#fff

Backup Destinations¶

URL Formats¶

Type	Format	Example
Local filesystem	`file://PATH`	`file:///mnt/backup/fdb`
Blob store (S3-compatible)	`blobstore://HOST/RESOURCE`	`blobstore://s3.amazonaws.com:443/bucket/backup`

Filesystem Backup¶

Bash

fdbbackup start -d file:///mnt/backup/fdb

Requirements:

Path must be accessible to all backup agents
Use a network filesystem (NFS, EFS) for multi-machine setups
Ensure sufficient disk space for backup history

Blob Store Backup¶

Multi-Cloud Backup Status

The blob store backup URL scheme (blobstore://) is primarily tested with AWS S3. Be aware of the following:

Azure Blob Storage — Support exists in the codebase but is believed to be broken in current versions. Use with caution and test thoroughly.
Google Cloud Storage (GCS) — Community support is being contributed by Palantir. Check the latest release notes for availability.

Bash

fdbbackup start -d "blobstore://s3.amazonaws.com:443/my-bucket/fdb-backup?bucket=my-bucket&region=us-east-1"

Blob Store URL Parameters¶

Parameter	Description	Default
`bucket`	Bucket name (if not in path)	-
`region`	AWS region	`us-east-1`
`sc`	Storage class (`standard`, `standard_ia`, etc.)	`standard`
`sdk_auth`	Use AWS SDK auth chain	`false`
`secure_connection`	Use HTTPS	`1`
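
The parameters in the table above are ordinary URL query parameters, so a backup URL can be assembled programmatically. The following is a minimal sketch; `blobstore_url` is a hypothetical helper, not part of FoundationDB, and it assumes the `blobstore://HOST/RESOURCE` layout shown under URL Formats.

```python
from urllib.parse import urlencode


def blobstore_url(host, resource, port=None, **params):
    """Build a blobstore:// URL (hypothetical helper, not an FDB API).

    Keyword arguments become query parameters such as bucket, region, or sc.
    """
    authority = f"{host}:{port}" if port else host
    url = f"blobstore://{authority}/{resource}"
    query = urlencode(params)  # keeps keyword order, escapes unsafe characters
    return f"{url}?{query}" if query else url


print(blobstore_url("s3.amazonaws.com", "fdb-backup", port=443,
                    bucket="my-bucket", region="us-east-1"))
```

Building the URL in one place keeps `fdbbackup start`, `describe`, and `expire` invocations pointed at exactly the same destination string.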

Blob Store Credentials¶

Create a JSON credentials file:

JSON

{
  "accounts": {
    "s3.amazonaws.com": {
      "api_key": "AKIAIOSFODNN7EXAMPLE",
      "secret": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }
  }
}

Set via environment variable:

Bash

export FDB_BLOB_CREDENTIALS=/path/to/credentials.json

Or use AWS SDK authentication:

Bash

fdbbackup start -d "blobstore://s3.amazonaws.com/bucket?sdk_auth=true"
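
When a credentials file is used, it holds a long-lived secret and should not be world-readable. The sketch below writes the JSON layout shown earlier with owner-only permissions; `write_blob_credentials` is a hypothetical helper, and the 0600 mode is a suggested hardening step rather than something FoundationDB requires.

```python
import json
import os


def write_blob_credentials(path, host, api_key, secret):
    """Write a blob credentials file in the layout shown above (hypothetical helper)."""
    doc = {"accounts": {host: {"api_key": api_key, "secret": secret}}}
    # O_EXCL refuses to overwrite an existing file; 0o600 limits access to the owner.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(doc, f, indent=2)
    return doc
```

Point `FDB_BLOB_CREDENTIALS` (or `--blob-credentials` on the agent) at the resulting path.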

Starting a Backup¶

Basic Backup¶

Bash

fdbbackup start -C /etc/foundationdb/fdb.cluster -d file:///backup/fdb

Backup with Snapshot¶

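Every backup begins with a full snapshot of the selected key range. To keep the backup running after that snapshot completes, so that later versions remain restorable, add -z:

Bash

fdbbackup start -d file:///backup/fdb -z

The -z flag (long form --no-stop-when-done) keeps the backup running continuously instead of stopping once the initial snapshot is restorable.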

Backup with Tags¶

Run multiple independent backups:

Bash

# Primary backup to S3
fdbbackup start -t primary -d "blobstore://s3.amazonaws.com/prod-backup"

# Secondary backup to local storage
fdbbackup start -t local -d file:///mnt/backup/fdb

Backup with Key Range¶

Back up only specific key ranges:

Bash

# Begin and end keys go in a single quoted argument, separated by a space
fdbbackup start -d file:///backup/fdb -k '\x00 \xff'

Managing Backups¶

Check Backup Status¶

Bash

fdbbackup status -C /etc/foundationdb/fdb.cluster

Or for a specific tag:

Bash

fdbbackup status -t mytag

Example output:

Text Only

Using cluster file `/etc/foundationdb/fdb.cluster'.

The backup on tag `default' is restorable but continuing to
 blobstore://s3.amazonaws.com:443/my-bucket (last completed log version is 12345678).

Backup is restorable to version 12345678 (approximately 5 seconds old).
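
For scripting, the restorable version and its age can be pulled out of this text. The sketch below assumes the status line keeps exactly the wording of the example output above; the helper name `parse_restorable` is illustrative only.

```python
import re

# Pattern taken from the example output above; the exact wording is an assumption.
RESTORABLE_RE = re.compile(
    r"restorable to version (\d+) \(approximately ([\d.]+) seconds old\)"
)


def parse_restorable(status_text):
    """Return (version, age_seconds), or None if no restorable line is present."""
    match = RESTORABLE_RE.search(status_text)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


sample = "Backup is restorable to version 12345678 (approximately 5 seconds old)."
print(parse_restorable(sample))
```

Parsing free-form text is brittle; where a machine-readable status is available, prefer that for alerting.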

List Backups¶

Bash

fdbbackup list -b file:///backup/fdb

Pause/Resume Backup¶

Bash

# Pause (backup continues but range log writing pauses)
fdbbackup modify -t default --active-snapshot-interval 0

# Resume
fdbbackup modify -t default --active-snapshot-interval 86400

Stop Backup¶

Bash

# Stop gracefully (keeps backup restorable)
fdbbackup discontinue -t default

# Abort immediately
fdbbackup abort -t default

Wait for Backup¶

Wait for backup to complete or reach a restorable state:

Bash

fdbbackup wait -t default

fdbbackup Command Reference¶

Subcommand	Description
`start`	Start a new backup
`modify`	Modify backup parameters
`status`	Show backup status
`abort`	Stop backup immediately
`discontinue`	Stop backup gracefully
`wait`	Wait for backup completion
`describe`	Describe backup contents
`list`	List available backups
`tags`	List backup tags
`expire`	Remove old backup data
`delete`	Delete backup entirely
`cleanup`	Clean up incomplete operations

Common Options¶

Option	Description
`-C, --cluster-file`	Path to cluster file
`-d, --destcontainer`	Backup destination URL
`-t, --tag`	Backup tag (default: `default`)
`-k, --keys`	Key range to back up
`-z, --no-stop-when-done`	Continue after snapshot
`-s, --snapshot-interval`	Seconds between snapshots

Restoring from Backup¶

Basic Restore¶

Bash

fdbrestore start -r file:///backup/fdb -C /etc/foundationdb/fdb.cluster

Warning

Restore overwrites all data in the destination cluster. The cluster should be empty or you must use --remove_prefix and --add_prefix options.

Restore to Specific Version¶

Bash

fdbrestore start -r file:///backup/fdb --version 12345678

Restore to Specific Timestamp¶

Bash

# Timestamp format: YYYY/MM/DD.HH:MI:SS+hhmm
fdbrestore start -r file:///backup/fdb --timestamp "2025/02/03.10:30:00+0000"

Restore with Key Transformation¶

Restore to different key prefixes:

Bash

fdbrestore start -r file:///backup/fdb \
  --remove_prefix "prod/" \
  --add_prefix "staging/"
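
The effect of the two prefix options on an individual key can be pictured as follows. This is a conceptual sketch, not fdbrestore's implementation; in particular, raising an error for keys that lack the removed prefix is an assumption made here for illustration.

```python
def transform_key(key, remove_prefix=b"", add_prefix=b""):
    """Strip remove_prefix from key, then prepend add_prefix (conceptual model)."""
    if not key.startswith(remove_prefix):
        # Assumption for this sketch: keys outside the prefix are rejected.
        raise ValueError(f"key {key!r} does not start with {remove_prefix!r}")
    return add_prefix + key[len(remove_prefix):]


print(transform_key(b"prod/users/42", remove_prefix=b"prod/", add_prefix=b"staging/"))
```

Restoring under a new prefix is what allows a restore into a cluster that already holds data, as long as the target prefix is empty.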

Check Restore Status¶

Bash

fdbrestore status -C /etc/foundationdb/fdb.cluster

Wait for Restore¶

Bash

fdbrestore wait

Abort Restore¶

Bash

fdbrestore abort

fdbrestore Command Reference¶

Subcommand	Description
`start`	Start a restore
`abort`	Stop restore in progress
`wait`	Wait for restore completion
`status`	Show restore status

Restore Options¶

Option	Description
`-r, --source-url`	Backup source URL
`-C, --cluster-file`	Destination cluster file
`--version`	Restore to specific version
`--timestamp`	Restore to specific timestamp
`-t, --tag`	Target tag
`--remove_prefix`	Remove prefix from restored keys
`--add_prefix`	Add prefix to restored keys
`-k, --keys`	Key range to restore

Running Backup Agents¶

Backup agents perform the actual backup work. Run them on your cluster machines.

Starting Backup Agent¶

Bash

backup_agent -C /etc/foundationdb/fdb.cluster

Backup Agent in foundationdb.conf¶

INI

[backup_agent]
command = /usr/lib/foundationdb/backup_agent/backup_agent
logdir = /var/log/foundationdb

Multiple Agents¶

Run multiple backup agents for redundancy and performance. They coordinate automatically.

Agent with Blob Credentials¶

Bash

backup_agent -C /etc/foundationdb/fdb.cluster \
  --blob-credentials /path/to/credentials.json

Disaster Recovery (DR)¶

DR provides real-time replication to a standby cluster for immediate failover.

DR vs Backup¶

Feature	Backup	DR
Recovery time	Minutes to hours	Seconds
Storage	Blob/filesystem	Secondary FDB cluster
Point-in-time	Yes	Limited
Use case	Data protection	High availability

DR Architecture¶

graph LR
    subgraph "Primary Datacenter"
        P_TLog[Transaction Logs]
        P_SS[Storage Servers]
        P_Client[Clients<br/>Read/Write]
    end

    subgraph "DR Datacenter"
        DR_TLog[Transaction Logs]
        DR_SS[Storage Servers]
        DR_Client[Clients<br/>Read-Only]
    end

    DR_Agent[DR Agent]

    P_Client --> P_TLog
    P_TLog --> P_SS
    P_TLog --> DR_Agent
    DR_Agent --> DR_TLog
    DR_TLog --> DR_SS
    DR_Client -.-> DR_SS

    style P_TLog fill:#4caf50,color:#fff
    style DR_TLog fill:#ff9800,color:#000

Setting Up DR¶

Start DR on primary cluster:

Bash

fdbdr start -s /path/to/primary.cluster -d /path/to/dr.cluster

Run DR agents:

Bash

dr_agent -s /path/to/primary.cluster -d /path/to/dr.cluster

Check DR status:

Bash

fdbdr status -s /path/to/primary.cluster -d /path/to/dr.cluster

DR Switchover¶

Perform a planned switchover:

Bash

fdbdr switch -s /path/to/primary.cluster -d /path/to/dr.cluster

After switchover:

DR cluster becomes primary
Original primary becomes standby
Applications reconnect to new primary

DR Failover¶

For unplanned failover when the primary cluster is unavailable and fdbdr switch cannot be used (since it requires both clusters):

Abort the DR job to unlock the destination cluster and stop replication:

Bash

fdbdr abort -s /path/to/primary.cluster -d /path/to/dr.cluster

After the abort, the DR cluster retains a consistent snapshot of the source database from some point in the past. The cluster is unlocked and becomes writable.

Redirect applications to the DR cluster by updating their cluster files to point to the DR cluster.

(Optional) Set up reverse DR once the original primary is recovered, to replicate back from the now-active DR cluster:

Bash

fdbdr start -s /path/to/dr.cluster -d /path/to/primary.cluster
dr_agent -s /path/to/dr.cluster -d /path/to/primary.cluster

Potential Data Loss

Unplanned failover will lose any transactions that were committed on the primary but not yet replicated to the DR cluster. When DR is operating normally, this window is typically only a few seconds of commits.

If fdbdr abort Fails

If the DR agents and metadata are in a broken state and fdbdr abort does not work, you can force-unlock the DR cluster using fdbcli:

Bash

# Connect to the DR cluster
fdbcli -C /path/to/dr.cluster
fdbcli> unlock <UID>

Use fdbcli> lock to find the current lock UID. This unlocks the database without clearing any data.

fdbdr Command Reference¶

Subcommand	Description
`start`	Start DR replication
`switch`	Switch primary and DR
`abort`	Stop DR
`status`	Show DR status

TLS Support¶

All backup commands support TLS when connecting to TLS-enabled clusters:

Bash

fdbbackup start -d file:///backup/fdb \
  --tls-certificate-file /path/to/cert.pem \
  --tls-key-file /path/to/key.pem \
  --tls-ca-file /path/to/ca.pem

Or via environment variables:

Bash

export FDB_TLS_CERTIFICATE_FILE=/path/to/cert.pem
export FDB_TLS_KEY_FILE=/path/to/key.pem
fdbbackup start -d file:///backup/fdb

Backup Retention and Expiration¶

Expire Old Backups¶

Remove backup data older than specified version or timestamp:

Bash

# Expire by version
fdbbackup expire -d file:///backup/fdb --expire-before-version 12345678

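# Expire by timestamp (format: YYYY/MM/DD.HH:MI:SS+hhmm)
fdbbackup expire -d file:///backup/fdb --expire-before-timestamp "2025/01/01.00:00:00+0000"

# Expire, but fail if the backup would no longer be restorable from version 12345678 onward
fdbbackup expire -d file:///backup/fdb --expire-before-version 12000000 --restorable-after-version 12345678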
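
A retention job typically derives the cutoff from "now minus N days". The sketch below builds the corresponding `fdbbackup expire` arguments; `expire_args` is a hypothetical helper, and the timestamp layout `YYYY/MM/DD.HH:MI:SS+hhmm` is assumed from the upstream fdbbackup documentation, so verify it against your installed version.

```python
from datetime import datetime, timedelta, timezone

# Assumed fdbbackup DATETIME layout: YYYY/MM/DD.HH:MI:SS+hhmm
FDB_TIME_FORMAT = "%Y/%m/%d.%H:%M:%S%z"


def expire_args(dest, keep_days, now=None):
    """Build argv for expiring backup data older than keep_days (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=keep_days)
    return ["fdbbackup", "expire", "-d", dest,
            "--expire-before-timestamp", cutoff.strftime(FDB_TIME_FORMAT)]


print(expire_args("file:///backup/fdb", keep_days=7))
```

Using a timezone-aware datetime ensures the UTC offset is always present in the formatted string.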

Delete Backup Entirely¶

Bash

fdbbackup delete -d file:///backup/fdb

Disk Snapshot Backup¶

Disk snapshot backup is an alternative backup mechanism that captures a point-in-time, block-level image of every FoundationDB process's data directory by triggering block-level volume snapshots (AWS EBS, LVM, CSI VolumeSnapshot, etc.) coordinated across the cluster. Unlike fdbbackup, it does not stream a continuous mutation log to external storage; instead, it produces a single consistent disk image per role at a single FDB version.

Operators choose this approach when restore throughput from fdbbackup is the bottleneck (a snapshot restore is bounded by the speed at which volumes can be attached or copied, not by log replay), and when continuous point-in-time recovery is not required. The mechanism has been part of FoundationDB since the 6.x line and is used in production by some large operators.

The snapshot mechanism must come from the block layer underneath the filesystem (EBS, LVM, CSI VolumeSnapshot); see Prerequisites below.

When to Use¶

Aspect	`fdbbackup`	Disk Snapshot Backup
Granularity	Logical key-value mutations	Block-level disk image per process
Point-in-time recovery	Any version within the backup window	Only the FDB version captured at snapshot time
Continuous backup	Yes	No
Restore speed	Bounded by data size + log replay throughput	Bounded by volume attach / copy speed
External dependencies	Blob store or filesystem destination	Block device or volume-snapshot mechanism (EBS / LVM / CSI VolumeSnapshot)
Storage engine support	Any storage engine	Redwood (`ssd-redwood-1`) and SQLite (`ssd-2`) only
Operator tooling required	Low — ships with FDB	High — operator must build, deploy, and manage a `snap_create` binary

How It Works¶

When fdbcli> snapshot <binary> [args...] is invoked, the cluster controller orchestrates a synchronized snapshot across all stateful processes. Each fdbserver process then forks the operator-supplied snap_create binary, which is responsible for invoking the underlying volume-snapshot mechanism on that host's data directory.

graph TD
    Op[Operator]
    CLI["fdbcli&gt; snapshot /bin/snap_create.sh"]
    CC[Cluster Controller<br/>Snapshot Orchestrator]

    subgraph "Storage Process"
        SS[fdbserver<br/>storage role]
        SS_Snap[snap_create<br/>--role=storage]
        SS_Disk[(Storage data dir)]
    end

    subgraph "TLog Process"
        TL[fdbserver<br/>tlog role]
        TL_Snap[snap_create<br/>--role=tlog]
        TL_Disk[(TLog data dir)]
    end

    subgraph "Coordinator Process"
        CO[fdbserver<br/>coordinator]
        CO_Snap[snap_create<br/>--role=coord]
        CO_Disk[(Coordinator data dir)]
    end

    Op --> CLI
    CLI --> CC
    CC --> SS
    CC --> TL
    CC --> CO
    SS --> SS_Snap --> SS_Disk
    TL --> TL_Snap --> TL_Disk
    CO --> CO_Snap --> CO_Disk

    style CC fill:#ff9800,color:#000
    style SS_Disk fill:#4caf50,color:#fff
    style TL_Disk fill:#4caf50,color:#fff
    style CO_Disk fill:#4caf50,color:#fff

The orchestrator quiesces the relevant subsystems and ensures that all per-role snapshots taken across the cluster reflect the same FDB version. The result is a set of disk images — one per role, per process — that together form a consistent backup of the cluster.

Prerequisites

Block-level snapshots (required) — AWS EBS, LVM, CSI VolumeSnapshot on Kubernetes. These work under the upstream-recommended ext4 with defaults,noatime,discard filesystem, which is the only supported configuration for the FoundationDB data directory; see Filesystem for the rationale.
Linux only — disk snapshot backup is not supported on Windows.
Storage engine restriction — supported only with the Redwood (ssd-redwood-1) and SQLite (ssd-2) storage engines. Not supported with the RocksDB storage engine (apple/foundationdb#5155).
Operator-supplied binary — the operator must build, deploy, and maintain a snap_create executable (see below). FoundationDB does not ship one.

Setting Up the `snap_create` Binary¶

snap_create is an operator-supplied executable invoked by fdbserver once per role on each host when a snapshot is requested. It is responsible for actually triggering the underlying volume-snapshot operation (for example, an aws ec2 create-snapshot call, an lvcreate --snapshot, or a CSI VolumeSnapshot).

The simplest illustrative implementation copies the data directory to a separate location, similar to the upstream example:

Bash

#!/bin/bash
# /bin/snap_create.sh — illustrative example only.
# Real deployments should call EBS / LVM / CSI VolumeSnapshot APIs.
set -euo pipefail

# UID is a read-only variable in bash, so the snapshot UID gets its own name.
SNAP_UID=""
VERSION=""
PATH_ARG=""
ROLE=""

while [[ $# -gt 0 ]]; do
  case "$1" in
    --uid)     SNAP_UID="$2"; shift 2 ;;
    --version) VERSION="$2";  shift 2 ;;
    --path)    PATH_ARG="$2"; shift 2 ;;
    --role)    ROLE="$2";     shift 2 ;;
    *)         shift ;;  # ignore extra operator-supplied args
  esac
done

# Copy the data directory into a per-snapshot, per-role destination.
DEST="/var/snapshots/${SNAP_UID}/${ROLE}-${VERSION}"
mkdir -p "$DEST"
cp -a "$PATH_ARG"/. "$DEST"/

fdbserver injects four named arguments when invoking the binary, followed by any extra arguments the operator passed to fdbcli> snapshot:

Argument	Description
`--uid <UID>`	Snapshot UID generated by the cluster controller; identical across all roles in a single snapshot.
`--version <FDB version>`	The FDB cluster version captured by this snapshot.
`--path <data dir>`	Absolute path to the data directory that must be snapshotted.
`--role <role>`	Role of the process: `storage`, `tlog`, `coord`, etc.
user-supplied args	Any additional arguments after the binary path in `fdbcli> snapshot`.

Configuring the Cluster¶

To allow fdbserver to fork the snapshot binary, set whitelist_binpath in the [fdbserver] section of foundationdb.conf on every host:

INI

[fdbserver]
command = /usr/sbin/fdbserver
whitelist_binpath = /bin/snap_create.sh

The path may be a colon-separated list if multiple binaries are permitted. Restart fdbserver (or trigger a rolling restart) for the change to take effect.

The SNAP_CREATE_MAX_TIMEOUT knob (default 5 minutes) bounds how long the orchestrator will wait for snap_create to complete on each process. Tune it via --knob-snap-create-max-timeout=<seconds> if your underlying snapshot mechanism is slow.

Recommended Metadata to Capture¶

Disk snapshot images by themselves are not sufficient to reconstitute a cluster — the operator must also record enough metadata to map images back to roles, processes, and FDB versions at restore time. Capture at least the following per snapshot:

Field	Description
UID	The snapshot UID returned by `fdbcli> snapshot`. Identical across all roles in a single snapshot.
fdbserver version	Exact FDB binary version that produced the snapshot. Required when restoring (the new cluster must run the same major version).
Creation time	Wall-clock time the snapshot was taken. Useful for retention policy.
Cluster file	Contents of `fdb.cluster` at the time of snapshot, so coordinator addresses can be re-derived.
Configuration / knobs	`foundationdb.conf` and any non-default knobs in effect on each process.
Process IP and port	Address each role was listening on at snapshot time.
Locality	`--locality_*` settings (zoneid, dcid, machineid) for each process.
File naming	Recommended naming convention: `<cluster-name>:<ip>:<port>:<UID>` so images can be grouped and matched at restore time.

Tip

Store this metadata alongside the snapshot images themselves (e.g., as object tags on EBS snapshots or as a sidecar JSON file) so it cannot be lost independently of the data.
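
As one way to capture these fields, the sketch below builds the recommended image name and a sidecar JSON document. The function names and JSON key names are illustrative assumptions; only the naming convention `<cluster-name>:<ip>:<port>:<UID>` comes from the table above.

```python
import json


def image_name(cluster_name, ip, port, uid):
    """Name an image as <cluster-name>:<ip>:<port>:<UID>, per the table above."""
    return f"{cluster_name}:{ip}:{port}:{uid}"


def sidecar_metadata(cluster_name, ip, port, uid, fdb_version,
                     created, cluster_file, locality):
    """Collect per-image metadata (key names are hypothetical)."""
    return {
        "image": image_name(cluster_name, ip, port, uid),
        "uid": uid,
        "fdbserver_version": fdb_version,
        "created": created,            # e.g. an ISO 8601 string
        "cluster_file": cluster_file,  # contents of fdb.cluster at snapshot time
        "address": f"{ip}:{port}",
        "locality": locality,          # e.g. {"zoneid": ..., "dcid": ...}
    }


meta = sidecar_metadata("prod", "10.0.0.1", 4500,
                        "a1b2c3d4e5f60718293a4b5c6d7e8f90", "7.3.0",
                        "2025-02-03T10:30:00Z",
                        "prod:abc123@10.0.0.1:4500", {"zoneid": "z1"})
print(json.dumps(meta, indent=2))
```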

Taking a Backup¶

From an fdbcli session attached to the cluster:

Text Only

fdbcli> snapshot /bin/snap_create.sh --extra-arg value
Snapshot command succeeded with UID a1b2c3d4e5f60718293a4b5c6d7e8f90

Pass the absolute path to your snap_create binary (which must match whitelist_binpath) followed by any extra arguments your binary accepts. The UID printed in the response is the same UID injected as --uid to every invocation of snap_create across the cluster.

snapshot is a hidden fdbcli command

In both release-7.3 and release-7.4 of apple/foundationdb, snapshot is registered as a hidden command (CommandFactory snapshotFactory("snapshot") in fdbcli/SnapshotCommand.actor.cpp, marked // hidden commands, no help text for now). It is fully functional, but it does not appear in fdbcli> help output. Invoke it directly by name.

Restore Steps¶

A disk-snapshot restore reconstitutes a new FDB cluster from a previously captured set of per-role snapshot images. Roughly:

Locate snapshot images by UID. Identify all images that share the same snapshot UID — one per role per process across the original cluster.
Group by old IP / locality. Use the captured metadata to group images by the original process's IP, port, and locality. Each group corresponds to one process worth of state.
Provision new cluster nodes. Decide on the IP layout for the new cluster, build a mapping from old IP → new IP, and attach (or copy) each image to the corresponding new host into the same data directory layout per role.
Recompute fdb.cluster. Rewrite the cluster file with the new coordinator IPs (taken from the IP mapping). Distribute the new cluster file to every node.
Start fdbserver on the new nodes. With the data directories in place and the new cluster file pointing at the new coordinators, the cluster will recover automatically to the snapshot's FDB version.
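
The grouping and cluster-file steps above can be sketched as follows. The image record layout (`uid`, `ip`, `port` keys) is hypothetical and would come from whatever metadata you captured; the rewrite assumes IPv4 coordinator addresses in the usual `description:id@ip:port,...` cluster-file form.

```python
def group_by_process(images, uid):
    """Group image records for one snapshot UID by original (ip, port)."""
    groups = {}
    for img in images:
        if img["uid"] == uid:
            groups.setdefault((img["ip"], img["port"]), []).append(img)
    return groups


def rewrite_cluster_file(contents, ip_map):
    """Replace coordinator IPs using an old -> new mapping (IPv4 only)."""
    prefix, coords = contents.strip().split("@", 1)
    rewritten = []
    for coord in coords.split(","):
        ip, rest = coord.split(":", 1)  # rest keeps the port and any suffix
        rewritten.append(f"{ip_map[ip]}:{rest}")  # KeyError if a coordinator is unmapped
    return f"{prefix}@{','.join(rewritten)}"


old = "prod:abc123@10.0.0.1:4500,10.0.0.2:4500\n"
print(rewrite_cluster_file(old, {"10.0.0.1": "10.1.0.1", "10.0.0.2": "10.1.0.2"}))
```

Failing loudly on an unmapped coordinator is deliberate: a cluster file that still points at an old address would leave the restored cluster unable to form a quorum.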

Multi-role processes share a data directory

If a single fdbserver on the original cluster ran multiple roles out of one data directory (for example, a combined storage + tlog), the disk image will contain files for all of those roles. When restoring such an image into a node that should serve only one of those roles, the operator must delete the on-disk files belonging to the other roles before starting fdbserver, or the process will refuse to start. Plan the role-to-node mapping carefully when designing the restore.

Error Codes¶

snap_create failures and orchestration errors surface through standard FoundationDB error codes. The most relevant are:

Code	Name	Description	Suggested action
2500	`snap_disable_tlog_pop_failed`	Failed to disable tlog popping during snapshot.	Retry; check tlog process health.
2501	`snap_storage_failed`	`snap_create` invocation on a storage process failed.	Check `snap_create` logs on the affected storage host.
2502	`snap_tlog_failed`	`snap_create` invocation on a tlog process failed.	Check `snap_create` logs on the affected tlog host.
2503	`snap_coord_failed`	`snap_create` invocation on a coordinator failed.	Check `snap_create` logs on the affected coordinator.
2504	`snap_enable_tlog_pop_failed`	Failed to re-enable tlog popping after snapshot.	Investigate tlog state; popping may need to be re-enabled manually.
2505	`snap_path_not_whitelisted`	The supplied binary is not present in `whitelist_binpath`.	Add the binary path to `whitelist_binpath` in `foundationdb.conf` and restart.
2506	`snap_not_fully_recovered_unsupported`	Cluster has not fully recovered; snapshot is not allowed.	Wait for cluster recovery to complete, then retry.
2507	`snap_log_anti_quorum_unsupported`	Snapshot is not supported with log anti-quorum configured.	Reconfigure the cluster without log anti-quorum to use disk snapshots.
2508	`snap_with_recovery_unsupported`	Snapshot was attempted concurrently with recovery.	Retry once recovery completes.
4000	`snap_invalid_uid_string`	The supplied UID string is malformed.	Use a valid UID (the API generates one for you when called from `fdbcli`).
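
Automation that triggers snapshots can use the table above to decide whether a failure is worth retrying. The sketch below encodes the codes and names from the table; the retry flag is an interpretation of the "Suggested action" column, not an official classification.

```python
# (name, retry_suggested) per code; retry flags are inferred from the table above.
SNAP_ERRORS = {
    2500: ("snap_disable_tlog_pop_failed", True),
    2501: ("snap_storage_failed", False),
    2502: ("snap_tlog_failed", False),
    2503: ("snap_coord_failed", False),
    2504: ("snap_enable_tlog_pop_failed", False),
    2505: ("snap_path_not_whitelisted", False),
    2506: ("snap_not_fully_recovered_unsupported", True),
    2507: ("snap_log_anti_quorum_unsupported", False),
    2508: ("snap_with_recovery_unsupported", True),
    4000: ("snap_invalid_uid_string", False),
}


def should_retry(code):
    """Return True when the table suggests retrying; unknown codes are not retried."""
    _name, retry = SNAP_ERRORS.get(code, ("unknown", False))
    return retry


print([code for code in sorted(SNAP_ERRORS) if should_retry(code)])
```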

Limitations

No continuous / point-in-time recovery. Each snapshot captures one FDB version; you cannot replay forward to an arbitrary later version.
Linux only. Windows is not supported.
Encryption depends on the storage layer. FoundationDB does not encrypt the snapshot images itself — encryption-at-rest is whatever your filesystem, EBS volume, or storage backend provides.
Operator-built tooling. The snap_create binary, snapshot transport, and restore orchestration are entirely the operator's responsibility.
Restore version is fixed. A restore brings the cluster up at exactly the version captured by the snapshot; you cannot choose a different version at restore time.

Programmatic API¶

Disk snapshot backup can also be triggered from application code via the C API (apple/foundationdb#4241):

C

const char *uid = "a1b2c3d4e5f60718293a4b5c6d7e8f90";
const char *cmd = "/bin/snap_create.sh";  /* must match whitelist_binpath */
FDBFuture *f = fdb_database_create_snapshot(
    db, (const uint8_t *)uid, (int)strlen(uid),
    (const uint8_t *)cmd, (int)strlen(cmd));

The caller supplies the UID (typically a freshly generated 32-character hex string) together with the snapshot command, and fdbserver invokes that snap_create binary on each role exactly as it would for fdbcli> snapshot. The future resolves once the cluster-wide snapshot has either succeeded or failed.
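
A fresh UID of the required shape can be generated with the standard library. This is a minimal sketch; the helper names are illustrative, and accepting both upper- and lower-case hex digits in the validator is an assumption.

```python
import re
import uuid

UID_RE = re.compile(r"[0-9a-fA-F]{32}")


def new_snapshot_uid():
    """Return a random 32-character hex string suitable as a snapshot UID."""
    return uuid.uuid4().hex


def is_valid_uid(uid):
    """Check that uid is exactly 32 hex characters."""
    return UID_RE.fullmatch(uid) is not None


uid = new_snapshot_uid()
print(uid, is_valid_uid(uid))
```

Validating before calling the API avoids a round trip that would end in `snap_invalid_uid_string`.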

Cleanup¶

FoundationDB does not garbage-collect old or failed disk-snapshot images. If a snapshot operation fails partway through, or if a successful snapshot ages out of the operator's retention policy, the on-disk (or on-EBS, or on-S3) artifacts must be expired by external tooling — for example, a cron job that lists snapshots older than N days and deletes them, or lifecycle policies on the underlying storage. Plan a cleanup strategy before enabling disk snapshot backup in production.
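
A cleanup job might select whole snapshots for deletion by UID, so that a set of images is never partially expired. The sketch below assumes each record carries a `uid` and a timezone-aware `created` datetime (a hypothetical layout); the actual deletion call depends on your storage backend and is left out.

```python
from datetime import datetime, timedelta, timezone


def expired_uids(records, keep_days, now=None):
    """Return UIDs whose newest image is older than keep_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=keep_days)
    newest = {}
    for rec in records:
        uid = rec["uid"]
        if uid not in newest or rec["created"] > newest[uid]:
            newest[uid] = rec["created"]
    return sorted(uid for uid, created in newest.items() if created < cutoff)


records = [
    {"uid": "aaa", "created": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"uid": "aaa", "created": datetime(2025, 1, 2, tzinfo=timezone.utc)},
    {"uid": "bbb", "created": datetime(2025, 2, 20, tzinfo=timezone.utc)},
]
print(expired_uids(records, keep_days=30,
                   now=datetime(2025, 3, 1, tzinfo=timezone.utc)))
```

Keying the decision on the newest image per UID means a snapshot whose images were written over several minutes is still treated as one unit.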

Monitoring Backups¶

Status in fdbcli¶

Bash

fdb> status
...
Backup and DR:
  Running backups        - 1
  Running DRs            - 0

Backup Metrics¶

Monitor these via machine-readable status:

Metric	Path	Alert Threshold
Backup running	`cluster.layers.backup.instances_running`	== 0 when expected
Last restorable	`cluster.layers.backup.tags.{tag}.last_restorable_seconds_behind_primary`	> 300 seconds
Backup agents	`cluster.layers.backup.agents_running`	< expected count

Alerting Script¶

Python

#!/usr/bin/env python3
import fdb
import json
import sys

fdb.api_version(730)
db = fdb.open()

@fdb.transactional
def get_status(tr):
    # tr[...] returns a lazy future; wait() yields the raw bytes json.loads needs
    return json.loads(tr[b'\xff\xff/status/json'].wait())

status = get_status(db)
backup = status.get('cluster', {}).get('layers', {}).get('backup', {})

if not backup.get('instances_running', 0):
    print("WARNING: No backup instances running")
    sys.exit(1)

for tag, info in backup.get('tags', {}).items():
    lag = info.get('last_restorable_seconds_behind_primary', float('inf'))
    if lag > 300:
        print(f"WARNING: Backup {tag} is {lag:.0f} seconds behind")
        sys.exit(1)

print("OK: Backups healthy")
sys.exit(0)

Best Practices¶

Backup Configuration¶

Run multiple backup agents - At least 2 for redundancy
Use blob storage for production - More durable than filesystem
Set appropriate snapshot intervals - Daily snapshots balance storage vs restore time
Tag your backups - Enables multiple independent backup streams

Testing and Validation¶

Test restores regularly - Monthly to a test cluster
Validate backup integrity - Use fdbbackup describe
Monitor backup lag - Alert if > 5 minutes behind
Document recovery procedures - RTO/RPO requirements

Retention Policy¶

Define retention requirements - Regulatory, operational
Automate expiration - Use fdbbackup expire in cron
Keep multiple generations - At least 7 days for point-in-time
Verify before expiring - Ensure newer backups are restorable

Security¶

Encrypt backup storage - S3 server-side encryption or client-side
Secure credentials - Use IAM roles, not long-term keys
Restrict backup access - Separate credentials for backup vs restore
Audit backup operations - Log all backup/restore commands

Backup Encryption (New in 7.3)¶

Starting in FoundationDB 7.3, backups support native file-level encryption. When enabled, backup data is encrypted before being written to the backup destination, providing end-to-end encryption regardless of the storage backend.

Enabling Encryption¶

To enable encryption on an existing backup, use the fdbbackup modify command:

Bash

fdbbackup modify -t default --encryption

New backups can also be started with encryption enabled:

Bash

fdbbackup start -d "blobstore://s3.amazonaws.com/my-bucket/fdb-backup" --encryption

Note

Encryption applies to newly written backup files. Existing unencrypted files in the backup are not retroactively encrypted.

Checking Encryption Status¶

Backup status — Encryption key information is included in the backup status JSON output:

Bash

fdbbackup status -t default --json

The JSON output includes encryption key details when encryption is active.

Backup describe — The fdbbackup describe output includes a FileLevelEncryption field indicating whether the backup is encrypted:

Bash

fdbbackup describe -d "blobstore://s3.amazonaws.com/my-bucket/fdb-backup" --json

Look for the FileLevelEncryption field in the JSON response to confirm encryption is enabled.

Tip

Native backup encryption works with all backup destinations including S3 blob storage and local filesystem targets. It can be combined with S3 server-side encryption for defense in depth.

Troubleshooting¶

Common Issues¶

Issue	Cause	Solution
Backup not progressing	No backup agents	Start backup agents
High backup lag	Slow destination	Check network, storage IOPS
Restore fails	Corrupted backup	Try earlier version
Permission denied	Credential issues	Check blob credentials file

Debug Commands¶

Bash

# Describe backup contents
fdbbackup describe -d file:///backup/fdb

# List backup tags
fdbbackup tags

# Cleanup incomplete operations
fdbbackup cleanup -d file:///backup/fdb

# Check backup agent logs
tail -f /var/log/foundationdb/backup_agent*.xml

Next Steps¶

Learn Troubleshooting for common issues
Review Monitoring for backup metrics
See Configuration for backup agent settings
