Audit Storage NEW IN 7.4

Version Notice

This feature is only available in FoundationDB 7.4 and later. You are viewing docs for version 7.3.

Audit Storage validates the consistency of data replicas and location metadata in your FoundationDB cluster. It provides end-to-end verification that all copies of your data match and that metadata is consistent.

Overview

Audit Storage checks three types of consistency:

Audit Type        What It Checks
replica           Data consistency between replicas across all DCs
locationmetadata  Consistency between KeyServer and ServerKey metadata
ssshard           Consistency between ServerKey and storage server shard mappings

Key Features

  • End-to-end completeness - Persists progress; continues until all ranges are verified
  • Scalable - Near-linear speedup with parallelism (configurable via the CONCURRENT_AUDIT_TASK_COUNT_MAX knob; see the sketch after this list)
  • Fault tolerant - Automatically retries failed checks
  • Progress monitoring - CLI commands to track job status
  • No additional setup - Uses the existing data distributor (DD) and storage server (SS) infrastructure
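
Parallelism is controlled by the CONCURRENT_AUDIT_TASK_COUNT_MAX knob mentioned above. A minimal sketch of raising it, assuming the knob is applied like any other fdbserver knob (the value 100 is purely illustrative):

Bash
# Pass the knob on the fdbserver command line (knob names are lower-cased)
fdbserver --knob_concurrent_audit_task_count_max=100 <other options>

# Or set it in foundationdb.conf under the [fdbserver] section:
# knob_concurrent_audit_task_count_max = 100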

Commands

Start an Audit

Bash
# Check replica consistency
fdbcli> audit_storage replica "" \xff\xff

# Check location metadata
fdbcli> audit_storage locationmetadata "" \xff\xff

# Check SS shard mappings
fdbcli> audit_storage ssshard "" \xff\xff

Check Status

Bash
# List recent jobs
fdbcli> get_audit_status replica recent

# Check specific job progress
fdbcli> get_audit_status replica progress <AUDIT_ID>

Cancel an Audit

Bash
fdbcli> audit_storage cancel replica <AUDIT_ID>

Audit Types

Replica Consistency (replica)

Verifies that all replicas of each key-value pair are identical:

  • Compares data between storage servers across all data centers
  • Uses shard-based partitioning for efficient parallel checking
  • Generates SSAuditStorageShardReplicaError trace events on mismatch
Bash
fdbcli> audit_storage replica "" \xff
Audit ID: 12345678-1234-5678-1234-567812345678

Location Metadata (locationmetadata)

Validates consistency between system metadata:

  • Checks consistency between KeyServer and ServerKey mappings
  • Ensures ranges are assigned to correct servers
  • Generates DDDoAuditLocationMetadataError on mismatch

Note

The location metadata audit always checks the entire key space, regardless of the range specified.
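
For example, a run that passes a sub-range still audits everything; the command below mirrors the replica example above:

Bash
# The range arguments are required, but the whole key space is checked
fdbcli> audit_storage locationmetadata "" \xff
Audit ID: <AUDIT_ID>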

SS Shard Mappings (ssshard)

Verifies storage server local state matches system metadata:

  • Compares ServerKeys with SS in-memory shard information
  • Checks each storage server individually
  • Generates SSAuditStorageSsShardError on mismatch
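
A typical run starts the audit and then polls its progress with the status command shown earlier:

Bash
fdbcli> audit_storage ssshard "" \xff\xff
Audit ID: <AUDIT_ID>

fdbcli> get_audit_status ssshard progress <AUDIT_ID>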

Monitoring Progress

CLI Status

Bash
fdbcli> get_audit_status replica progress <AUDIT_ID>
Audit ID: 12345678...
Type: replica
Range: ["", "\xff")
Phase: Running
Submitted: 42 tasks
Completed: 38 tasks
Error: 0 tasks

Trace Events

Monitor these trace events for audit activity:

Event                            Description
AuditStorageStart                Audit job started
AuditStorageComplete             Audit job finished
SSAuditStorageShardReplicaError  Replica inconsistency detected
DDDoAuditLocationMetadataError   Metadata inconsistency detected
SSAuditStorageSsShardError       Shard mapping inconsistency detected
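
One way to alert on these events is to scan the server trace logs. The sketch below assumes the default trace directory and XML trace format of a packaged install; adjust the path and format for your deployment:

Bash
# List trace files containing any audit error event
grep -l -E 'SSAuditStorageShardReplicaError|DDDoAuditLocationMetadataError|SSAuditStorageSsShardError' \
  /var/log/foundationdb/trace.*.xml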

Progress Persistence

Audit progress is stored in system metadata:

  • Replica/location metadata: \xff/auditRanges/
  • SS shard checking: \xff/auditServers/

This enables:

  • Resume after failures without re-checking completed ranges
  • Accurate progress tracking
  • Efficient resource utilization
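
These keys are internal and normally consumed through get_audit_status, but for illustration they can be inspected directly from fdbcli (reading the system key space requires enabling system-key access first):

Bash
fdbcli> option on ACCESS_SYSTEM_KEYS
fdbcli> getrange \xff/auditRanges/ \xff/auditRanges0 10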

Comparison with Consistency Checker Urgent

Feature                  Audit Storage             Consistency Checker Urgent
Progress persistence     ✅ Yes                    ❌ No
Location metadata check  ✅ Yes                    ❌ No
CLI job management       ✅ Yes                    ❌ No
Efficiency               ✅ High (no repeat work)  ⚠️ Lower

Best Practices

  1. Schedule regular audits - Run replica audits periodically (e.g., weekly)
  2. Monitor trace events - Alert on *Error trace events
  3. Use appropriate ranges - For large clusters, audit in segments (see the example after this list)
  4. Check after incidents - Run audits after hardware failures or recoveries
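
For segmented audits, split the key space at arbitrary boundaries and submit one job per segment (the split point "m" below is only an example):

Bash
# Audit the key space in two segments instead of one large job
fdbcli> audit_storage replica "" m
fdbcli> audit_storage replica m \xff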

Troubleshooting

Audit Not Progressing

  • Check storage server health with status details (see the commands after this list)
  • Verify data distribution is working
  • Review trace logs for errors
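
The first two checks map to the following fdbcli commands:

Bash
# Confirm storage servers and data distribution are healthy
fdbcli> status details

# Re-check the job's phase and task counts
fdbcli> get_audit_status replica progress <AUDIT_ID>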

High Error Count

  • Examine specific *Error trace events
  • Check for storage server issues
  • Consider running shard-by-shard audits

See Also