# Backup & Restore
Complete backup strategies for all Dango components.
## Overview
A Dango project has several components that need to be backed up:
| Component | Location | Contents | Priority |
|---|---|---|---|
| DuckDB Database | data/warehouse.duckdb | All synced data | High |
| Configuration | .dango/, .dlt/ | Source configs, credentials | High |
| dbt Models | dbt/ | Transformations, tests | High |
| Metabase | metabase-data/ | Dashboards, settings | Medium |
| Raw Files | data/uploads/ | CSV source files | Medium |
## Quick Backup Script
Create a complete backup with one script:
```bash
#!/bin/bash
# backup_dango.sh
set -euo pipefail

BACKUP_DIR="backups/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"

# 1. Export Metabase dashboards while services are still running
dango metabase save --output "$BACKUP_DIR/metabase/"

# 2. Stop services so nothing writes to the files mid-copy
dango stop

# 3. Database
cp data/warehouse.duckdb "$BACKUP_DIR/"

# 4. Configuration
cp -r .dango "$BACKUP_DIR/"
cp -r .dlt "$BACKUP_DIR/"

# 5. dbt models
cp -r dbt "$BACKUP_DIR/"

# 6. Metabase data (Docker volume)
if [ -d "metabase-data" ]; then
  cp -r metabase-data "$BACKUP_DIR/"
fi

# 7. CSV files (optional - may be large)
# cp -r data/uploads "$BACKUP_DIR/"

# Restart services
dango start

echo "Backup complete: $BACKUP_DIR"
```
## Component-Specific Backups

### DuckDB Database
The DuckDB database contains all your synced and transformed data:
```bash
# Simple copy (while services stopped)
dango stop
cp data/warehouse.duckdb data/warehouse.duckdb.backup
dango start

# With timestamp
cp data/warehouse.duckdb "backups/warehouse_$(date +%Y%m%d).duckdb"
```
> **Stop Before Backup:** For consistency, stop Dango services before copying the database file, or use DuckDB's export functionality for a consistent snapshot.
**Export to Parquet (alternative):** DuckDB's `EXPORT DATABASE` command writes the schema and data out as Parquet files. A minimal sketch (the output directory name is a placeholder):
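```bash
# Export schema and data as Parquet files; restore later with IMPORT DATABASE
duckdb data/warehouse.duckdb -c "EXPORT DATABASE 'backups/parquet_export' (FORMAT PARQUET);"
```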
### Configuration Files
Back up your source configurations and project settings:
```bash
# Configuration files
cp -r .dango backups/dango_config/
cp .dango/sources.yml backups/sources.yml.backup
cp .dango/project.yml backups/project.yml.backup

# dlt configuration (NO secrets!)
cp .dlt/config.toml backups/dlt_config.toml.backup
# WARNING: don't back up secrets.toml to shared storage
```
> **Protect Secrets:** Never commit or back up `.dlt/secrets.toml` to shared or cloud storage. Recreate credentials manually on restore.
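If you need an offline copy of credentials anyway, one option is to encrypt the file before it leaves the machine. A sketch using GnuPG (symmetric encryption, passphrase prompted interactively):

```bash
# Encrypt secrets for offline storage
gpg --symmetric --cipher-algo AES256 -o backups/secrets.toml.gpg .dlt/secrets.toml

# Decrypt during restore
gpg --decrypt -o .dlt/secrets.toml backups/secrets.toml.gpg
```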
### dbt Models
Your transformation logic should be version controlled:
```bash
# If using git (recommended)
cd dbt && git add . && git commit -m "Backup dbt models"

# Manual backup
cp -r dbt backups/dbt_backup/
```
### Metabase Dashboards
Export dashboard definitions for backup:
```bash
# Export dashboards
dango metabase save --output backups/metabase/

# Full Metabase data (H2 database) - stop services first so the
# database file is not written to mid-copy
cp -r metabase-data backups/metabase_backup/
```
### CSV Source Files
```bash
# Back up uploaded CSV files
cp -r data/uploads backups/csv_files/

# Compressed backup for large files
tar -czf backups/csv_files.tar.gz data/uploads/
```
## Restore Procedures

### Full Restore
```bash
# 1. Create fresh project
dango init my-project-restored
cd my-project-restored

# 2. Restore configuration
cp -r /path/to/backup/.dango .
cp -r /path/to/backup/.dlt .

# 3. Restore dbt models
rm -rf dbt
cp -r /path/to/backup/dbt .

# 4. Restore database
cp /path/to/backup/warehouse.duckdb data/

# 5. Start services
dango start

# 6. Restore Metabase dashboards
dango metabase load --input /path/to/backup/metabase/

# 7. Re-enter credentials (secrets.toml)
# Edit .dlt/secrets.toml with your credentials
```
### Restore Individual Components
**Database only.** Stop services, copy the backup over the live file, and restart:
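```bash
dango stop
cp /path/to/backup/warehouse.duckdb data/warehouse.duckdb
dango start
```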
**Metabase only:**
```bash
dango stop
rm -rf metabase-data
cp -r /path/to/backup/metabase_backup metabase-data
dango start

# Or restore from saved dashboards
dango metabase load --input /path/to/backup/metabase/
```
**dbt models only:**
```bash
rm -rf dbt/models/intermediate dbt/models/marts
cp -r /path/to/backup/dbt/models/intermediate dbt/models/
cp -r /path/to/backup/dbt/models/marts dbt/models/
dango run
```
## Automated Backups

### Cron Job (Linux/macOS)
```bash
# Edit crontab
crontab -e

# Add daily backup at 2 AM
0 2 * * * cd /path/to/project && ./backup_dango.sh >> /var/log/dango_backup.log 2>&1
```
### Backup Rotation
Keep recent backups, delete old ones:
```bash
#!/bin/bash
# rotate_backups.sh
BACKUP_DIR="backups"
KEEP_DAYS=7

# Delete backup directories older than KEEP_DAYS days.
# -mindepth 1 -maxdepth 1 limits the match to the timestamped
# subdirectories, so the backups/ directory itself is never removed.
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +"$KEEP_DAYS" -exec rm -rf {} +
echo "Old backups cleaned up"
```
## Cloud Backup
**To AWS S3** (a sketch using the AWS CLI; the bucket name is a placeholder):
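```bash
# Sync the local backups directory to S3
aws s3 sync backups/ s3://my-company-backups/dango/
```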
**To Google Cloud Storage** (a sketch using `gsutil`; the bucket name is a placeholder):
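```bash
# Mirror the backups directory to a GCS bucket
gsutil -m rsync -r backups/ gs://my-company-backups/dango/
```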
## Disaster Recovery

### Recovery Checklist
- ✅ Install Dango: `pip install getdango`
- ✅ Initialize project: `dango init project-name`
- ✅ Restore configuration files
- ✅ Recreate credentials in `.dlt/secrets.toml`
- ✅ Restore database or re-sync: `dango sync`
- ✅ Restore or regenerate dbt models: `dango run`
- ✅ Restore Metabase dashboards
- ✅ Verify data integrity
### Recovery Time Estimates
| Scenario | Method | Time |
|---|---|---|
| Quick recovery | Restore from backup | 5-10 min |
| Full re-sync | Re-fetch all data | Hours (depends on data volume) |
| Partial recovery | Restore DB + re-sync recent | 30-60 min |
## What NOT to Backup
Some files should be excluded from backups:
```
# .gitignore patterns for backup exclusion
*.log
*.tmp
__pycache__/
# Can be regenerated
.dlt/pipelines/
# Compiled artifacts
dbt/target/
# dbt logs
dbt/logs/
```
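One way to apply these exclusions when creating a compressed backup is `tar`'s `--exclude` flag; a sketch (the archive name is arbitrary):

```bash
tar -czf "backups/project_$(date +%Y%m%d).tar.gz" \
  --exclude='*.log' --exclude='*.tmp' --exclude='__pycache__' \
  --exclude='.dlt/pipelines' --exclude='dbt/target' --exclude='dbt/logs' \
  --exclude='backups' \
  .
```

The `--exclude='backups'` entry keeps the archive from recursively including earlier backups.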
## Verification
After backup, verify integrity:
```bash
# Check that the DuckDB backup opens and contains tables
duckdb backups/warehouse.duckdb "SELECT COUNT(*) FROM information_schema.tables"

# Verify exported dashboard JSON parses (adjust the path to your export)
python -c "import json, glob; [json.load(open(p)) for p in glob.glob('backups/metabase/*.json')]"

# List backup contents
ls -la backups/
```
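For a deeper check, compare the table listings of the live and backup databases; a sketch using the same `duckdb` CLI pattern as above:

```bash
# List tables in the live and backup databases side by side
for db in data/warehouse.duckdb backups/warehouse.duckdb; do
  echo "== $db =="
  duckdb "$db" "SELECT table_name FROM information_schema.tables ORDER BY table_name;"
done
```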
## Next Steps
- Git Workflows - Version control for configs and models
- Troubleshooting - Recovery from common issues
- Performance - Optimize large database backups