Database Migrations
This guide covers database migrations using Alembic with SQLModel.
tip
For comprehensive documentation, see the Alembic documentation.
Common Migration Scenarios
Never Edit Migration Files After Running Them
Once a migration has been applied to any database (dev, staging, or production):
- Never edit the migration file - create a new migration instead
- Never delete migration files - they're part of your database history
- Always test migrations locally before deploying
If you need to fix a mistake, create a new migration that corrects it.
Editing the current schema
- Edit the schema (add/edit/remove tables and fields) in
common/src/common/models/db_models.py(Pydantics withtable=True) - Run
uvx invoke create-migration "Add priority to tasks": this will create a migration script by checking the diff between the current DB (see.env) and the one fromdb_models.py - IMPORTANT: Check the generated migration script in
common/alembic/versions/. Alembic might miss some changes or generate incorrect migrations - When everything seems good and you want to update the DB, use:
uvx invoke update-db
Data Migration
Safe Data Migrations
When migrating data:
- Add new column as nullable first - allows existing rows to have NULL
- Populate data - fill the new column with appropriate values
- Make non-nullable after - enforce constraint once data is populated
- Always provide downgrade path - be able to rollback if needed
# Migration with data transformation
from alembic import op
import sqlalchemy as sa
def upgrade() -> None:
# Add new column
op.add_column('task', sa.Column('status', sa.String(20), nullable=True))
# Migrate data
connection = op.get_bind()
connection.execute(
"UPDATE task SET status = CASE WHEN completed THEN 'completed' ELSE 'pending' END"
)
# Make column not nullable
op.alter_column('task', 'status', nullable=False)
Creating Indexes
def upgrade() -> None:
# Create composite index
op.create_index('ix_task_user_completed', 'task', ['user_id', 'completed'])
# Create partial index (PostgreSQL)
op.execute(
"CREATE INDEX ix_task_active ON task (user_id) WHERE completed = false"
)
def downgrade() -> None:
op.drop_index('ix_task_user_completed', 'task')
op.execute("DROP INDEX ix_task_active")
Migration Best Practices
1. Always Review Generated Migrations
Common Autogeneration Issues
Alembic autogenerate might miss:
- Custom indexes or constraints
- Column type changes (e.g., String length changes)
- Enum value additions
- Table or column renames (seen as drop + create)
Always manually review and edit generated migrations!
# Generate migration
uvx invoke create-migration "Add task categories"
# Review the generated file before applying
# Check: common/alembic/versions/[hash]_add_task_categories.py
2. Use Descriptive Migration Messages
# Good migration messages
uvx invoke create-migration "Add task categories and tags"
uvx invoke create-migration "Add user preferences table"
uvx invoke create-migration "Fix task due_date timezone issues"
# Avoid generic messages
uvx invoke create-migration "Update models" # Too vague
3. Handle Large Data Migrations
# For large tables, process in batches
def upgrade() -> None:
connection = op.get_bind()
# Process in batches of 1000
batch_size = 1000
offset = 0
while True:
result = connection.execute(
f"SELECT id FROM task LIMIT {batch_size} OFFSET {offset}"
)
rows = result.fetchall()
if not rows:
break
# Update batch
task_ids = [str(row[0]) for row in rows]
connection.execute(
f"UPDATE task SET status = 'pending' WHERE id IN ({','.join(['%s'] * len(task_ids))})",
task_ids
)
offset += batch_size
Migration Checklist
Before deploying migrations to production:
- Review generated migration files
- Test migration up and down on development data
- Backup production database
- Plan for rollback if needed
- Check for breaking changes in application code
- Coordinate with team for deployment