Skip to main content

Database Migrations

This guide covers database migrations using Alembic with SQLModel.

tip

For comprehensive documentation, see the Alembic documentation.

Common Migration Scenarios

:::danger Never Edit Migration Files After Running Them

Once a migration has been applied to any database (dev, staging, or production):

  • Never edit the migration file - create a new migration instead
  • Never delete migration files - they're part of your database history
  • Always test migrations locally before deploying

If you need to fix a mistake, create a new migration that corrects it. :::

Editing the current schema

  1. Edit the schema (add/edit/remove tables and fields) in common/src/common/models/db_models.py (Pydantics with table=True)
  2. Run uvx invoke create-migration "Add priority to tasks": this will create a migration script by checking the diff between the current DB (see .env) and the one from db_models.py
  3. IMPORTANT: Check the generated migration script in common/alembic/versions/. Alembic might miss some changes or generate incorrect migrations
  4. When everything seems good and you want to update the DB, use: uvx invoke update-db

Data Migration

:::tip Safe Data Migrations

When migrating data:

  1. Add new column as nullable first - allows existing rows to have NULL
  2. Populate data - fill the new column with appropriate values
  3. Make non-nullable after - enforce constraint once data is populated
  4. Always provide downgrade path - be able to rollback if needed :::
# Migration with data transformation
from alembic import op
import sqlalchemy as sa

def upgrade() -> None:
# Add new column
op.add_column('task', sa.Column('status', sa.String(20), nullable=True))

# Migrate data
connection = op.get_bind()
connection.execute(
"UPDATE task SET status = CASE WHEN completed THEN 'completed' ELSE 'pending' END"
)

# Make column not nullable
op.alter_column('task', 'status', nullable=False)

Creating Indexes

def upgrade() -> None:
# Create composite index
op.create_index('ix_task_user_completed', 'task', ['user_id', 'completed'])

# Create partial index (PostgreSQL)
op.execute(
"CREATE INDEX ix_task_active ON task (user_id) WHERE completed = false"
)

def downgrade() -> None:
op.drop_index('ix_task_user_completed', 'task')
op.execute("DROP INDEX ix_task_active")

Migration Best Practices

1. Always Review Generated Migrations

:::warning Common Autogeneration Issues

Alembic autogenerate might miss:

  • Custom indexes or constraints
  • Column type changes (e.g., String length changes)
  • Enum value additions
  • Table or column renames (seen as drop + create)

Always manually review and edit generated migrations! :::

# Generate migration
uvx invoke create-migration "Add task categories"

# Review the generated file before applying
# Check: common/alembic/versions/[hash]_add_task_categories.py

2. Use Descriptive Migration Messages

# Good migration messages
uvx invoke create-migration "Add task categories and tags"
uvx invoke create-migration "Add user preferences table"
uvx invoke create-migration "Fix task due_date timezone issues"

# Avoid generic messages
uvx invoke create-migration "Update models" # Too vague

3. Handle Large Data Migrations

# For large tables, process in batches
def upgrade() -> None:
connection = op.get_bind()

# Process in batches of 1000
batch_size = 1000
offset = 0

while True:
result = connection.execute(
f"SELECT id FROM task LIMIT {batch_size} OFFSET {offset}"
)
rows = result.fetchall()

if not rows:
break

# Update batch
task_ids = [str(row[0]) for row in rows]
connection.execute(
f"UPDATE task SET status = 'pending' WHERE id IN ({','.join(['%s'] * len(task_ids))})",
task_ids
)

offset += batch_size

Migration Checklist

Before deploying migrations to production:

  • Review generated migration files
  • Test migration up and down on development data
  • Backup production database
  • Plan for rollback if needed
  • Check for breaking changes in application code
  • Coordinate with team for deployment