Rails Database — Migration Conventions

This page sets one rule for all data migrations in this repo. Follow it on every PR.

The rule

Inside any data migration, do not call your domain models directly. Define small, throwaway classes scoped to the migration instead, namespaced under modules that mirror the real domain.

A "data migration" is a migration that reads or writes rows — anything that uses find_each, update, update_columns, where(...).update_all, or similar. A pure schema migration (only add_column, create_table, add_index, etc.) does not need this rule.

Why the rule exists

Migrations and models live on different timelines.

  • A migration is a snapshot in time. The file 20250728034851_populate_uuid_in_users.rb describes what was true on 28 July 2025 and is supposed to be replayable forever in that same form.
  • A model like Identities::User is current code. It reflects whatever the model file says today, including every enum, validation, callback, and association anyone has added since the migration was written.

When you run bin/rails db:migrate from an empty database, Rails replays every migration in order, oldest first. At migration 20250728034851, Rails loads your Identities::User class, but it loads today's version, not the version from July 2025. If today's model code depends on a column or table that a later migration creates, the migration fails. The file that crashes is not the file that changed: the breaking change is whichever later commit introduced the model dependency, months after the migration was written.

A real example from this repo

We hit this exact bug. Here is the timeline:

  1. 2025-07-28 — A migration was written that calls Identities::User.where(uuid: nil).find_each. At the time, Identities::User had no gender enum and no reference to a gender column. The migration ran cleanly on every laptop.

  2. 2026-03-13 — A later migration added a gender column to identities_users (add_column :identities_users, :gender, :string, default: 'male', null: false).

  3. Later still — An engineer added enum(:gender, { male: 'male', female: 'female' }) to Identities::User to take advantage of Rails' enum helpers.

  4. Today — A new engineer runs bin/rails db:create db:migrate from empty. Migration 20250728034851 runs first. It loads Identities::User. Rails sees the gender enum on the model. To set up the enum, Rails needs to know the type of the underlying gender column. It looks at the identities_users table — the column does not exist yet, because the March 2026 migration has not run. Rails 8 raises:

    Undeclared attribute type for enum 'gender' in Identities::User.
    Enums must be backed by a database column or declared with an explicit
    type via `attribute`.

The migration file from July 2025 was never edited. It is broken because the model around it changed underneath it. A migration that worked on day one is now broken on day three hundred. This is the failure mode the rule prevents.

Note: older Rails versions (≤ 7) silently created the enum with a nil type and let the migration through. Rails 8 is stricter on purpose — silently broken enums cause subtle data bugs in production. Rails 8 is doing you a favour by catching this at migration time.

Why this is hard to spot

Engineers run migrations on their existing local database. The gender column already exists there from a previous run. The model loads cleanly. The migration is fine on their laptop, every day, for months. The bug only appears on a fresh database — which means CI, or a new engineer joining the team. Both are rare moments, so the bug ships and hides until something forces a fresh-DB run.

This is also why the bug feels "unfair" when it surfaces. The engineer who originally wrote the migration did nothing wrong at the time. The engineer who added the enum did nothing obviously wrong either. The bug emerges from the combination, in a code path nobody routinely runs.

The pattern

Wrong (calls the live model)

    class PopulateUuidInUsers < ActiveRecord::Migration[8.0]
      def up
        Identities::User.where(uuid: nil).find_each do |user|
          user.update_columns(uuid: SecureRandom.uuid)
        end
      end
    end

This loads the real Identities::User with all its enums, validations, callbacks, associations, and any custom Sorbet sigs. Any one of them can break the migration in the future when the model evolves but the migration does not.

Right (throwaway model nested inside the migration class)

    class PopulateUuidInUsers < ActiveRecord::Migration[8.0]
      module Identities
        class User < ActiveRecord::Base
          self.table_name = 'identities_users'
        end
      end

      def up
        Identities::User.where(uuid: nil).find_each do |user|
          user.update_columns(uuid: SecureRandom.uuid)
        end
      end
    end

The Identities::User inside up resolves to PopulateUuidInUsers::Identities::User — a brand-new class scoped under the migration. It has no enums, no validations, no callbacks, no associations, no Sorbet types. It only knows how to read and write rows in the identities_users table. That is all the migration needs.

This local class will work the same way one year from now, no matter how the real ::Identities::User changes. The migration is now frozen in time, the way a migration should be.

For migrations that touch multiple tables, define one nested module per domain:

    class BackfillAddressGeoAreaIdAndAddConstraintToIdentitiesUsers < ActiveRecord::Migration[8.0]
      module Geo
        class Area < ActiveRecord::Base
          self.table_name = 'geo_areas'
        end
      end

      module Identities
        class User < ActiveRecord::Base
          self.table_name = 'identities_users'
        end
      end

      def up
        transaction do
          default_geo_area = Geo::Area.find_by!(name: 'Singapore')

          Identities::User.where(address_geo_area_id: nil).in_batches do |users|
            users.update_all(address_geo_area_id: default_geo_area.id)
          end

          change_column_null :identities_users, :address_geo_area_id, false
        end
      end
    end

Why nest the module inside the migration class?

Notice the right example writes module Identities inside the class PopulateUuidInUsers block. This placement matters.

Ruby's constant definition rules: when you write module Identities inside class PopulateUuidInUsers, Ruby creates PopulateUuidInUsers::Identities — a brand-new constant in the migration class's namespace. It does not look up or re-open the existing top-level ::Identities. The two constants happen to share a short name but live in different namespaces.

Inside up, when you write Identities::User.where(...), Ruby's constant lookup walks outward from the current scope. It finds Identities inside PopulateUuidInUsers first, so it resolves to the migration-local one — never touching ::Identities::User.
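
The lookup order described above can be checked in plain Ruby, without Rails. This is a minimal sketch: both User classes are stand-ins with a hypothetical source method added only so the two can be told apart, and the migration class does not inherit from ActiveRecord::Migration here.

```ruby
# Stand-in for the live domain model (hypothetical body, no ActiveRecord).
module Identities
  class User
    def self.source
      'live model'
    end
  end
end

# Stand-in for a migration class. The nested module is a new constant,
# PopulateUuidInUsers::Identities, not a re-opening of ::Identities.
class PopulateUuidInUsers
  module Identities
    class User
      def self.source
        'migration-local model'
      end
    end
  end

  def up
    # Lexical lookup checks PopulateUuidInUsers before the top level.
    Identities::User
  end
end

resolved = PopulateUuidInUsers.new.up
puts resolved.name              # PopulateUuidInUsers::Identities::User
puts resolved.source            # migration-local model
puts ::Identities::User.source  # live model
```

The leading :: in the last line forces a top-level lookup, which is the only way to reach the live class from inside the migration once the nested module exists.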

Wrong (top-level module re-opens the real one)

    # Outside the migration class — at the top of the file:
    module Identities
      class User < ActiveRecord::Base
        self.table_name = 'identities_users'
      end
    end

    class PopulateUuidInUsers < ActiveRecord::Migration[8.0]
      def up
        Identities::User.where(uuid: nil).find_each { |user| ... }
      end
    end

This is what you must NOT do. At top level, module Identities looks up Identities via constant lookup. The autoloader finds the real ::Identities module from app/domains/identities/. Now you are re-opening it. The class User < ActiveRecord::Base line either re-opens the real model (and inherits all its enums, validations, callbacks — defeating the entire purpose) or raises TypeError: superclass mismatch for class User because the real model has a different superclass via the multi-DB abstract base setup.

The fix is always the same: keep the module ... class ... end end block inside class <YourMigration> < ActiveRecord::Migration[...]. Nest, do not re-open.
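
Both outcomes of a top-level re-open can be reproduced in plain Ruby. The sketch below uses hypothetical stand-ins: AbstractBase plays the role of the multi-DB abstract base, PlainBase plays the role of ActiveRecord::Base, and callbacks is an invented method that represents behaviour defined on the live model.

```ruby
class AbstractBase; end  # stand-in for the app's abstract base class
class PlainBase; end     # stand-in for ActiveRecord::Base

# Stand-in for the live domain model.
module Identities
  class User < AbstractBase
    def self.callbacks
      [:normalize_email]
    end
  end
end

# Outcome 1: re-opening with a different superclass raises TypeError.
mismatch = begin
  module Identities
    class User < PlainBase
    end
  end
  nil
rescue TypeError => e
  e
end
puts mismatch.message  # superclass mismatch for class User

# Outcome 2: re-opening with the same superclass succeeds, but it is the
# same class, so everything defined on the live model is still there.
module Identities
  class User < AbstractBase
  end
end
p Identities::User.callbacks  # [:normalize_email]
```

Neither outcome gives the migration an isolated class, which is why the nested form is the only safe placement.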

When to use a throwaway model

Use it whenever the migration reads or writes data:

  • Backfilling values into a new column (the PopulateUuidInUsers example)
  • Splitting one column into two
  • Renaming enum values across existing rows
  • Cleaning up duplicates
  • Any find_each, each, update, update_all, update_columns on existing rows

If the migration touches data and you find yourself typing Identities::, Org::, Careers::, or any other domain prefix at the call site — stop and check. The references in up should resolve to a nested module inside the migration class, not the live domain model.

When you do not need a throwaway model

Pure schema migrations are safe — they do not load any models:

    class AddSlugToCareersJobs < ActiveRecord::Migration[8.0]
      def change
        add_column :careers_jobs, :slug, :string
      end
    end

add_column, create_table, add_index, change_column_null, remove_column, rename_column — all fine. They speak directly to the database through ActiveRecord's schema API without loading model code.

Two helpers show up in some migrations: Model.reset_column_information and disable_ddl_transaction!. Here is what they do and when to use them.

Model.reset_column_information

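When you add a column and then immediately use it from a model class in the same migration, the model's column cache may be stale. Active Record reads a model's column list the first time the model needs it and then caches the result. If that first read happened before the column existed, the model has no idea the new column is there. Call reset_column_information after the schema change so the model re-reads its column list: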

    class AddRoleToUsersAndBackfill < ActiveRecord::Migration[8.0]
      module Identities
        class User < ActiveRecord::Base
          self.table_name = 'identities_users'
        end
      end

      def up
        add_column :identities_users, :role, :string

        # Identities::User may have cached its column list before add_column
        # ran. Clear the cache so the next query re-reads the table and
        # sees :role.
        Identities::User.reset_column_information

        Identities::User.update_all(role: 'member')
      end
    end

Without reset_column_information, the update_all either silently does nothing or raises unknown attribute :role. This trips up engineers regularly because it only happens when the column is added inside the same migration.
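
The caching behaviour is easier to see in a toy model. The sketch below is plain Ruby and is not Active Record's implementation: FAKE_SCHEMA and ToyModel are hypothetical stand-ins that only mimic a memoized column list and a reset method of the same name.

```ruby
# Toy stand-in for the database schema: table name => column names.
FAKE_SCHEMA = { 'identities_users' => %w[id uuid] }

# Toy stand-in for a model with a memoized column list.
class ToyModel
  def self.table_name
    'identities_users'
  end

  # Filled on first call, then reused until it is reset.
  def self.column_names
    @column_names ||= FAKE_SCHEMA.fetch(table_name).dup
  end

  def self.reset_column_information
    @column_names = nil
  end
end

ToyModel.column_names                     # first read fills the cache
FAKE_SCHEMA['identities_users'] << 'role' # plays the role of add_column

puts ToyModel.column_names.include?('role')  # false (stale cache)
ToyModel.reset_column_information
puts ToyModel.column_names.include?('role')  # true (re-read after reset)
```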

disable_ddl_transaction!

By default, on databases that support transactional DDL (Postgres does), Rails wraps each migration in a database transaction. If anything fails halfway, the whole migration rolls back and the database is left exactly as it was before. This is almost always what you want.

A few Postgres operations cannot run inside a transaction. The most common one is CREATE INDEX CONCURRENTLY, which lets Postgres build an index without locking writes on a large production table — essential for tables with millions of rows. Postgres refuses to run it inside a transaction.

For these cases, you opt out:

    class AddIndexConcurrentlyToGigShifts < ActiveRecord::Migration[8.0]
      disable_ddl_transaction!

      def change
        add_index :gig_shifts, :starts_at, algorithm: :concurrently
      end
    end

The trade-off: if the migration fails partway through, there is no transaction to roll back. You must clean up by hand (e.g. drop the partial index manually). For this reason, only use disable_ddl_transaction! when you actually need a non-transactional operation, and keep these migrations as small as possible — ideally one operation per migration.

Code review checklist

When reviewing a migration PR, ask:

  1. Does this migration touch row data (anything other than schema-shape changes)? If no, accept.
  2. If yes, does it call any domain model directly without a nested module wrapping it (e.g. Identities::User with no module Identities block above)? If yes, ask for the throwaway pattern.
  3. Is the module ... class ... end end block placed inside the class <YourMigration> < ActiveRecord::Migration[...] body? If it sits at top level outside the migration class, that is broken — it re-opens the real model.
  4. Does each throwaway class declare only self.table_name? If yes, accept. If it has enums, validations, or callbacks copied over, ask why — most of the time, removing them is correct.
  5. If the migration adds a column and then uses it from a throwaway class in the same up, is reset_column_information called between them?
  6. If the migration uses disable_ddl_transaction!, is the operation actually one that requires it (e.g. concurrent index)? Is the migration kept small?
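
Checklist items 2 and 3 are mechanical enough that a reviewer could pre-screen a file with a small script. The following is a rough heuristic sketch, not a tool that exists in this repo: the method name unshadowed_model_calls, the DATA_CALL pattern, and the list of query methods it matches are all assumptions, and a regex scan will miss cases such as models assigned to a variable first.

```ruby
# Matches calls like Identities::User.where(...). Heuristic only.
DATA_CALL = /\b([A-Z]\w*)::([A-Z]\w*)\.(?:where|find_each|find_by|in_batches|update_all|all)\b/

# Returns namespaced constants that are queried inside the migration class
# without a same-named module defined inside that class body.
def unshadowed_model_calls(source)
  lines = source.lines
  class_line = lines.index { |l| l.match?(/^\s*class\s+\w+\s*<\s*ActiveRecord::Migration/) }
  return [] if class_line.nil?

  # Only look inside the migration class, so a top-level
  # `module Identities` above it does not count as a shadow.
  body = lines[(class_line + 1)..].join
  nested_modules = body.scan(/^\s*module\s+([A-Z]\w*)/).flatten

  body.scan(DATA_CALL)
      .reject { |namespace, _klass| nested_modules.include?(namespace) }
      .map { |namespace, klass| "#{namespace}::#{klass}" }
      .uniq
end

sample = <<~MIGRATION
  class PopulateUuidInUsers < ActiveRecord::Migration[8.0]
    def up
      Identities::User.where(uuid: nil).find_each do |user|
        user.update_columns(uuid: SecureRandom.uuid)
      end
    end
  end
MIGRATION

p unshadowed_model_calls(sample)  # ["Identities::User"]
```

An empty result does not prove a migration is safe. It only means this scan found nothing to flag, so the fresh-database run remains the real test.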

Reference: nine real examples

PR #1596 audited and patched 9 historical data migrations to follow this convention. Read those diffs for working examples that touch single tables, multiple tables, slug generation, and enum value backfills.

How this rule is enforced

Enforcement is manual. The Schema Drift Check CI workflow does not catch the failure mode this rule prevents; it only checks that migration filenames and schema_migrations versions agree. The throwaway-class pattern is enforced by convention and code review, and humans cannot catch this every time.

If you suspect a migration may be broken on a fresh DB, the only way to confirm is to drop, create, and migrate from empty. Note that db:drop deletes your local development data:

    bin/rails db:drop
    bin/rails db:create
    bin/rails db:migrate

Do this before opening a PR if the migration touches data.