-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify the reasoning for Define Model Class Migrations #284
Comments
There are two more things that recently hit me:
class MigrationProduct < ActiveRecord::Base
self.table_name = :products
end
class ModifyDefaultStatusForProducts < ActiveRecord::Migration while the class MigrationUser < ActiveRecord::Base
self.table_name = 'accounts' # not 'users'
end The suggestion would be to define migration model classes inside migration classes: class ModifyDefaultStatusForProducts < ActiveRecord::Migration
class MigrationProduct < ActiveRecord::Base
self.table_name = :products
end
... |
Still, the issue was as following: # migration 1
class AddCategoryToFoos < ActiveRecord::Migration[6.0]
def change
add_column :foos, :category, :string
end
end
# migration 2
class BackfillCategoryToFoos < ActiveRecord::Migration[6.0]
class MigrationFoo < ActiveRecord::Base
self.table_name = 'foos'
end
def change
MigrationFoo.find_each { |foo| foo.update!(category: 'primary') } # <= The real problem is here
end
end
# migration 3
class RestrainCategoryToFoosToNotNull < ActiveRecord::Migration[6.0]
def change
change_column_null :foos, :category, false # BOOM, `PG::NotNullViolation: ERROR: column "category" contains null values`
end
end The problem is with I don't have a good explanation as to why this happened. My theory is that if those two last migrations were executed in the same run, Rails loaded both, and immediately cached model's schema for Might be related to:
But I couldn't find One suggestion would be to dynamically create model classes instead: def change
MigrationFoo = Class.new(ActiveRecord::Base) { self.table_name = 'foos' }
MigrationFoo.find_each { |foo| foo.update!(category: 'primary') }
end but I can't recommend it without testing it. And I have some doubts if schema cache will be updated for a new model pointing to the same table, or an existing cached definition (that e.g. lacks |
Another approach is to use: def change
MigrationFoo.reset_column_information
MigrationFoo.find_each { |foo| foo.update!(category: 'primary') }
end See https://stackoverflow.com/a/9728033/202914 for explanation.
Apparently, migrations are loaded all at once, or somehow column information is shared between I'm on the fence if it's worth adding to the guide, as it's for this relatively rare case when a project tries to keep migrations from early days in a state that allows running |
|
One case when it's not possible to properly use migration model in migration when you use it to schedule an Active Job job: class MigrationUser < ActiveRecord::Base
self.table_name = :users
end
def up
MigrationUser.find_each { |user| SyncWithThirdParty.perform_later(user) }
end as this will end up with a failure to deserialize it, as the global id would use a weird model name. As a workaround, global id key should be overridden in the migration model. |
An example of data corruption (logical):
In the case of One could argue this is a problem with the usage of default scopes, and I'd also agree. Doesn't change the fact that avoiding direct references to your models in migrations prevents this kind of error. Thought it was worth sharing. |
I agree with not using regular models in migrations - that's just asking for trouble, but I strongly disagree with defining model classes inside of migration - that is too cumbersome and goes against the first principle of the rails doctrine (Optimize for programmer happiness), since it's a fact that it causes headaches for engineers. I propose just using raw SQL to change data in a migration like this. I know some people will disagree (maybe even barf a little) when reading this. However, this produces simpler and cleaner code, is much easier for developers to use, but is controversial and not widespread. It's very common in Basecamp (https://discuss.rubyonrails.org/t/patterns-for-data-only-rails-migrations/75320/10) and plays nicely with what the migrations were designed to be. I'll quote the linked reply from DHH here for ease of readability
|
Can you think of a use case when an old migration could cause a serious data corruption?
Yes, it will most probably break
db:migrate
on a clean database.If you refer to a real model in an older migration (e.g. for
update_all
), Rails will cache column information.And subsequent migrations that would alter that table's schema won't affect the column cache.
This will lead to breakages due to an outdated (fixed for the moment when the model was first referred to in migrations) table schema. Specifically, if you add a column in a later migration and
update_all
that new column, this will result in an error.Also, if you run
rake db:migrate db:seed
in one command, this will also affect your seeds, and it will also fail with:The text was updated successfully, but these errors were encountered: