Avoiding train-wrecks in ActiveRecord

By Gavin Morrice on 29 Jul 2023

One of the problems I tend to notice a lot when working on a mature Rails project is that there are train-wrecks everywhere!

To avoid confusion, I don’t mean that I think the code is bad, or that I’m criticising its quality. Specifically, in this context, I am referring to a software anti-pattern that can make code buggier and harder to read by hiding important abstractions behind long, composed chains of methods and scopes.

The following example should demonstrate the problem:

class PetsController < ApplicationController
  def index
    @dogs = PetType.with_four_legs.omnivore.where(weight_kg: 2..60)
    @cats = PetType.with_four_legs.carnivore.where(weight_kg: 1..20)
    @rodents = PetType.with_four_legs.omnivore.where(weight_kg: 0.25..2)
  end
end

In this hypothetical pet store example, we have a controller action that loads a set of dog, cat, and rodent breeds—all of which are represented by the model PetType.

It’s a contrived example to demonstrate the point, but one doesn’t have to look far in a Ruby on Rails codebase to see chains of ActiveRecord scope methods being called on a model or ActiveRecord::Relation object like this. This chain of methods all smooshed together like a crashed train with many carriages is what some people affectionately call a “train-wreck”.

Here’s why it’s a problem…

Why is this a problem?

Readability

When we chain multiple scope methods together like this, it’s not always immediately clear what this chain of methods represents in the context of your application. Someone else reading your code, or even future you, will have to do a fair bit of work mentally composing those scopes before it’s clear what this code is supposed to accomplish.

The PetTypes example is a simple one to understand, but more typically an ActiveRecord train-wreck will look like this:

@transactions = Transaction.system.pending.reviewed.where(created_at: 1.week.ago...)

I just wrote the example above, and even I don’t know what that chain of scopes is supposed to represent 🤷‍♂️

Reliability

Readability aside, this anti-pattern can also make code more prone to bugs, particularly if the same train-wreck of scopes is expressed in more than one place in your code.

Continuing the pet store example, what if we have two separate controllers that load all of the dog PetTypes like this? When future product requirements mean we decide to add an extra scope to our definition of a dog, we must further extend the controller code to include this scope:

 @dogs = PetType.with_four_legs.omnivore.where(weight_kg: 2..60).where(uses_leash: true)

But if we use this same chain of scopes in more than one place, then we have to remember to update each of them in the same way. In a large codebase, these duplicated areas are not always easy to detect, and the particular methods in the chain might be expressed in a different order, further hiding the similarity. Having two inconsistent ways of expressing the same concept in a codebase can often lead to bugs.
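The drift described above can be reproduced in a few lines. The sketch below is plain Ruby rather than ActiveRecord, so that it runs without Rails or a database: Pet and PetQuery are hypothetical stand-ins for the PetType model and its relation, and the two sample animals are invented for illustration.

```ruby
# Hypothetical in-memory stand-ins so the sketch runs without Rails or a database.
Pet = Struct.new(:name, :legs, :diet, :weight_kg, :uses_leash, keyword_init: true)

class PetQuery
  def initialize(pets)
    @pets = pets
  end

  def with_four_legs
    where(legs: 4)
  end

  def omnivore
    where(diet: :omnivore)
  end

  # Keeps pets whose attributes match every condition; ranges match by inclusion.
  def where(conditions)
    matching = @pets.select do |pet|
      conditions.all? { |attribute, expected| expected === pet[attribute] }
    end
    PetQuery.new(matching)
  end

  def names
    @pets.map(&:name)
  end
end

pet_types = PetQuery.new([
  Pet.new(name: "Labrador", legs: 4, diet: :omnivore, weight_kg: 30, uses_leash: true),
  Pet.new(name: "Pot-bellied pig", legs: 4, diet: :omnivore, weight_kg: 50, uses_leash: false)
])

# Call site 1 was updated when the leash rule was introduced.
listing_dogs = pet_types.with_four_legs.omnivore.where(weight_kg: 2..60).where(uses_leash: true)

# Call site 2 expresses the same concept in a different order and was overlooked.
report_dogs = pet_types.omnivore.where(weight_kg: 2..60).with_four_legs

puts listing_dogs.names.inspect
puts report_dogs.names.inspect
```

The two call sites now disagree about what a dog is: the second still counts the pig. Nothing in the code flags the inconsistency, because each chain is valid on its own.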

Testability

Lastly, expressing code like this can make it difficult to unit test classes that consume the class on which the train-wreck is being called. This is because there isn’t a simple and clearly defined interface to test interactions through.

Unit testing our PetsController would require us to create a collection of PetTypes, each with different properties, and then test that the desired instance variables contain the expected PetTypes. This is a very laborious and frustrating way of unit testing controller actions, and can only be achieved by including expensive set-up that is outside of the scope of a controller unit test.

What’s the solution?

From a more fundamental code-design perspective, the problem with the code described in the first pet shop example is not that we are chaining multiple ActiveRecord scopes together. The problem is in where we are chaining the scopes together.

The example implementation relies on the controller holding the definitions of what a dog, cat, and rodent are within its methods, rather than these definitions being made available to the controller by the PetType model. This is an improper use of the respective architectural layers, as we are defining business domain concepts within the application controller layer of our codebase.

A better way to express the code in our pet shop example would be to move these chains into named scopes (which behave like class methods) on the PetType model itself:

class PetsController < ApplicationController
  def index
    @dogs = PetType.dog
    @cats = PetType.cat
    @rodents = PetType.rodent
  end
end

class PetType < ApplicationRecord
  DOG_WEIGHT_RANGE_KG = 2..60
  CAT_WEIGHT_RANGE_KG = 1..20
  RODENT_WEIGHT_RANGE_KG = 0.25..2

  scope :dog, -> {
    with_four_legs.omnivore.where(weight_kg: DOG_WEIGHT_RANGE_KG)
  }
  scope :cat, -> {
    with_four_legs.carnivore.where(weight_kg: CAT_WEIGHT_RANGE_KG)
  }
  scope :rodent, -> {
    with_four_legs.omnivore.where(weight_kg: RODENT_WEIGHT_RANGE_KG)
  }

  # …

end

Not only is this controller action more readable at a glance, but by moving the definitions of dogs, cats, and rodents into the domain model we’ve removed the risk of having multiple expressions for the same concept, because each abstraction is now defined by one specific method. Finally, unit testing our controller is now simple and easy, as we only need to check that the correct scope is called on the PetType and its value is assigned to the right variable. With the scopes stubbed, our tests won’t even have to touch the database.
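To give a feel for how small such a test can be, here is a minimal sketch in plain Ruby. It makes some loud assumptions so that it can run without Rails: PetType is a stand-in whose scopes return recognisable tokens instead of relations, and PetsController is a bare class with readers for its instance variables rather than a real ApplicationController subclass.

```ruby
# Stand-in for the model: each scope returns a token instead of querying a database.
class PetType
  def self.dog
    :dog_scope_result
  end

  def self.cat
    :cat_scope_result
  end

  def self.rodent
    :rodent_scope_result
  end
end

# Bare stand-in for the controller; the readers exist only so the check can inspect it.
class PetsController
  attr_reader :dogs, :cats, :rodents

  def index
    @dogs = PetType.dog
    @cats = PetType.cat
    @rodents = PetType.rodent
  end
end

controller = PetsController.new
controller.index

# Each instance variable should hold exactly what the matching scope returned.
expected_assignments = {
  dogs: :dog_scope_result,
  cats: :cat_scope_result,
  rodents: :rodent_scope_result
}
expected_assignments.each do |reader, expected|
  actual = controller.public_send(reader)
  raise "#{reader} was #{actual.inspect}" unless actual == expected
end
puts "index assigns the result of each scope"
```

In a real Rails test suite you would more likely stub the scopes with your mocking library (for example, RSpec's allow(PetType).to receive(:dog)) than redefine the model, but the shape of the assertion is the same. The scopes themselves would then be covered separately by the model's own tests.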

Explicit abstractions and DDD

In Domain-Driven Design, we describe the problem shown in the first implementation as having too many implicit abstractions.

Dogs, cats, and rodents are evidently meaningful concepts or abstractions within this application, otherwise there wouldn’t be code to collect them together. But until we refactored the code we had no way of explicitly defining what a dog, cat, or rodent type is. The abstraction was implied, but we refactored our code to make it explicit.

By constantly looking for opportunities to refactor our code in this way, we make it more readable and maintainable, as we move towards what Eric Evans calls a Deep Model of our business domain.

It’s generally a good idea to define single-method interfaces for controllers to interact with models through. A good rule of thumb, to help you develop a deep model in your Rails application, is to avoid calling any of the built-in ActiveRecord query methods (where, order, limit, etc.) outside of the model they are being called upon.
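As a rough illustration of that rule of thumb, the sketch below keeps the ordering and limiting primitives inside the model and gives callers one intention-revealing method. It is plain Ruby so that it can run on its own: PetType.heaviest is a hypothetical name that does not appear in the pet shop example, the sample records are invented, and sort_by and first merely stand in for ActiveRecord's order and limit.

```ruby
class PetType
  attr_reader :name, :weight_kg

  def initialize(name, weight_kg)
    @name = name
    @weight_kg = weight_kg
  end

  # In-memory stand-in for the database table (invented sample data).
  def self.all
    @all ||= [
      new("Great Dane", 60),
      new("Hamster", 0.25),
      new("Maine Coon", 8)
    ]
  end

  # Single-method interface: callers ask for the concept, not the query primitives.
  def self.heaviest(count)
    all.sort_by { |pet_type| -pet_type.weight_kg }.first(count)
  end
end

# A caller such as a controller never mentions ordering or limits directly.
heaviest_names = PetType.heaviest(2).map(&:name)
puts heaviest_names.inspect
```

If the definition of "heaviest" changes later, only the model method changes, and every caller picks up the new behaviour.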