Using Nested Selects for Performance in Rails
June 26, 2015

Using Nested Selects for Performance in Rails

by Zach Schneider

Databases are fast, even at performing fairly complex operations. This is easy to forget in the age of ORMs and abstraction and many of us haven’t written a line of raw SQL in months.

But a solid, production-ready SQL database is mature, low-level, and thoroughly optimized for the task it’s designed to do: create, read, update, and delete a set of well-structured records.

It’s important to keep the speed of your database in the back of your mind at all times — especially for Rails developers. The convenience of the ActiveRecord ORM does not always translate into a performant result; even with eager loading and joining, it’s often easy to write an n+1 query that kills production performance. This is especially true with a complex Rails architecture, which can magnify even small mistakes in optimization.

Let’s take a relatively common case.

Suppose we have a simple blog application written in Rails. It has a Post model (with predictable attributes like title, content, and author) and a Comment model (with post_id, body, and author). Now, let’s say we want to display some recent posts in a table, and include a column for the number of comments that have been made on each post.

We might write something like this:

Pretty straightforward. But what happens behind the scenes?

ActiveRecord will first execute a query to get the list of posts (e.g. “select * from posts;”) and then execute a subsequent query for each post to calculate how many comments it has (e.g. “select count(*) from comments where post_id = ?;”). A classic n+1 problem; we end up with an unbounded number of queries as the number of posts increases.

We can solve this problem with eager loading: @posts.eager_load(:comments) will instruct ActiveRecord to include the comments in the initial query and thus prevent any further queries. When it works, eager loading is a simple and elegant solution to n+1 query problems. But unfortunately, it often only actually helps in the most trivial cases.

To demonstrate this, let’s now posit that comments has an additional is_approved boolean attribute, so inappropriate comments can be removed by a moderator. Obviously, we only want to include comments that are visible in the comment count. But there is no easy way to do this in ActiveRecord without sacrificing eager loading.

Adding a new where query to the comments relation causes ActiveRecord to discard the eager loaded data and perform a new query with an appropriate WHERE clause; the n+1 problem has come roaring back. As you can imagine, this quandary only worsens as you add layers of complexity to the application.

What if we were using raw SQL to load the posts and comment counts? This problem would be easy to solve. We would write a single query, such as:

But unfortunately, there is not a convenient way of achieving this result built into the ActiveRecord ORM, short of running a completely custom SQL query from scratch. The exceptional convenience of the ActiveRecord paradigm has come into conflict with the expedient performance and flexibility of raw SQL.

To solve this problem at Aha! (which is product roadmap software), we built and open-sourced a Ruby gem called calculated_attributes. The gem extends ActiveRecord to permit definition of calculated attributes on models using a bit of raw SQL in a lambda. Returning to the example above, calculated_attributes would allow us to define a :comments_count attribute on the Post model:

And then reference it in the controller/view to automatically incorporate the comments count in the ActiveRecord SQL query:

Presto! An elegant solution to eliminate the n+1 query problem and incorporate the flexibility of raw SQL into the ActiveRecord paradigm.

ORMs and abstractions such as ActiveRecord are incredibly useful in eliminating boilerplate code and allowing Agile development teams to rapidly implement new features. But they also allow for mounting performance problems if used carelessly.

It is important when writing Rails to think carefully about what the abstractions are doing behind the scenes and find or build in optimizations when necessary. If you want to push the limits of what is possible with Rails at at a rapidly growing SaaS company founded by two Silicon Valley veterans, check out our careers page.

When have you been burned by a performance problem caused by a naive abstraction? Join the discussion on Hacker News.

Zach Schneider

Zach Schneider

Zach was a Director of Software Engineering at Aha! — the world’s #1 product development software.

Follow Aha!

Related articles

Most Popular New Aha! Features Launched in Q4 2021
December 22, 2021
Most Popular New Aha! Features Launched in Q4 2021

What does it take for product teams to be happy? I believe it comes down to delivering real value — for customers and the business. You need to have the right people in…

Share Visual Roadmaps With Customers via Ideas Portals
January 19, 2022
Share Visual Roadmaps With Customers via Ideas Portals

Eighteen thousand ideas portals. Three million ideas. Altogether our customers collect an incredible amount of information from their own customers via portals in Aha!…

Create One Unified Product and Engineering Backlog in Aha! Develop
December 2, 2021
Create One Unified Product and Engineering Backlog in Aha! Develop

How do you balance building new product functionality against improving technical capabilities? Many product development teams struggle with this challenge. You want to…

Aha! Commits $400,000 to Charitable Giving in 2022
December 20, 2021
Aha! Commits $400,000 to Charitable Giving in 2022

Learn about the history of our philanthropic giving program Aha! Cares and the nonprofit organizations that are receiving funding in 2022.