Alex Bartlow · 2022-04-22 · postgresql, devops, backend, logging

Optimizing with the PostgreSQL deterministic query planner

Feed the planner, trust the plan

Of all the Aha! engineering tool expenses, the money I'm happiest to spend is on a big RDS instance running PostgreSQL. It powers our full-text search, our reporting, and even some of our analytics. If you run it at scale, I recommend dedicating a Friday to curling up somewhere cozy with a cup of coffee and reading the fine manual. It will be time well spent.

PostgreSQL is an easy tool to start using. The defaults provide a fast, secure experience out of the box. However, once you scale to a certain size, performance can start to degrade in some cases. Most people blame the query planner when this happens and start trying to find ways to "trick" the planner into getting a high-performance query.

The problem with these tricks is that they usually end up biting you later — after you've grown as a business for a couple of years and have far higher throughput on your database. If there's one thing to take away from this article, it is this:

The PostgreSQL query planner will always discover the optimum query plan, given its statistical knowledge of your system and your data.

We've discovered some tuning settings in the manual that give the planner the information it needs to do its job correctly. I'm going to cover three settings that we've adjusted over the last couple of years that have improved our query performance up to 20,000%.

Set the right cost for page access

One decision the planner sometimes makes is to full-scan your table instead of using an index. This may seem counter-intuitive, but remember, the planner will always discover the optimum plan. Using an index requires reading random pages from disk; whereas reading the full table gets to read pages sequentially, which is faster.

For a table of a certain size, on a slow hard drive not already cached in ram, with a query that will return a lot of rows — the full scan will be faster. The planner is right. That is, unless you're running a well-provisioned database (caching a large fraction of your data ^cacherate) with a fast NVME disk. (It's 2022 — these things are cheap; use one!) In that case, the index would be faster. How do we tell the planner that we have well-cached data and that the disk is really good at random reads?

The query planner's estimates for the relative speed difference between a random and a sequential read is determined by (random_page_cost and seq_page_cost). The default values for these columns assume that a random read from disk is going to be 40x slower than a sequential read from disk and that 90% of your disk pages will be cached in memory. These defaults assume a modestly sized database server using spinning-metal storage device.

Using a fast NVME drive, you don't have a 40x latency multiple for random access — that number is probably closer to 5–10x. Additionally, if your ram cache rate is close to 99%, then the random_page_cost parameter should be set to something like 1.05. (Multiply how much slower random access is than sequential by the inverse of your cache rate, and you will get a proper value for random_page_cost. 5 * (1 / 0.99) = 1.05)

After implementing these cost changes, we noticed PostgreSQL would stop deciding to full-scan tables unless the tables were very small. This lead to a drastic improvement in our query performance around custom pivots.

Set up statistics for dependent columns

Let's assume the following query, with all three relevant columns indexed:

select count(*) from features where account_id=1 and product_id=2 and initiative_id=3

By default, PostgreSQL assumes that each column is independent of all other columns on your table. This can lead to PostgreSQL deciding to do a bitmap merge of all three indexes, assuming that you're going to get a different set of results for each index, like this:

A Venn diagram showing three independently overlapping circles.

However, in reality, the data looks like this:

A Venn diagram showing one large circle containing a smaller circle, which mostly contains a third small circle.

Given an individual product_id or initiative_id on a feature, all features with that initiative_id or product_id will have the same account_id. Moreover, features with the same initiative_id will be highly likely to have the same product_id. So how do we tell the planner that if we know the initiative_id, we can basically ignore the account_id and product_id checks for the initial data fetch process?

These statistical dependencies can be leveraged by the query planner to decide that only one index fetch on initiative_id is necessary. That is the most selective index, so it will give us the smallest result set. The result is likely to be small and the rows that have the initiative_id are likely to have the same product_id and account_id. So doing the one index lookup and then filtering the results in memory is almost assuredly faster than setting up a bitmap scan, which requires going out to disk for random page reads.

Use CREATE STATISTICS features_stats (dependencies) on account_id, product_id, initiative_id from features; analyze features. PostgreSQL will sample those columns and keep track of the coefficient of correlation between their values, coming up with better query plans. In this example, the index on initiative_id is going to be the most selective (largest total cardinality), which means that selecting on that will give us a small number of rows. Then, since all items with initiative_id=3 are probably going to have product_id=2, are certainly going to have account_id=1, and the result set from the initiative_id index can be filtered in memory, all of those rows will likely make it through the check anyway.

The example I describe here is not hypothetical. We found a pathological case where a query for one customer was taking 20 seconds. By adding dependent statistics to the table, the same query's time was reduced to less than a millisecond = 20,000% improvement.

Beware the join collapse

Many candidates that I interview say something to the effect of "PostgreSQL falls over when you have too many joins." This is a commonly known phenomenon but very few candidates know the exact reason for this.

PostgresSQL's query planner by default will exhaust every possibility trying to determine the fastest way to execute a query you request. The complexity of this is exponential with the amount of joins added. Attempting to plan the join of many tables together when the join order is not constrained could take longer than doing the actual query.

To avoid this planner's analysis paralysis, PostgreSQL also has a genetic algorithm-based query optimizer, which attempts to find a heuristic solution. Most of the time, it's pretty good but sometimes the results are pretty bad. The decision to use heuristic planning instead of deterministic planning is determined by the geqo_threshold. The amount of work the heuristic planner does is determined by geqo_effort , which is a simple 1-10 scale with a default to 5.

Since the cost of a bad query is far more than the cost of some extra time planning, we've opted to increase our geqo_threshold to 20. This covers the vast majority of our reporting queries. We're also looking into increasing our geqo_effort values, since when we're joining more than 20 tables together, it's worth spending extra time planning to ensure that we get the query plan right. This is a continued area of experimentation for us - Aha! has a lot of very small, optimized queries; it also has some very large, gnarly ones in order to report on customer's data in the way the customer desires. Striking a balance between these extremes appears to be a bit of an art form.

Additionally, it pays to increase the values of join_collapse_limit and from_collapse_limit. These parameters determine the maximum number of tables for which PostgreSQL is willing to optimize the join order. And if you have a lot of tables, get the query plan right so you can save far more time executing the query.

Feed the planner, trust the plan

The deterministic query planner will always find the optimal result. If the planner is generating bad plans, it means you need to find ways to give it better statistical data. Instead of "tricks," use the prescribed methods that come with PostgreSQL to instruct the planner on the shape of your data. You'll be very happy with the result. And seriously, read the manual.

Sign up for a free trial of Aha! Develop

Aha! Develop is a fully extendable agile development tool. Prioritize the backlog, estimate work, and plan sprints. If you are interested in an integrated product development approach, use Aha! Roadmaps and Aha! Develop together. Sign up for a free 30-day trial or join a live demo to see why more than 5,000 companies trust our software to build lovable products and be happy doing it.

You can check your database cache rate with the following query. You want your cache hit rate to be near 99%. If it's lower than 95% for heavily accessed tables, consider increasing your database's RAM.
```
SELECT
  sum(heap_blks_read) as heap_read,
  sum(heap_blks_hit)  as heap_hit,
  sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM
  pg_statio_user_tables;
```
↩