Select N+1 Problem
For developers, using an ORM (Object-relational mapping) can significantly simplify their lives, but it has its own set of catches that one needs to be aware of. One of the more serious ones is the Select N + 1 issue. In this article, you’ll learn about the N+1 problem and how to fix and prevent them.
The N + 1 problem is a performance problem that is recognised as an inefficient way to query a database where an application makes N + 1 database calls. So basically, you’re spamming your database with small queries instead of using one or two more complex ones. N is the critical issue here; N being large will cause performance to degrade by making extra database calls for a single transaction. However, this is something that can be overlooked if the database is small.
Okay, now let’s define the problem more sensibly. Why do we call this an N+1 problem? What is the N? what is the 1 here? Here’s how it usually goes down: Consider you have two database tables with a parent/child relationship. Collection of parent objects ( or called database rows/records) and each Parent has a collection of Child objects ( rows) where the relationship is a 1-to-many relationship. Let’s say you want to iterate through all the records, iterate through all the parents, and for each one, print out a list of the children. So you do this, the naive object-relational implementation:
SELECT id FROM Parent
and then executing a query for each record:
SELECT * FROM Child WHERE parent_id = ?
You have one select for the Parent (There is your 1), and then you have N additional selects for each Child (There is your N), where N is the total number of Child records.
The above approach isn’t necessarily a poor way of doing it, especially if there aren’t many parents or children. But what if you’re dealing with thousands of parent and child records? Suddenly each transaction is calling the database thousands of times, which is not an ideal situation. The problem could only get worse.
The solution is developer should tell in advance to the ORM that it will need additional data. The ORM can then fetch that data in advance. This solution is called Earger Fetching. Through eager fetching, whenever the top-level object is loaded, the associated collection will be loaded entirely as well, regardless if it will be used or not. However, one thing that should be noted is, it is common and recommended to map most associations for lazy Fetching, so knowing to use which fetching is very important.
The N+1 query problem is a general problem that could already exist in your codebase, and the hit to performance can be pretty significant where N is large. It’s hidden in plain sight and is costing you money. So it’s best to avoid this query pattern even when your app is small, so it doesn’t become a problem as you scale.