Single vs Split Query in EF Core
Single Queries and their Problems
By default while working with LINQ queries in EF queries are treated as an single queries unless we define. Let’s try to understand it with an example.
using (var context = new BloggingContext())
{
var blogs = context.Blogs
.Include(b => b.Posts)
.Include(b => b.Contributors)
.ToList();
}
SELECT
[b].[Id],
[b].[Name],
[p].[Id],
[p].[BlogId],
[p].[Title],
[c].[Id],
[c].[BlogId],
[c].[FirstName],
[c].[LastName]
FROM
[Blogs]AS[b]
LEFT JOIN [Posts]AS[p]ON[b].[Id]=[p].[BlogId]
LEFT JOIN [Contributors]AS[c]ON[b].[Id]=[c].[BlogId]
ORDER BY
[b].[Id],
[p].[Id];
You can notice that from generated SQL that this query was treated as a single query. This approach can create few issues for us.
- Cartesian Explosion
- Data Duplication
Cartesian Explosion
In this example, since both Posts and Contributors are collection navigations of Blog they’re at the same level - relational databases return a cross product: each row from Posts is joined with each row from Contributors. This means that if a given blog has 10 posts and 10 contributors, the database returns 100 rows for that single blog. This phenomenon - sometimes called cartesian explosion.
Data Duplication
If we don’t specify SELECT then it will bring all columns from all tables if we are joining 3 tables it will bring all columns of 3 tables. For few numbers of columns it will not create issue but for large number of columns it is going to give us a performance heart attack.
Split Queries and its Benefits
To solve these issues EF has introduced a feature of Split Query , it can solve both issues of Single Queries. Let’s see through example.
using (var context = new BloggingContext())
{
var blogs = context.Blogs
.Include(blog => blog.Posts)
.AsSplitQuery()
.ToList();
}
-- First query for Blogs
SELECT
[b].[BlogId],
[b].[OwnerId],
[b].[Rating],
[b].[Url]
FROM [Blogs] AS [b]
ORDER BY [b].[BlogId];
-- Second query for Posts
SELECT
[p].[PostId],
[p].[AuthorId],
[p].[BlogId],
[p].[Content],
[p].[Rating],
[p].[Title],
[b].[BlogId]
FROM [Blogs] AS [b]
INNER JOIN [Posts] AS [p]
ON [b].[BlogId] = [p].[BlogId]
ORDER BY [b].[BlogId];
We can globally enable split query behavior and if not needed we can override it as well.
Although Split Query has some advantages as described above but it has some draw backs as well.
-
While most databases guarantee data consistency for single queries, no such guarantees exist for multiple queries if data updated concurrently.
-
Each query currently implies an additional network roundtrip to your database which can cause degrade in performance.
-
While some databases allow consuming the results of multiple queries at the same time (SQL Server with MARS, SQLite), most allow only a single query to be active at any given point.
Whenever you're ready, there are 3 ways I can help you:
- Subscribe to my youtube channel : For in-depth tutorials, coding tips, and industry insights.
- Promote yourself to 9,000+ subscribers : By sponsoring this newsletter
- Patreon community : Get access to all of my blogs and articles at one place