HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

Entity Framework query optimization

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
optimizationqueryframeworkentity

Problem

A user can track a show, and mark episodes and seasons of that show as watched. To support this I have the models below:

Models

public class Show
{
    public int ID { get; set; }
    public string Name { get; set; }
    ...

    public ICollection Seasons { get; set; }
    public ICollection UserShows { get; set; }

    public Show()
    {
        this.Seasons = new List();
        this.UserShows = new List();
    }
}

public class Season
{
    public int ID { get; set; }
    public string Name { get; set; }
    ...

    public ICollection Episodes { get; set; }
    public ICollection UserSeasons { get; set; }

    public Season()
    {
        this.Episodes = new List();
        this.UserSeasons = new List();
    }
}

public class Episode
{
    public int ID { get; set; }
    public string Name { get; set; }
    ...

    public ICollection UserEpisodes { get; set; }
    public Season Season { get; set; }

    public Episode()
    {
        this.UserEpisodes = new List();
        this.Season = new Season();
    }
}

public class WatchedSeason
{
    public string UserID { get; set; }
    public int SeasonID { get; set; }
    public bool Watched { get; set; }

    public ApplicationUser User { get; set; }
    public Season Season { get; set; }
}

public class WatchedEpisode
{
    public string UserID { get; set; }
    public int EpisodeID { get; set; }
    public bool Watched { get; set; }

    public ApplicationUser User { get; set; }
    public Episode Episode { get; set; }
}


The troubling query is trying to get a list of all shows the user is tracking, along with the latest episode they have watched for each tracking show.

ShowController.cs

```
public ActionResult UserShows()
{
var userID = User.Identity.GetUserId();

var shows = db.UserShows
.Where(x => x.UserID == userID && x.Tracking)
.Include(x => x.Show.Seasons.Select(y => y.UserSeasons))
.Include(x => x.Show.Seasons.Select(y => y.Episodes.Select(z => z.UserEpisode

Solution

The Problem

The performance killers are the Includes this query:

var shows = db.UserShows
    .Where(x => x.UserID == userID && x.Tracking)
    .Include(x => x.Show.Seasons.Select(y => y.UserSeasons))
    .Include(x => x.Show.Seasons.Select(y => y.Episodes.Select(z => z.UserEpisodes)))
    .ToList()


These Includes generate a very wide query (as you show), because all fields of as many as six tables are queried. Often, when tackling performance troubles, people focus on reducing the number of rows that are fetched by a query, but reducing the number of queried fields can be at least as beneficial. This is especially true for ORMs, where materializing entity objects can take a significant amount of time.

So it would make a lot of difference if you'd project to the view models, ShowViewModel and EpisodeViewModel in the LINQ query as IQueryable, i.e. not after ToList(). This would narrow down the query to only the fields required to populate these models.

However, you don't do this, obviously because this function call GetLatestEpisode is not supported in a LINQ-to-Enties query.

Then, of course, you also query too many records. In the end, you only need data from one Show and one Episode, but because of the Includes you fetch all seasons and all episodes of a show from the database.

The Solution

I think you can both reduce the number of queried rows and fields if you start the query at the bottom.

If a user watched an episode, they obviously also watched its season and its show. So if you get the latest episode directly, you don't have to worry about getting watched seasons and shows any more. You don't need this GetLatestEpisode function, because you start by getting the latest episode. Further, you can get required Show data by accessing parent navigation properties. Like this (only showing the essentials):

var shows = context.WatchedEpisodes
                   .Where(we => we.UserID == userId
                             && we.Watched)
                   .GroupBy(we => we.Episode.Season.Show)
                   .Select(grp => grp.OrderBy(we => we.Episode.Season.SeasonNumber)
                                     .ThenBy(we => we.Episode.EpisodeNumber)
                                     .Select(we => new ShowViewModel
                                     {
                                         ShowID = grp.Key.ID,
                                         Name = grp.Key.Name,
                                         // etc.
                                         LatestEpisode = new EpisodeViewModel
                                         {
                                             EpisodeID = we.EpisodeID,
                                             // etc.
                                         }
                                     }).FirstOrDefault());


Now you get only one record per show in the result set and only the required fields are included in the SELECT. You'll still see a number of joins, but they don't cause any widening of the query, as Includes do.

Code Snippets

var shows = db.UserShows
    .Where(x => x.UserID == userID && x.Tracking)
    .Include(x => x.Show.Seasons.Select(y => y.UserSeasons))
    .Include(x => x.Show.Seasons.Select(y => y.Episodes.Select(z => z.UserEpisodes)))
    .ToList()
var shows = context.WatchedEpisodes
                   .Where(we => we.UserID == userId
                             && we.Watched)
                   .GroupBy(we => we.Episode.Season.Show)
                   .Select(grp => grp.OrderBy(we => we.Episode.Season.SeasonNumber)
                                     .ThenBy(we => we.Episode.EpisodeNumber)
                                     .Select(we => new ShowViewModel
                                     {
                                         ShowID = grp.Key.ID,
                                         Name = grp.Key.Name,
                                         // etc.
                                         LatestEpisode = new EpisodeViewModel
                                         {
                                             EpisodeID = we.EpisodeID,
                                             // etc.
                                         }
                                     }).FirstOrDefault());

Context

StackExchange Code Review Q#135479, answer score: 3

Revisions (0)

No revisions yet.