HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMajor

Method returning IEnumerable<T> should ToList() or not

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
methodreturningshouldnottolistienumerable

Problem

I can't decide whether my method that returns an IEnumerable should itself be lazy or whether it should build a list and return that. I often opt for the latter so I can be sure the enumeration of each value isn't performed multiple times.
For example:

public IEnumerable GetUsers()
{
    var allDepartments = GetAllDepartments(active: true); // Returns IEnumerable
    var allUsers = GetAllUserDepartments(active: true); // Returns IEnumerable
    var users
        = allDepartments
            .Join(allUsers, x => x.Department, y => y.Department, (x, y) => y, StringComparer.OrdinalIgnoreCase)
            .Distinct()
            .Select(x => GetUserProfile(x))
            .ToList(); // Is this recommended or not

    return users;
}


The key part is each enumeration is doing something (GetUserProfile) non trivial, it might be expensive, it might not, but what's the recommendation for a method that returns an IEnumerable? Should it care about whether the caller might enumerate it several times or not?

Assume only IEnumerable functionality is required by the caller and my question can be reworded into:

How do I signify to the caller that each enumeration may be expensive?

How expensive it is, may change their implementation decisions. If they enumerate the whole lot several times, then a ToList() makes sense to them to perform. If they don't enumerate the whole lot, then a ToList() would be a waste for me (or the caller) to perform.

Solution

Does code that calls the method always expect List functionality (access by index, etc.)? Return a List. Does code that calls the method only expect to iterate over it? Return an IEnumerable.

You shouldn't care about what the caller does with it, because the return type clearly states what the returned value is capable of doing. Any caller that gets an IEnumerable result knows that if they want indexed access of the result, they will have to convert to a List, because IEnumerable simple isn't capable of it until it's been enumerated and put into an indexed structure. Don't assume that the callers are stupid, otherwise you end up taking functionality away from them. For example, by returning a List, you've taken away the ability to stream results which can have its own performance benefits. Your implementation may change, but the caller can always turn an IEnumerable into a List if they need to.

Now that you've decided what the return type should be, you have to decide what the implementation should use. Deferred execution has benefits, but can also be confusing. Take this example:

public static IEnumerable GetRecordIds()
{
    return dbContext.Records.Select(r => r.Id);
}

IEnumerable currentRecordIds = GetRecordIds();

dbContext.Add(new Record { Id = 7 });

// Includes "7", despite GetRecordIds() being called before the Add():
currentRecordIds.Dump();


This can be "remedied" by GetRecordIds calling ToList before returning. The "correct" use here simply depends on what is expected of the class ("live" results, or time of call results).

Again, don't assume the caller is stupid, and don't take away functionality from the caller by making assumptions about how it will be used. Remember, the interface tells you what is expected to be returned. IEnumerable only means you are getting something that can be iterated over (potentially streaming results and making use of deferred execution), and List only means you're getting an in-memory collection that can be added to, removed from, accessed by index, etc.

Edit - To address your "Update 1":

If you are really concerned, you might add some information into the documentation, but I don't think it's necessary. The IEnumerable interface doesn't claim to make any promises about performance on repeat enumeration (or even each iteration for that matter), so it's on the caller to handle it intelligently.

Code Snippets

public static IEnumerable<int> GetRecordIds()
{
    return dbContext.Records.Select(r => r.Id);
}

IEnumerable<int> currentRecordIds = GetRecordIds();

dbContext.Add(new Record { Id = 7 });

// Includes "7", despite GetRecordIds() being called before the Add():
currentRecordIds.Dump();

Context

StackExchange Code Review Q#66756, answer score: 27

Revisions (0)

No revisions yet.