Differences Between IEnumerable And IQueryable And Its Memory Usage

IEnumerable and IQueryable are two of the most commonly used interfaces in .NET for dealing with collections of data. While both of these interfaces provide a way to access collections of data, they differ in their implementation and usage, as well as in their memory usage.

IEnumerable is an interface that is part of the System.Collections namespace. It is used to represent a collection of elements that can be enumerated. An enumerable collection is simply a collection that can be looped over in a foreach statement. An example of an enumerable collection would be a List, an array, or a Dictionary.

On the other hand, IQueryable is a more specialized interface that is part of the System.Linq namespace. It is used to represent a queryable collection of data. A queryable collection is a collection that can be queried using LINQ (Language Integrated Query) to filter, sort, and project data. An example of a queryable collection would be an Entity Framework DbSet or a LINQ to SQL Table.

The key difference between IEnumerable and IQueryable is how they handle data retrieval.

When using IEnumerable, data is retrieved and processed in memory. This means that if you have an extensive collection of data, you will be required to load all of the data into memory, which can lead to performance issues. 

On the other hand, IQueryable allows you to perform operations on the data without having to load all of the data into memory. Instead, IQueryable generates an expression tree representing the query, and the data is only retrieved from the source when the query is executed. This means that IQueryable can be much more efficient when working with large collections of data. 

Here is an example of how Entity Framework can use expression trees to optimize a LINQ query. Let's say we have a simple DbContext with a Blog entity and a Posts entity, where each blog can have many posts,

public class Blog
{
    public int BlogId { get; set; }
    public string Name { get; set; }
    public ICollection<Post> Posts { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public int BlogId { get; set; }
    public Blog Blog { get; set; }
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }
}
C#

Now let's say we want to retrieve all the blog posts for a specific blog, sorted by creation date,

int blogId = 1;
using(var context = new BloggingContext()) {
    var query = context.Posts.Where(p => p.BlogId == blogId).OrderByDescending(p => p.DateCreated).ToList();
}
C#

When this query is executed, Entity Framework will generate an expression tree representing the LINQ query like this:

var queryExpression = Expression.Call(
    typeof(Queryable),
    "Where",
    new Type[] { typeof(Post) },
    Expression.Constant(context.Posts),
    Expression.Lambda<Func<Post, bool>>(predicate, parameterExpression));

queryExpression = Expression.Call(
    typeof(Queryable),
    "OrderByDescending",
    new Type[] { typeof(Post), typeof(DateTime) },
    queryExpression,
    Expression.Lambda<Func<Post, DateTime>>(orderExpression, parameterExpression));

var query = context.Posts.Provider.CreateQuery<Post>(queryExpression).ToList();
C#

The queryExpression is an expression tree representing the entire query, including the Where and OrderByDescending clauses. This expression tree is then used to generate the SQL that will be executed against the database.

Showing you another similar example where an expression tree is used to generate the SQL that will be executed against the database. Example of how the expression tree generated by Entity Framework is used to generate SQL and execute a query against the database:

int blogId = 1;
using (var context = new BloggingContext())
{
    var query = context.Posts
        .Where(p => p.BlogId == blogId)
        .OrderByDescending(p => p.DateCreated)
        .ToList();
}
C#

The expression tree generated by Entity Framework for this query is:

var queryExpression = Expression.Call(
    typeof(Queryable),
    "Where",
    new Type[] { typeof(Post) },
    Expression.Constant(context.Posts),
    Expression.Lambda<Func<Post, bool>>(predicate, parameterExpression));

queryExpression = Expression.Call(
    typeof(Queryable),
    "OrderByDescending",
    new Type[] { typeof(Post), typeof(DateTime) },
    queryExpression,
    Expression.Lambda<Func<Post, DateTime>>(orderExpression, parameterExpression));
var query = context.Posts.Provider.CreateQuery<Post>(queryExpression).ToList();
C#

This expression tree represents the entire query, including the Where and OrderByDescending clauses. Entity Framework uses this expression tree to generate SQL that will be executed against the database.

Here is an example of the SQL that Entity Framework might generate for this query:

SELECT * FROM Posts
WHERE BlogId = 1
ORDER BY DateCreated DESC
SQL

This SQL query is generated by Entity Framework by analyzing the expression tree and translating it into SQL. This SQL query is then executed against the database, and the results are returned to the application as a list of Post objects. 

Entity Framework uses expression trees to analyze and optimize LINQ queries and then generates SQL based on the expression tree that will be executed against the database. This allows for more efficient querying and better performance when working with large datasets. Entity Framework uses the expression tree to analyze the query and optimize the SQL that is generated. For example, it may generate a single SQL statement that combines the Where and OrderByDescending clauses rather than executing two separate SQL statements. This can reduce the number of database round-trips needed to execute the query and improve performance. While this is just a simple example, Entity Framework can use expression trees to optimize much more complex queries, making it a powerful tool for querying databases. 

The use of IQueryable can also result in improved performance because the data retrieval and processing are deferred until the actual results are needed, which can reduce the amount of data that needs to be retrieved from the data source. Another difference between IEnumerable and IQueryable is the way they handle data manipulation. When using IEnumerable, data manipulation is performed in memory. Any changes you make to the data will be performed in memory and will not be reflected in the data source. For example, if you have a list of integers and want to filter out all the even numbers, you can use the Where method to filter the data. The Where method will create a new in-memory collection that contains only the elements that match the specified condition. This new collection will be a separate object from the original list, and any changes made will not be reflected in the original data source. 

If you want to sort the data, you can use the OrderBy method to sort the data in memory. The OrderBy method will create a new in-memory collection containing the sorted elements. Again, any changes made to this collection will not be reflected in the original data source. When working with IEnumerable, data manipulation is performed in memory. Any changes made to the data are done on the in-memory representation of the data rather than being reflected in the underlying data source. On the other hand, IQueryable allows you to perform data manipulation in the data source, which means that any changes you make to the data will be reflected in the data source.

How IQueryable handle data manipulation

When working with IQueryable, data manipulation is performed in the underlying data source rather than in memory. This means that any changes you make to the data are reflected in the underlying data source rather than just in the in-memory representation of the data. For example, if you have an Entity Framework DbSet or a LINQ to SQL Table, you can use LINQ to filter, sort, and project the data in the underlying database.  

Let me share a sample code here for better understanding, an example of how to use IQueryable to perform data manipulation on a Blog and Post example:

using (var context = new BloggingContext())
{
    // Get a queryable list of all posts
    IQueryable<Post> queryablePosts = context.Posts;

    // Sort the list by date created
    IQueryable<Post> sortedPosts = queryablePosts.OrderBy(post => post.DateCreated);

    // Filter the list to only include posts with the word "Entity Framework" in the title
    IQueryable<Post> filteredPosts = sortedPosts.Where(post => post.Title.Contains("Entity Framework"));

    // Take the first 10 posts from the filtered list
    IQueryable<Post> pagedPosts = filteredPosts.Take(10);

    // Materialize the results into a list
    List<Post> result = pagedPosts.ToList();
}
C#

In this example, we get a Queryable list of all Post objects from the database. We then use the OrderBy method to sort the list by the DateCreated property, which generates an expression tree that represents the sort operation. We then use the Where method to filter the sorted list to only include posts with the word "Entity Framework" in the title, generating another expression tree representing the filter operation. Finally, we use the Take method to take the first 10 posts from the filtered list, generating another expression tree representing the paging operation. We then materialize the results into a list using the ToList method, which executes the entire expression tree against the database and returns the result. This allows us to perform data manipulation operations on the Post objects in the database without having to load all the objects into memory first, resulting in better performance and more efficient use of resources. In case you are interested in the expression tree generated by Entity Framework for the above example I provided earlier:

var queryablePosts = context.Posts;
var sortedPosts = queryablePosts.OrderBy(post => post.DateCreated);
var filteredPosts = sortedPosts.Where(post => post.Title.Contains("Entity Framework"));
var pagedPosts = filteredPosts.Take(10);

var expression = pagedPosts.Expression;
C#

The final expression tree generated by Entity Framework is:

.Call(System.Linq.Queryable.Take(
    .Call(System.Linq.Queryable.Where(
        .Call(System.Linq.Queryable.OrderBy(
            .Call(System.Linq.Queryable.Select(
                .Constant(value: System.Data.Entity.Internal.Linq.InternalSet`1[ConsoleApp1.Post]),
                '($it) => $it'
            )),
            '($it) => $it.DateCreated'
        )),
        '($it) => $it.Title.Contains("Entity Framework")'
    )),
    10)
)
C#

This expression tree represents the entire query, including the OrderBy, Where, and Take clauses. Entity Framework uses this expression tree to generate SQL that will be executed against the database. Here is an example of the SQL that Entity Framework might generate for this query:

SELECT TOP 10
    [PostTable].[Id] AS [Id],
    [PostTable].[Title] AS [Title],
    [PostTable].[Content] AS [Content],
    [PostTable].[DateCreated] AS [DateCreated],
    [PostTable].[BlogId] AS [BlogId]
FROM
    (SELECT
        [PostTable].[Id] AS [Id],
        [PostTable].[Title] AS [Title],
        [PostTable].[Content] AS [Content],
        [PostTable].[DateCreated] AS [DateCreated],
        [PostTable].[BlogId] AS [BlogId]
     FROM
        [dbo].[Posts] AS [PostTable]
     WHERE
        [PostTable].[Title] LIKE N'%Entity Framework%'
     ORDER BY
        [PostTable].[DateCreated] DESC
    ) AS [PostTable]
SQL

This SQL query is generated by Entity Framework by analyzing the expression tree and translating it into SQL. This SQL query is then executed against the database, and the results are returned to the application as a list of Post objects. 

Conclusion 

IEnumerable and IQueryable are both useful interfaces for dealing with collections of data, but they differ in their implementation, usage, and memory usage. IEnumerable is best used for simple enumeration of data, while IQueryable is best used for querying large collections of data. When working with large collections of data, it is recommended to use IQueryable to take advantage of its ability to perform operations on data without having to load all of the data into memory.