Before reading this article, I recommend looking at my earlier articles: Minimal API in .NET 8: A Simplified Approach to Build Web APIs and Middleware in Minimal API with .NET 8.
What is rate limiting?
Rate limiting is a method employed to manage the volume of incoming traffic to a web application or API by restricting the number of requests permitted within a specified timeframe. Implementing rate limiting can enhance the overall performance of the site or application and prevent it from becoming unresponsive.
Why Use Rate Limiting?
- It facilitates commercial viability through subscription models, where users pay for a set number of API calls. This encourages upgrades for higher usage.
- It acts as a defense against malicious activities like DoS attacks. By restricting the number of API calls, it prevents hackers from overwhelming the system with automated bot requests, preserving service availability.
- In cloud-based APIs using a "pay as you go" model, rate limiting helps regulate traffic according to infrastructure capacity. This ensures optimal resource usage and keeps API operation within the constraints of the underlying infrastructure.
Which type of rate limiter should we use?
- Fixed window limit: caps the number of requests allowed within a fixed time window.
- Concurrency limit: limits the maximum number of concurrent requests at a time.
- Token bucket limit: limits requests based on a defined number of allowed requests, or "tokens", which are replenished over time.
- Sliding window limit: works like the fixed window limit, but divides the window into segments that slide over time.
Implementing Rate Limiting Middleware
To add the rate-limiting middleware, we first register the required services in the container:
builder.Services.AddRateLimiter(options =>
{
});
Then we add the middleware to the request pipeline:
app.UseRateLimiter();
This project is a basic Minimal API (if you need a refresher, check out my earlier post) featuring a single endpoint that displays a collection of "GitHubIssues". I opted to maintain a slim Program.cs with minimal configuration while incorporating each limiter into integration tests.
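For context, here is a minimal sketch of what such a Program.cs might look like; the "/issues" route, the anonymous issue shape, and the partial Program class are assumptions for illustration (the actual code lives in the repository linked at the end).
var builder = WebApplication.CreateBuilder(args);

// Register the rate limiter services; the individual limiters are configured per test in this project.
builder.Services.AddRateLimiter(options =>
{
    // Return 429 instead of the default 503 when a request is rejected.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});

var app = builder.Build();

// Add the rate-limiting middleware to the pipeline.
app.UseRateLimiter();

// Single endpoint that lists a collection of "GitHubIssues" (shape assumed for illustration).
app.MapGet("/issues", () => Results.Ok(new[]
{
    new { Id = 1, Title = "Sample issue" }
}));

app.Run();

// Exposed so WebApplicationFactory<Program> in the integration tests can find the entry point.
public partial class Program { }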
Fixed Window Limiter
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc.Testing;
using MinimalApi.Constants;
using System.Net;
using System.Threading.RateLimiting;
namespace MinimalApi.Tests.Base;
public class FixedWindowLimiterTests : IntegrationTestBase
{
public FixedWindowLimiterTests(WebApplicationFactory<Program> factory)
: base(factory) { }
[Fact]
public async Task ListIssues_WhenFixedWindowLimitOf10RequestsPerMinute_5out15RequestsShouldBeRejected()
{
// Arrange
var numberOfRequests = 15;
var permitLimit = 10;
var timeWindow = TimeSpan.FromMinutes(1);
var results = new List<HttpResponseMessage>();
using var client = CreateClient(services =>
services.AddRateLimiter(options =>
{
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
RateLimitPartition.GetFixedWindowLimiter(
partitionKey: GetPartitionKey(httpContext),
factory: partition => new FixedWindowRateLimiterOptions
{
PermitLimit = permitLimit,
Window = timeWindow
}
)
);
options.OnRejected = async (context, token) =>
await HandleRateLimiterRejectionAsync(context, token);
})
);
var route = BuildFullRoute(Routes.ListIssues);
// Act
for (int i = 0; i < numberOfRequests; i++)
results.Add(await client.GetAsync(route));
// Assert
AssertStatusCodeResponses(results, HttpStatusCode.TooManyRequests, expectedCount: 5);
AssertStatusCodeResponses(results, HttpStatusCode.OK, expectedCount: 10);
}
[Fact]
public async Task ListIssues_WhenQueueingFixedWindowLimitOf10RequestsPer10Sec_0out15RequestsShouldBeRejected()
{
// Arrange
var numberOfRequests = 15;
var permitLimit = 10;
var queueLimit = 5;
var timeWindow = TimeSpan.FromSeconds(10);
var results = new List<HttpResponseMessage>();
using var client = CreateClient(services =>
services.AddRateLimiter(options =>
{
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
RateLimitPartition.GetFixedWindowLimiter(
partitionKey: GetPartitionKey(httpContext),
factory: partition => new FixedWindowRateLimiterOptions
{
AutoReplenishment = true,
PermitLimit = permitLimit,
QueueLimit = queueLimit,
Window = timeWindow,
QueueProcessingOrder = QueueProcessingOrder.OldestFirst
}
)
);
options.OnRejected = async (context, token) =>
await HandleRateLimiterRejectionAsync(context, token);
})
);
var route = BuildFullRoute(Routes.ListIssues);
// Act
for (int i = 0; i < numberOfRequests; i++)
results.Add(await client.GetAsync(route));
// Assert
AssertStatusCodeResponses(results, HttpStatusCode.TooManyRequests, expectedCount: 0);
AssertStatusCodeResponses(results, HttpStatusCode.OK, expectedCount: 15);
}
[Fact]
public async Task ListIssues_WhenChainedFixedWindowLimitOf60RequestsPerMinute_10out20RequestsShouldBeRejected()
{
// Arrange
var numberOfRequests = 20;
var initialPermitLimit = 10;
var initialTimeWindow = TimeSpan.FromSeconds(10);
var totalPermitLimit = 60;
var totalTimeWindow = TimeSpan.FromMinutes(1);
var results = new List<HttpResponseMessage>();
using var client = CreateClient(services =>
services.AddRateLimiter(options =>
{
options.GlobalLimiter = PartitionedRateLimiter.CreateChained(
PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
RateLimitPartition.GetFixedWindowLimiter(
partitionKey: GetPartitionKey(httpContext),
factory: partition => new FixedWindowRateLimiterOptions
{
PermitLimit = initialPermitLimit,
Window = initialTimeWindow,
})
),
PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
RateLimitPartition.GetFixedWindowLimiter(
partitionKey: GetPartitionKey(httpContext),
factory: partition => new FixedWindowRateLimiterOptions
{
PermitLimit = totalPermitLimit,
Window = totalTimeWindow
})
)
);
options.OnRejected = async (context, token) =>
await HandleRateLimiterRejectionAsync(context, token);
})
);
var route = BuildFullRoute(Routes.ListIssues);
// Act
for (int i = 0; i < numberOfRequests; i++)
results.Add(await client.GetAsync(route));
// Assert
AssertStatusCodeResponses(results, HttpStatusCode.TooManyRequests, expectedCount: 10);
AssertStatusCodeResponses(results, HttpStatusCode.OK, expectedCount: 10);
}
}
- Fixed Window Limit of 10 Requests Per Minute: This test verifies that out of 15 requests made within a minute, 5 should be rejected due to exceeding the limit.
- Queueing Fixed Window Limit of 10 Requests Per 10 Seconds: Here, the test ensures that all 15 requests made within 10 seconds are accepted, because the 5 requests exceeding the permit limit are queued until the window replenishes.
- Chained Fixed Window Limit of 60 Requests Per Minute: This test examines a more complex scenario where two fixed window rate limiters are chained together. It validates that out of 20 requests made within a minute, 10 should be rejected based on the combined limit.
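Outside of the tests, the same fixed window behaviour can be registered as a named policy rather than a global limiter. A minimal sketch, assuming a policy name of "fixed" and an "/issues" endpoint (it relies on the AddFixedWindowLimiter extension from Microsoft.AspNetCore.RateLimiting and the types in System.Threading.RateLimiting):
// Inside builder.Services.AddRateLimiter(options => { ... }):
options.AddFixedWindowLimiter(policyName: "fixed", limiterOptions =>
{
    limiterOptions.PermitLimit = 10;                      // 10 requests...
    limiterOptions.Window = TimeSpan.FromMinutes(1);      // ...per one-minute window
    limiterOptions.QueueLimit = 5;                        // up to 5 extra requests wait for the next window
    limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
});

// Applied to a single endpoint ("/issues" is an assumed route):
app.MapGet("/issues", () => Results.Ok("issues"))
   .RequireRateLimiting("fixed");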
Concurrency Limiter
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc.Testing;
using MinimalApi.Constants;
using System.Net;
using System.Threading.RateLimiting;
namespace MinimalApi.Tests.Base;
public class ConcurrencyLimiterTests : IntegrationTestBase
{
public ConcurrencyLimiterTests(WebApplicationFactory<Program> factory)
: base(factory) { }
[Fact]
public async Task ListIssues_WhenConcurrencyLimitOf2Requests_8out10RequestsShouldBeRejected()
{
// Arrange
var numberOfRequests = 10;
var permitLimit = 2; // Only two requests at a time
using var client = CreateClient(services =>
services.AddRateLimiter(options =>
{
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
RateLimitPartition.GetConcurrencyLimiter(
partitionKey: GetPartitionKey(httpContext),
factory: partition => new ConcurrencyLimiterOptions
{
PermitLimit = permitLimit
}
)
);
options.OnRejected = async (context, token) =>
await HandleRateLimiterRejectionAsync(context, token);
})
);
var route = BuildFullRoute(Routes.ListIssues);
var apiCalls = Enumerable.Range(0, numberOfRequests)
.Select(_ => client.GetAsync(route));
// Act
var results = await Task.WhenAll(apiCalls); // concurrent requests
// Assert
AssertStatusCodeResponses(results, HttpStatusCode.TooManyRequests, expectedCount: 8);
AssertStatusCodeResponses(results, HttpStatusCode.OK, expectedCount: 2);
}
}
- Concurrency Limit of 2 Requests: This test ensures that out of 10 requests made concurrently, only 2 should be accepted, while the remaining 8 should be rejected due to exceeding the concurrency limit.
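ConcurrencyLimiterOptions also supports a queue, so excess requests can wait for a permit instead of being rejected outright. A sketch of the same limiter configured in Program.cs, partitioned by client IP; the IP-based partition key is an assumption here (the tests above use a helper from the test base instead):
builder.Services.AddRateLimiter(options =>
{
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetConcurrencyLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: partition => new ConcurrencyLimiterOptions
            {
                PermitLimit = 2,                  // at most 2 requests in flight per client
                QueueLimit = 4,                   // 4 more wait for a permit instead of failing
                QueueProcessingOrder = QueueProcessingOrder.OldestFirst
            }));

    options.OnRejected = (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        return ValueTask.CompletedTask;
    };
});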
Token Bucket Limiter
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc.Testing;
using MinimalApi.Constants;
using System.Net;
using System.Threading.RateLimiting;
namespace MinimalApi.Tests.Base;
public class TokenBucketLimiterTests : IntegrationTestBase
{
public TokenBucketLimiterTests(WebApplicationFactory<Program> factory)
: base(factory) { }
[Fact]
public async Task ListIssues_WhenTokenBucketLimitOf20Requests_5out25RequestsShouldBeRejected()
{
// Arrange
var numberOfRequests = 25;
var bucketTokenLimit = 20;
var tokensToRestorePerPeriod = 10;
var replenishmentPeriod = TimeSpan.FromMinutes(1);
var results = new List<HttpResponseMessage>();
using var client = CreateClient(services =>
services.AddRateLimiter(options =>
{
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
RateLimitPartition.GetTokenBucketLimiter(
partitionKey: GetPartitionKey(httpContext),
factory: partition => new TokenBucketRateLimiterOptions
{
AutoReplenishment = true,
TokenLimit = bucketTokenLimit,
ReplenishmentPeriod = replenishmentPeriod,
TokensPerPeriod = tokensToRestorePerPeriod,
QueueProcessingOrder = QueueProcessingOrder.OldestFirst
}
)
);
options.OnRejected = async (context, token) =>
await HandleRateLimiterRejectionAsync(context, token);
})
);
var route = BuildFullRoute(Routes.ListIssues);
// Act
for (int i = 0; i < numberOfRequests; i++)
results.Add(await client.GetAsync(route));
// Assert
AssertStatusCodeResponses(results, HttpStatusCode.TooManyRequests, expectedCount: 5);
AssertStatusCodeResponses(results, HttpStatusCode.OK, expectedCount: 20);
}
}
- Token Bucket Limit of 20 Requests: This test checks that out of 25 requests made, 5 should be rejected due to exceeding the token bucket limit.
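To make the arithmetic concrete: the bucket starts full with 20 tokens and regains 10 tokens per minute, so 25 back-to-back requests drain the 20 available tokens and the remaining 5 are rejected until the next replenishment. A small standalone sketch (not part of the repository) using System.Threading.RateLimiting directly that reproduces the same numbers:
using System.Threading.RateLimiting;

// Bucket of 20 tokens, refilled with 10 tokens every minute, no queueing.
using var limiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit = 20,
    TokensPerPeriod = 10,
    ReplenishmentPeriod = TimeSpan.FromMinutes(1),
    AutoReplenishment = true,
    QueueLimit = 0,
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
});

var accepted = 0;
for (var i = 0; i < 25; i++)
{
    using RateLimitLease lease = limiter.AttemptAcquire(); // takes one token if available
    if (lease.IsAcquired) accepted++;
}

// Prints "Accepted 20 of 25 requests": the last 5 calls find an empty bucket.
Console.WriteLine($"Accepted {accepted} of 25 requests");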
Sliding Window Limiter
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc.Testing;
using MinimalApi.Constants;
using System.Net;
using System.Threading.RateLimiting;
namespace MinimalApi.Tests.Base;
public class SlidingWindowLimiterTests : IntegrationTestBase
{
public SlidingWindowLimiterTests(WebApplicationFactory<Program> factory)
: base(factory) { }
[Fact]
public async Task ListIssues_WhenSlidingWindowLimitOf10Requests_10out20RequestsShouldBeRejected()
{
// Arrange
var numberOfRequests = 20;
var permitLimit = 10; // maximum of requests per segment
var window = TimeSpan.FromSeconds(30); // 30 seconds window
var segmentsPerWindow = 3;
var results = new List<HttpResponseMessage>();
// Segment interval = window / segmentsPerWindow = 30 seconds / 3 = 10 seconds per segment
using var client = CreateClient(services =>
services.AddRateLimiter(options =>
{
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
RateLimitPartition.GetSlidingWindowLimiter(
partitionKey: GetPartitionKey(httpContext),
factory: partition => new SlidingWindowRateLimiterOptions
{
Window = window,
PermitLimit = permitLimit,
SegmentsPerWindow = segmentsPerWindow,
}
)
);
options.OnRejected = async (context, token) =>
await HandleRateLimiterRejectionAsync(context, token);
})
);
var route = BuildFullRoute(Routes.ListIssues);
// Act
for (int i = 0; i < numberOfRequests; i++)
results.Add(await client.GetAsync(route));
// Assert
AssertStatusCodeResponses(results, HttpStatusCode.TooManyRequests, expectedCount: 10);
AssertStatusCodeResponses(results, HttpStatusCode.OK, expectedCount: 10);
}
}
- Sliding Window Limit of 10 Requests: This test verifies that out of 20 requests made within a sliding window of 30 seconds, 10 should be rejected due to exceeding the limit.
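Like the other limiters, the sliding window can also be registered as a named policy instead of a global limiter; a minimal sketch, assuming a policy name of "sliding" (the AddSlidingWindowLimiter extension comes from Microsoft.AspNetCore.RateLimiting):
// Inside builder.Services.AddRateLimiter(options => { ... }):
options.AddSlidingWindowLimiter(policyName: "sliding", limiterOptions =>
{
    limiterOptions.PermitLimit = 10;                     // max requests per full window
    limiterOptions.Window = TimeSpan.FromSeconds(30);    // window length
    limiterOptions.SegmentsPerWindow = 3;                // 3 segments of 10 seconds each
});

// Then applied per endpoint, for example:
// app.MapGet("/issues", ...).RequireRateLimiting("sliding");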
Custom policies
The AuthenticatedUserPolicy is used to tailor rate limits based on the requester's authentication status. Authenticated users are permitted up to 400 requests per minute, whereas non-authenticated users are limited to 40 requests per minute.
Although I've globally applied this policy to all routes using the .RequireRateLimiting() extension method, you have the flexibility to refine it further by employing distinct policies for individual endpoints or opting out altogether.
using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;
namespace MinimalApi.Constants;
public class AuthenticatedUserPolicy : IRateLimiterPolicy<string>
{
public static readonly AuthenticatedUserPolicy Instance = new();
public RateLimitPartition<string> GetPartition(HttpContext httpContext)
{
var nonAuthPermitLimit = 40; // 40 requests per minute
var authenticatedPermitLimit = 400; // 400 requests per minute
var window = TimeSpan.FromMinutes(1);
var isAuthenticated = httpContext.User.Identity?.IsAuthenticated == true;
// Authenticated requests
if (isAuthenticated)
{
var identityName = httpContext.User.Identity?.Name ?? string.Empty;
return RateLimitPartition.GetFixedWindowLimiter(
partitionKey: identityName,
partition => new FixedWindowRateLimiterOptions
{
PermitLimit = authenticatedPermitLimit,
Window = window
}
);
}
// Non-authenticated requests
return RateLimitPartition.GetFixedWindowLimiter(
partitionKey: httpContext.Request.Headers.Host.ToString(),
partition => new FixedWindowRateLimiterOptions
{
PermitLimit = nonAuthPermitLimit,
Window = window
}
);
}
public Func<OnRejectedContext, CancellationToken, ValueTask>? OnRejected
{
get => (context, lease) =>
{
context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
return new ValueTask();
};
}
}
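Before testing, the policy has to be registered with the rate limiter and attached to the endpoints. A sketch of how that wiring might look in Program.cs; the policy name "per-user" and the "/issues" route are assumptions, and the repository may wire it differently (for instance on a route group):
// Inside builder.Services.AddRateLimiter(options => { ... }):
// register the custom policy under a name.
options.AddPolicy<string, AuthenticatedUserPolicy>("per-user");

// Attach the policy to the endpoint(s) it should protect.
app.MapGet("/issues", () => Results.Ok("issues"))
   .RequireRateLimiting("per-user");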
Let’s test it.
using Microsoft.AspNetCore.Authentication;
using Microsoft.AspNetCore.Mvc.Testing;
using Microsoft.Extensions.DependencyInjection;
using MinimalApi.Constants;
using MinimalApi.Tests.Base;
using System.Net;
using System.Net.Http.Headers;
namespace MinimalApi.Tests;
public class AuthenticatedUserPolicyTests : IntegrationTestBase
{
public AuthenticatedUserPolicyTests(WebApplicationFactory<Program> appFactory)
: base(appFactory) {}
[Fact]
public async Task ListIssues_WhenUnderAuthenticatedUserPolicy_0out50RequestsShouldBeRejected()
{
// Arrange
var numberOfRequests = 50;
var results = new List<HttpResponseMessage>();
var scheme = "TestScheme";
using var client = CreateClient(services =>
services
.AddAuthentication(defaultScheme: scheme)
.AddScheme<AuthenticationSchemeOptions, TestAuthHandler>(
scheme, options => {}
)
);
client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(
scheme: scheme);
var route = BuildFullRoute(Routes.ListIssues);
// Act
for (int i = 0; i < numberOfRequests; i++)
results.Add(await client.GetAsync(route));
// Assert
AssertStatusCodeResponses(results, HttpStatusCode.TooManyRequests, expectedCount: 0);
AssertStatusCodeResponses(results, HttpStatusCode.OK, expectedCount: 50);
}
[Fact]
public async Task ListIssues_WhenUnderAuthenticatedUserPolicy_10out50RequestsShouldBeRejected()
{
// Arrange
var numberOfRequests = 50;
var results = new List<HttpResponseMessage>();
using var client = CreateClient(services => { });
var route = BuildFullRoute(Routes.ListIssues);
// Act
for (int i = 0; i < numberOfRequests; i++) // limited requests for non-authenticated users
results.Add(await client.GetAsync(route));
// Assert
AssertStatusCodeResponses(results, HttpStatusCode.TooManyRequests, expectedCount: 10);
AssertStatusCodeResponses(results, HttpStatusCode.OK, expectedCount: 40);
}
}
- ListIssues_WhenUnderAuthenticatedUserPolicy_0out50RequestsShouldBeRejected: Tests if 50 authenticated requests are accepted.
- ListIssues_WhenUnderAuthenticatedUserPolicy_10out50RequestsShouldBeRejected: Checks if 50 non-authenticated requests are partially rejected, with 10 rejected and 40 accepted.
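The TestAuthHandler used in the first test is not shown in the post; a minimal sketch of what such a handler might look like (the claim values are assumptions, the real implementation is in the repository):
using System.Security.Claims;
using System.Text.Encodings.Web;
using Microsoft.AspNetCore.Authentication;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;

public class TestAuthHandler : AuthenticationHandler<AuthenticationSchemeOptions>
{
    public TestAuthHandler(
        IOptionsMonitor<AuthenticationSchemeOptions> options,
        ILoggerFactory logger,
        UrlEncoder encoder)
        : base(options, logger, encoder) { }

    protected override Task<AuthenticateResult> HandleAuthenticateAsync()
    {
        // Authenticate every request as a fixed test user so the authenticated
        // branch of AuthenticatedUserPolicy (400 requests per minute) is exercised.
        var claims = new[] { new Claim(ClaimTypes.Name, "test-user") };
        var identity = new ClaimsIdentity(claims, Scheme.Name);
        var ticket = new AuthenticationTicket(new ClaimsPrincipal(identity), Scheme.Name);
        return Task.FromResult(AuthenticateResult.Success(ticket));
    }
}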
Using Postman we can easily set up a runner within the collection to create multiple iterations for testing.
The server begins rejecting requests after the 40th non-authorized request, indicating that the policy is functioning as expected. Everything seems to be working fine!
To sum up, rate limiting plays a crucial role in managing request flow within a .NET Core application. It safeguards server resources, curtails abuse, and promotes equitable usage among clients.
However, it's important to recognize that rate limiting is just one component of crafting a secure and scalable API. As developers, staying abreast of best practices and continually optimizing both performance and security are imperative for delivering a seamless and dependable user experience.
References: https://learn.microsoft.com/en-us/aspnet/core/performance/rate-limit?view=aspnetcore-8.0
The source code is available on the following repository: https://github.com/alibenchaabene/MinimalApi
Thank you for reading! Please let me know your questions, thoughts, or feedback in the comments section. I appreciate your support and encouragement.
Happy Documenting!