# Rate limiting

> Apply and tune rate-limit policies on endpoints, and the defaults that ship with the template.

## How it works

Rate limiting is a backend concern. A request that arrives after its partition's quota is used up is
rejected with HTTP `429 Too Many Requests` before the handler runs. The frontend treats that as just
another error: the API client surfaces it and the calling code can show a toast. See
[the API client](/docs/api-client) for how 429s reach the UI.

Policies are named and defined once, in `api/src/Slicekit.Api/Configuration/RateLimiting.cs`. An
endpoint opts into one with `.RequireRateLimiting(...)`, passing one of the `RateLimitPolicies`
constants. Each policy declares a partition key (who shares a quota) and a limiter (how many
requests, over what window).
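As a rough illustration of how a named policy ties a partition key to a limiter, the sketch below registers one IP-keyed policy with the numbers from the `Anonymous` row in the table. It is not the template's actual `RateLimiting.cs`: `SketchPolicies`, `AddRateLimitingSketch`, the `"anonymous"` string value, the `"unknown"` fallback, and `SegmentsPerWindow = 6` are all assumptions made for the example.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

// Stand-in for RateLimitPolicies; the real class and its string values may differ.
public static class SketchPolicies
{
    public const string Anonymous = "anonymous";
}

public static class RateLimitingSketch
{
    // Hypothetical method name; the source only names ConfigureRateLimiting.
    public static IServiceCollection AddRateLimitingSketch(this IServiceCollection services) =>
        services.AddRateLimiter(opts =>
        {
            // ASP.NET Core rejects with 503 by default, so 429 has to be set explicitly.
            opts.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

            opts.AddPolicy(SketchPolicies.Anonymous, context =>
                RateLimitPartition.GetSlidingWindowLimiter(
                    // Partition key: every request from one IP shares a quota.
                    context.Connection.RemoteIpAddress?.ToString() ?? "unknown",
                    // Limiter: 20 requests per one-minute sliding window.
                    _ => new SlidingWindowRateLimiterOptions
                    {
                        PermitLimit = 20,
                        Window = TimeSpan.FromMinutes(1),
                        SegmentsPerWindow = 6, // assumed value
                        QueueLimit = 0
                    }));
        });
}
```

For policies like this to take effect, the rate limiter middleware also has to be in the pipeline (`app.UseRateLimiter()`).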

## Shipped policies

Five policies cover the template's needs. Most new endpoints reuse one of these rather than adding a
sixth.

| Policy | Partition key | Limit |
|---|---|---|
| `RateLimitPolicies.Default` | `sub` claim, fallback to remote IP | Sliding window: 100 / minute |
| `RateLimitPolicies.Anonymous` | Remote IP | Sliding window: 20 / minute |
| `RateLimitPolicies.Auth` | Remote IP | Sliding window: 10 / minute |
| `RateLimitPolicies.CreateApiKey` | `sub` claim, fallback to remote IP | Sliding window: 10 / minute |
| `RateLimitPolicies.ExportData` | `sub` claim, fallback to remote IP | Fixed window: 5 / 24 hours |

`Anonymous` and `Auth` partition by IP because the caller is not yet authenticated when those
endpoints are hit, so there is no `sub` claim to key on. Everything else partitions by user (`sub`),
which means several users behind one NAT do not share a quota.
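The two partition strategies can be sketched as a pair of key helpers. `GetRemoteIp` is referenced by the template, but its body here is an assumption, and `PartitionKeys` and `GetUserOrIp` are hypothetical names. The sketch also assumes the `sub` claim is exposed under that exact name on `HttpContext.User`.

```csharp
using Microsoft.AspNetCore.Http;

public static class PartitionKeys
{
    // Assumed implementation of the template's GetRemoteIp helper.
    public static string GetRemoteIp(HttpContext context) =>
        context.Connection.RemoteIpAddress?.ToString() ?? "unknown";

    // User-scoped key: the `sub` claim when present, otherwise the remote IP.
    public static string GetUserOrIp(HttpContext context) =>
        context.User.FindFirst("sub")?.Value ?? GetRemoteIp(context);
}
```

With keys like these, two authenticated users on the same IP land in different partitions, while two anonymous callers on the same IP share one.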

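`ExportData` uses a fixed window because the legitimate use is a handful of requests per day, so a
strict count per 24 hours is all it needs. A fixed window can admit a burst across the boundary (up
to 5 requests just before the reset and 5 more just after), which matters little at this volume.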
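For reference, a fixed-window registration with the `ExportData` numbers from the table might look roughly like this. The helper `AddExportData`, its parameters, and the `export-data:` key prefix are hypothetical; the template registers its policies inside `ConfigureRateLimiting`.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;

public static class ExportDataPolicySketch
{
    // Hypothetical helper: registers a fixed-window policy under the given name.
    // partitionKey decides who shares the quota (for example `sub`, falling back to IP).
    public static void AddExportData(
        RateLimiterOptions opts,
        string policyName,
        Func<HttpContext, string> partitionKey) =>
        opts.AddPolicy(policyName, context =>
            RateLimitPartition.GetFixedWindowLimiter(
                $"export-data:{partitionKey(context)}",
                _ => new FixedWindowRateLimiterOptions
                {
                    PermitLimit = 5,                 // 5 requests...
                    Window = TimeSpan.FromHours(24), // ...per 24-hour window
                    QueueLimit = 0                   // reject instead of queueing
                }));
}
```

Unlike the sliding-window options, `FixedWindowRateLimiterOptions` has no `SegmentsPerWindow`: the whole count resets when the window rolls over.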

## Applying a policy to an endpoint

Add `.RequireRateLimiting(...)` to the route, alongside the other route policies (authorization,
validation, CSRF). The endpoint stays thin: it declares the limit, it does not enforce it by hand.
See [adding a vertical slice](/docs/vertical-slices) for the full endpoint shape.

```csharp
public static void Map(IEndpointRouteBuilder routes) =>
    routes.Auth().MapPost("/register", HandleAsync)
        .WithName("Auth_Register")
        .AllowAnonymous()
        .RequireRateLimiting(RateLimitPolicies.Auth)
        .ProducesProblem(429);
```

`.ProducesProblem(429)` is mandatory whenever an endpoint declares a rate limit. The 429 is part of
the public contract, so it belongs in the OpenAPI document the same way any other failure response
does.

## Adding a new policy

Reach for a new policy only when none of the five fit. A new policy is one const and a registration.

1. Add a `const string` on `RateLimitPolicies`:

   ```csharp
   public const string PasswordReset = "password-reset";
   ```

2. Register it inside `ConfigureRateLimiting`:

   ```csharp
   opts.AddPolicy(RateLimitPolicies.PasswordReset, context =>
       RateLimitPartition.GetSlidingWindowLimiter(
           $"password-reset:{GetRemoteIp(context)}",
           _ => new SlidingWindowRateLimiterOptions
           {
               PermitLimit = 3,
               Window = TimeSpan.FromHours(1),
               SegmentsPerWindow = 6,
               QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
               QueueLimit = 0
           }));
   ```

3. Reference it from the endpoint with `.RequireRateLimiting(RateLimitPolicies.PasswordReset)`.

`QueueLimit = 0` makes the limiter reject immediately rather than queue the request. That is the
right default for HTTP: a queued request would just hold the connection open. A non-zero queue only
makes sense for in-process producer and consumer flows, not request handling.
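The effect of `QueueLimit = 0` can be seen outside ASP.NET by driving a limiter directly. This is a minimal sketch using `System.Threading.RateLimiting` (a NuGet package for plain console apps); `QueueLimitDemo` and `RunAsync` are names invented for the example.

```csharp
using System;
using System.Threading.RateLimiting;
using System.Threading.Tasks;

public static class QueueLimitDemo
{
    // Makes `attempts` back-to-back acquisitions and records which ones got a permit.
    public static async Task<bool[]> RunAsync(int permitLimit, int attempts)
    {
        using var limiter = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
        {
            PermitLimit = permitLimit,
            Window = TimeSpan.FromMinutes(1),
            QueueLimit = 0,            // no waiting room: fail instead of queueing
            AutoReplenishment = false  // no background timer; permits never refill in this demo
        });

        var acquired = new bool[attempts];
        for (var i = 0; i < attempts; i++)
        {
            // With QueueLimit = 0 this completes immediately, with or without a permit.
            using RateLimitLease lease = await limiter.AcquireAsync();
            acquired[i] = lease.IsAcquired;
        }
        return acquired;
    }
}
```

Calling `await QueueLimitDemo.RunAsync(permitLimit: 2, attempts: 3)` yields `true, true, false`: the third call comes back right away with a failed lease rather than waiting for the window to reset.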

## Tuning the limits

Two knobs do most of the work: the partition key (who shares the quota) and the window type (how the
count is spread over time).

**Window type.**

- Sliding window (`GetSlidingWindowLimiter`) splits the window into `SegmentsPerWindow` segments and
  frees permits as the oldest segment expires, which smooths out bursts at window boundaries. Use it
  for typical request limits.
- Fixed window (`GetFixedWindowLimiter`) is a strict count per window that resets all at once. Use it
  for slow, expensive operations like `ExportData`.
- Token bucket and concurrency limiters are supported by .NET but unused here. Add one only if you
  need its specific semantics.

**Partition key.**

- User-scoped operation: key on the `sub` claim, falling back to IP with
  `?? GetRemoteIp(context)`. The fallback covers misconfigured auth and any endpoint mistakenly
  tagged with a user policy while anonymous.
- Unauthenticated endpoint: key on IP only, since `sub` is absent.
- Cross-tenant admin action: prefer `sub`. Keying on IP would let one admin's quota be eaten by
  another admin behind the same NAT.

To change an existing limit, edit `PermitLimit` and `Window` on the policy in question. Nothing else
references the numbers.

## Verify

- `dotnet build` passes.
- Start the API, hit a rate-limited endpoint past its limit, and confirm a `429 Too Many Requests`
  with a problem-details body.
- Check the OpenAPI document at `/scalar`: every endpoint that declares `.ProducesProblem(429)` shows
  the 429 response.
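The manual check in the second bullet can be scripted. The probe below is a sketch, not part of the template: `RateLimitProbe` and `FindFirst429Async` are hypothetical names, and the caller supplies the base address and the request for whichever rate-limited endpoint is under test.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class RateLimitProbe
{
    // Sends up to maxAttempts requests and returns the 1-based attempt that
    // received a 429, or null if the limit was never hit.
    public static async Task<int?> FindFirst429Async(
        HttpClient client,
        Func<HttpRequestMessage> makeRequest,
        int maxAttempts)
    {
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            using var request = makeRequest();
            using var response = await client.SendAsync(request);
            if (response.StatusCode == HttpStatusCode.TooManyRequests)
            {
                // For a problem-details body this is typically application/problem+json.
                Console.WriteLine($"429 on attempt {attempt}: {response.Content.Headers.ContentType}");
                return attempt;
            }
        }
        return null;
    }
}
```

For a policy with a limit of 10 per minute, the probe would be expected to report attempt 11 when run against a fresh window with `maxAttempts` above 10.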

## Checklist

- [ ] Endpoint declares `.RequireRateLimiting(...)` with a `RateLimitPolicies` constant.
- [ ] Endpoint declares `.ProducesProblem(429)` to match.
- [ ] The chosen policy's partition key fits the endpoint (user-scoped uses `sub`, public uses IP).
- [ ] A new policy was added only because none of the five shipped policies fit.
- [ ] New policies keep `QueueLimit = 0`.
- [ ] `dotnet build` passes and the 429 shows up in `/scalar`.
