If you have ever interacted with an API service, you may have encountered a limit on the number of requests you can make within a specific timeframe. If you exceed that limit, the API service prevents you from making further requests until the time window resets.
While rate limits may seem annoying at first glance, they are essential to guarantee the service's performance and improve the overall customer experience. Being required to stay within rate limits pushes you to optimize your code and caching strategies, which results in lighter, faster websites that convert better.
Furthermore, rate limits keep you (and others) safe. Like a speed camera, a rate limiter keeps you from driving too fast. Streets are multi-tenant environments. The speed camera controls not only your speed, but also the speed of others on the same street, protecting you from accidents caused by other drivers.
Respecting the limits
A rate limit is typically applied per IP address and measured over a time window: if the same IP address makes too many requests in a short period, the API service returns a 429 (Too Many Requests) status code.
To help you control your speed, each API response should include specific headers indicating the maximum number of requests allowed in the time window, the total duration of the window, and the number of requests remaining in the current window (zero in the case of an HTTP 429 error). To extend the earlier analogy: if the speed camera is the rate limiter, those headers are your speedometer.
In an ideal scenario, your code would dynamically adjust its request rate based on these headers to avoid being blocked. In an asynchronous job, for example, the headers can be used to calculate how much delay to introduce between subsequent requests so that the job never exceeds the limit. There are, however, circumstances in which this approach cannot be applied.
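As a minimal sketch of that adaptive delay, here is what it could look like in TypeScript. The header names (`X-RateLimit-Remaining`, `X-RateLimit-Reset`, `Retry-After`) follow a common convention but are assumptions here, as is the meaning of the reset value (seconds until the window resets); check your API's documentation for the actual contract.

```typescript
// Minimal sketch of adaptive throttling for an asynchronous job.
// Assumes conventional X-RateLimit-* headers; real APIs vary.
async function throttledFetch(url: string): Promise<Response> {
  const response = await fetch(url);

  // Assumed headers: the remaining request budget and the number
  // of seconds until the current window resets.
  const remaining = Number(response.headers.get("X-RateLimit-Remaining") ?? NaN);
  const resetSeconds = Number(response.headers.get("X-RateLimit-Reset") ?? NaN);

  if (response.status === 429) {
    // Blocked: wait for Retry-After (or the window reset) and retry once.
    const waitSeconds =
      Number(response.headers.get("Retry-After") ?? NaN) || resetSeconds || 1;
    await new Promise((resolve) => setTimeout(resolve, waitSeconds * 1000));
    return fetch(url);
  }

  if (remaining > 0 && resetSeconds > 0) {
    // Spread the remaining requests evenly over the rest of the window.
    const delayMs = (resetSeconds * 1000) / remaining;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }

  return response;
}
```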
If your site receives a lot of traffic, you certainly don't want to introduce delays in responding to page requests. On the contrary, you should serve as much traffic as you can and absorb any spikes without ever being throttled.
To achieve that, it is often sufficient to avoid unnecessary API requests and adopt a sensible caching strategy. For example, there is no need to request a new access token before every API call while the current one is still valid. Store the token locally instead, and obtain a new one only once it expires.
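As a rough sketch, the caching logic can be as simple as the following; the token endpoint URL and response shape (`access_token`, `expires_in`) are hypothetical, so adapt them to your API.

```typescript
// Sketch of local access-token caching. The endpoint and response
// shape are made up for illustration; adapt to your auth provider.
interface CachedToken {
  value: string;
  expiresAt: number; // epoch milliseconds
}

let cachedToken: CachedToken | null = null;

async function fetchNewToken(): Promise<{ token: string; expiresInSeconds: number }> {
  // Hypothetical auth endpoint; replace with your API's real token route.
  const res = await fetch("https://api.example.com/oauth/token", { method: "POST" });
  const body = await res.json();
  return { token: body.access_token, expiresInSeconds: body.expires_in };
}

async function getAccessToken(): Promise<string> {
  // Reuse the stored token while it is still valid,
  // with a one-minute safety margin before the actual expiry.
  if (cachedToken && Date.now() < cachedToken.expiresAt - 60_000) {
    return cachedToken.value;
  }

  // Only hit the auth endpoint when the token is missing or expired.
  const { token, expiresInSeconds } = await fetchNewToken();
  cachedToken = { value: token, expiresAt: Date.now() + expiresInSeconds * 1000 };
  return cachedToken.value;
}
```

Every API call then goes through getAccessToken(), turning dozens of token requests into one per expiry window.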
An example specific to ecommerce websites is the management of shopping carts. There is no need to create and fetch an empty shopping cart from your commerce API at the beginning of a user session. You can instead hardcode a "zero" counter for the number of items in the cart and only create the cart when the first item is added.
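Here is what that could look like, sketched in TypeScript against a hypothetical commerce API (createCart, addLineItem, and the endpoints are made up for illustration):

```typescript
// Sketch of lazy cart creation: render "0 items" without any API call,
// and only create the cart when the first item is added.
let cartId: string | null = null;
let itemCount = 0;

// Render the cart badge without touching the commerce API.
function cartBadge(): string {
  return `${itemCount} items`;
}

async function addToCart(sku: string): Promise<void> {
  // The first API call happens here, not at the start of the session.
  if (cartId === null) {
    cartId = await createCart();
  }
  await addLineItem(cartId, sku);
  itemCount += 1;
}

// Hypothetical API wrappers, sketched for completeness.
async function createCart(): Promise<string> {
  const res = await fetch("https://api.example.com/carts", { method: "POST" });
  return (await res.json()).id;
}

async function addLineItem(id: string, sku: string): Promise<void> {
  await fetch(`https://api.example.com/carts/${id}/line-items`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ sku, quantity: 1 }),
  });
}
```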
The list of examples and use cases could go on and on. In general, rate limits exist to ensure quality of service without impacting your business, and all you have to do is follow a few best practices that also benefit the performance of your own website and code. By improving your cache hit rate, you also reduce the number of API requests, lowering your costs if you're charged per request.
Requests aren't all the same
Different rate limits can be applied to different API calls based on the type of request you send. A REST API, for example, can limit requests based on the request method: since GET requests are much more cacheable than POST or PUT requests, the service can allow higher rate limits on GETs or remove the limits altogether.
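To make the idea concrete, here is a toy sketch of a method-aware limit table; the numbers are purely illustrative assumptions, not values from any real API.

```typescript
// Toy sketch of per-method rate limits, as an API service might define them.
const limitsPerMinute: Record<string, number> = {
  GET: 600,   // highly cacheable, so a generous (or even no) limit
  POST: 60,   // writes can't be cached, so a stricter budget
  PUT: 60,
  DELETE: 30,
};

function isAllowed(method: string, requestsThisMinute: number): boolean {
  // Fall back to the stricter write budget for unknown methods.
  const limit = limitsPerMinute[method] ?? 60;
  return requestsThisMinute < limit;
}
```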
Rate limiting can also be heavily influenced by whether the request is made by the server or the client. As rate limits are typically calculated per IP address, server-side requests (all from the same IP) may reach the limit, while the same volume of client-side requests (spread across many IPs) may not. The API service should therefore be able to offer different rate limits for server-side and client-side applications.
Rate limiting is yet another factor in the never-ending debate between server-side and client-side rendering. This article does not cover that topic in depth, but in my opinion, server-side rendering works best for GET requests, while client-side rendering is more suitable for POST requests. One of the main advantages of SSR is SEO, which concerns read-only content; interactive actions such as buttons and search bars, on the other hand, are better handled by the client.
Following this approach, the editorial and product content of an ecommerce website should be rendered server-side, optimizing for SEO, while the add-to-cart button could be implemented on the client, as sketched below. Follow this simple rule of thumb, combined with more relaxed GET rate limits, and you will be able to build a great customer experience while respecting every limit.
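As a sketch of the client half of that split (the endpoint and payload are hypothetical), the add-to-cart action could talk to the commerce API directly from the browser:

```typescript
// The product page HTML arrives pre-rendered from the server (good for
// SEO); the add-to-cart action below runs in the browser instead.
document.querySelector("#add-to-cart")?.addEventListener("click", async () => {
  // Client-side POST against a hypothetical commerce endpoint.
  await fetch("https://api.example.com/carts/line-items", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ sku: "TSHIRT-M", quantity: 1 }),
  });
});
```

Because the request originates from each shopper's browser, it counts against that shopper's IP rather than funneling all cart traffic through your server's single IP.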
Content, commerce, and rate limiting
I might sound like a broken record, but every time I learn more about a new topic, I observe how it validates the fundamental difference between content and commerce. Specifically, the fact that product catalogs belong to the content domain, whereas commerce is all about transactional capabilities.
Content is read-only, while commerce is read-write. Server-side and client-side rendering, data caching, and rate limiting all depend on this fundamental difference. It is this separation of concerns that should drive many of your development decisions, as it adds so many benefits to the whole architecture. It all makes sense and ultimately helps you become a better ecommerce developer.