API Rate-Shaping with F5 iRules

New theme, new blog post…

Many larger websites running software-as-a-service platforms provide web APIs or other integration points for third-party developers to consume, giving them an open architecture for sharing content or data. When you let others reach into your application, there is always the possibility that the integration point will be abused… perhaps someone writes some rubbish code that calls your API 500 times a second and effectively initiates a denial of service (DoS). One method is to check something unique, such as an API key, in your application and track how frequently it’s used; however, this can become expensive, especially if you need to spark up a thread for each check.

The solution – Do checking in software, but also on the edge, perhaps on an F5 load balancer using iRules…

The concept is fairly simple – we take both the user’s IP address and API key, concatenate them, and store the result in a session table with a timeout. If the user/application requesting the resource calls your API endpoint beyond a pre-configured threshold (e.g. 3 times per second), they are returned a 503 HTTP status and told to come back later. Alternatively, if they don’t even pass in an API key, they get a 403 HTTP status returned. This method is fairly crude, but it’s effective when deployed alongside throttling done in the application. Let’s see how it fits together:
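To make the idea concrete before getting into iRules, here is a small Python model of the decision flow. It is only a sketch: it is not iRule code and says nothing about how the F5 implements tables. The names `check_request`, `MAX_RATE` and `WINDOW_SECS` are made up for the example, and the clock is passed in as an argument so the behaviour is easy to follow.

```python
from collections import defaultdict, deque

MAX_RATE = 3        # requests allowed per window (assumed, mirrors "3 per second")
WINDOW_SECS = 1.0   # window length in seconds (assumed)

# One deque of request timestamps per "ip:apikey" string.
_recent = defaultdict(deque)

def check_request(client_ip, api_key, now):
    """Return the HTTP status the edge would answer with: 200, 403 or 503."""
    if not api_key:
        return 403                      # no API key supplied
    stamps = _recent[f"{client_ip}:{api_key}"]
    # Forget requests that have aged out of the window.
    while stamps and now - stamps[0] >= WINDOW_SECS:
        stamps.popleft()
    if len(stamps) >= MAX_RATE:
        return 503                      # over the threshold, come back later
    stamps.append(now)
    return 200
```

Calling it with the same IP and key at 0.0, 0.1, 0.2 and 0.3 seconds lets the first three through and answers the fourth with a 503; by 1.05 seconds the oldest request has aged out and the caller is allowed again.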

As mentioned above, the user’s IP/API key is inserted into an iRule table – this is a global table shared across all F5 devices in an H.A. deployment, and it stores values that are indexed by keys.

Each table contains the following columns:

  • Key – This is the unique reference to the table entry and is used during table lookups
  • Value – This is the concatenated IP/API Key
  • Timeout – The idle timeout for the entry: if the entry is not touched within this period it expires. It can also be set to indefinite, as in the rule below.
  • Lifetime – This is the lifetime for the entry. It will expire after this period of time no matter how many changes or lookups are performed on it. An entry can have a lifetime and a timeout at the same time, in which case it expires when either the timeout OR the lifetime runs out, whichever comes first.
  • Touch Time – Indicates when the key entry was last touched. It’s used internally by the session table to keep track of when to expire entries.
  • Create Time – Indicates when the key was created.
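The timeout/lifetime interplay is easiest to see with numbers. Below is a minimal Python sketch of the "whichever comes first" rule described in the list. `Entry` and `expires_at` are illustrative names only, and `None` stands in for `indefinite`.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Entry:
    create_time: float
    touch_time: float
    timeout: Optional[float]   # idle timeout in seconds, None = indefinite
    lifetime: Optional[float]  # hard lifetime in seconds, None = indefinite

def expires_at(e: Entry) -> Optional[float]:
    """Earliest of (touch + timeout) and (create + lifetime); None = never."""
    candidates = []
    if e.timeout is not None:
        candidates.append(e.touch_time + e.timeout)
    if e.lifetime is not None:
        candidates.append(e.create_time + e.lifetime)
    return min(candidates) if candidates else None
```

For example, an entry created at t=0, last touched at t=50, with a 180 second timeout and a 200 second lifetime expires at t=200, because the lifetime runs out before the idle timeout would. An entry with an indefinite timeout and a 1 second lifetime, which is what the rate limiter uses, always expires one second after it was created.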

The table would look something like this:
(Image: F5 iRule session table)

The Rule itself:

when RULE_INIT {

	#Allow 3 Requests every 1 Second
	set static::maxRate 3
	set static::windowSecs 1

}

when HTTP_REQUEST {

	if { ([class match [string tolower [HTTP::path]] starts_with Ratelimit-URI] ) } {

		#Whitelist IP Addresses
		if { [IP::addr [IP::client_addr] equals 192.168.0.1/24] || [IP::addr [IP::client_addr] equals 10.0.0.1/22] } {
			return
		}

		#Main logic:

		#Check if the 'APIKey' header is passed through; respond and stop if not.
		if { !( [HTTP::header exists APIKey] ) } {

			HTTP::respond 403 content "<html><h2>No API Key provided - Please provide an API Key</h2></html>"

			#Drop the Connection afterwards
			drop

			#Stop processing this event so the code below never runs without a key
			return
		}

		#Set VARS: - Do this after the check for an API Key...
		set limiter [crc32 [HTTP::header APIKey]]
		set clientip_limitervar [IP::client_addr]:$limiter

		#Count the live entries in this client's subtable
		set get_count [table keys -subtable $clientip_limitervar -count]

		#Check if current requests breach the configured max requests per window?
		if { $get_count < $static::maxRate } {
			incr get_count 1

			#Key each entry uniquely (count plus clock clicks) so a new request
			#never overwrites a live entry and the key count stays accurate
			table set -subtable $clientip_limitervar "${get_count}:[clock clicks]" $clientip_limitervar indefinite $static::windowSecs
		} else {

			log local0. "$clientip_limitervar has exceeded the number of requests allowed"

			HTTP::respond 503 content "<html><h2>You have exceeded the maximum number of requests allowed... Try again later.</h2></html>"

			#Drop the Connection afterwards
			drop
			return
		}
    }
}
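One detail worth checking is what each entry is keyed on, because the limit is enforced by counting the keys in the subtable. The Python sketch below models a subtable as a plain dict and compares two keying strategies: using the running count as the key, and using a unique key per request. It assumes that setting an existing key overwrites the entry rather than adding a second one. `simulate` and `burst` are hypothetical names, and times are in whole milliseconds to keep the arithmetic exact.

```python
def simulate(times_ms, unique_keys, max_rate=3, window_ms=1000):
    """Return how many requests are let through for the given arrival times."""
    subtable = {}          # key -> expiry time in ms (models the lifetime only)
    allowed = 0
    for seq, now in enumerate(times_ms):
        # Drop entries whose lifetime has passed.
        subtable = {k: exp for k, exp in subtable.items() if exp > now}
        count = len(subtable)
        if count < max_rate:
            # Count-based keys can collide with a live entry; unique keys cannot.
            key = ("req", seq) if unique_keys else count + 1
            subtable[key] = now + window_ms   # same key => overwrite, not a new entry
            allowed += 1
    return allowed

# 20 requests, one every 100 ms (10 per second) for 2 seconds.
burst = list(range(0, 2000, 100))
print(simulate(burst, unique_keys=True))    # 6
print(simulate(burst, unique_keys=False))   # 13
```

With unique keys the model lets through 3 requests per one-second window, 6 in total. With count-based keys it lets through 13: once the oldest entry expires, each new request overwrites a live key, so the count sits below the maximum and the limit stops biting.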

The iRule DataGroup:

(Image: RateLimit URI data group)

So how does this iRule work? Let’s step through it:

  1. When the rule is initialized, two static variables are set: the “maxRate”, which is how many requests are allowed within the “windowSecs” period, e.g. 3 requests per 1 second.
  2. When the HTTP request is parsed, the rule checks whether the lower-cased HTTP path (e.g. “/someservice.svc”) starts with an entry in an iRule data group named “Ratelimit-URI” to see if it’s a page that requires rate-limiting. If not, the rule does nothing more and the page content is returned.
  3. We check if the request is coming from a white-listed IP address. If it is, we return the page content without rate-limiting; otherwise the rule continues.
  4. The rule then checks if the request contains an HTTP header of “APIKey”. If not, a 403 message is returned and the connection is dropped; if it does, the rule continues.
  5. We then set up the variables that will be inserted into the iRule table. First we hash the APIKey as a CRC32 value to cut down on the size if it’s large. We then concatenate the client IP address with the resulting hash. Finally we drop it into a table.
  6. A check is then performed to see whether the count of requests is still below the maximum set when the rule initialized. If it is, the count is incremented by one and a new entry is written to the table. Otherwise, a 503 is returned to the user and the connection is dropped.
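Steps 2 and 3 are the gatekeeping part, and they can be sketched in a few lines of Python. This is a model only: `RATELIMIT_URIS` stands in for the Ratelimit-URI data group (with the example path from step 2 as its single, assumed entry), `WHITELIST` holds the two networks that the masks in the rule resolve to, and `needs_rate_limit` is a made-up name.

```python
import ipaddress

RATELIMIT_URIS = ("/someservice.svc",)   # stand-in for the Ratelimit-URI data group
WHITELIST = [ipaddress.ip_network("192.168.0.0/24"),
             ipaddress.ip_network("10.0.0.0/22")]

def needs_rate_limit(path: str, client_ip: str) -> bool:
    """True when the request should go on to the API key and counter checks."""
    # Step 2: lower-case the path and test it against the data group prefixes.
    if not path.lower().startswith(RATELIMIT_URIS):
        return False
    # Step 3: white-listed sources skip rate-limiting altogether.
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in WHITELIST):
        return False
    return True
```

So `/SomeService.svc/orders` from an outside address goes on to the API key check, while `/index.html` from anywhere, or any path from 10.0.3.200, is passed straight through.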

That’s it, simple – fairly crude, but effective as a first method of protection from someone spamming your API. Making changes to the rule is fairly simple (e.g. changing what’s checked, perhaps you want to look for full URIs instead of just the path). It may also be worthwhile adding a check for the size of the header before you hash, to ensure no one abuses the check and forces your F5 to do a lot of expensive work, or perhaps do away with the hashing altogether… your call 🙂
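Here is roughly what that size check could look like, again as a Python sketch and not as iRule code. `MAX_KEY_LEN` is a hypothetical cap you would pick to suit your own API keys, `limiter_key` is a made-up name, and the hash comes from Python’s `zlib.crc32`, so the exact output format of the iRule `crc32` command may differ.

```python
import zlib
from typing import Optional

MAX_KEY_LEN = 128  # hypothetical cap; pick one that fits your real API keys

def limiter_key(client_ip: str, api_key: str) -> Optional[str]:
    """Build the 'ip:crc32' table key, or None if the header is missing or oversized."""
    if not api_key or len(api_key) > MAX_KEY_LEN:
        return None                      # reject before doing any hashing work
    digest = zlib.crc32(api_key.encode("utf-8"))   # unsigned 32-bit in Python 3
    return f"{client_ip}:{digest}"
```

The point of the ordering is that the cheap length test runs first, so an oversized header never reaches the hash.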

It must be noted that the LTM platform places NO limits on the amount of memory that can be consumed by tables. Because of this, it’s recommended that you don’t do this on larger platforms, or that you invest some time in setting up monitoring on your F5 device to warn you if memory is getting drastically low – “tmsh show sys mem” is your friend.

Let me know if you have any questions.

-Patrick