In the previous blog, ‘A Million Events in 5 Minutes! Know How We Do It.’, we provided a high-level view of the architecture of piStats and of how an incoming clickstream event travels through our system until it is visualized on our real-time dashboards.
In this blog, we take a detailed look at the component of our architecture where the clickstream event first knocks, i.e., the API Gateway, and at the reasons why we chose this one over the other available options.
-We required an API Gateway that could take the request posted from the user end and send it to Kinesis Streams using credentials known only to the system communicating with Kinesis, keeping those credentials safe and confidential.
-The API Gateway consumes the request from the library in the user’s browser, posts it to Kinesis Streams, and sends the response received from Kinesis back to the library, which extracts the essential information, i.e., the user id, for use in subsequent requests.
-The API Gateway acts as a proxy between the user and Kinesis Streams, giving the system a secure single point of contact with multiple exits, depending on the type of request received.
-It lets the outside world interact with our system in a secure, robust and cost-effective (depending on the choice of gateway) way, gluing the two worlds together.
Kong API Gateway: Kong, built on top of NGINX, is an open source, scalable API Gateway that can run on any infrastructure and integrate with any number of APIs. Kong is configured through its REST Admin API and can be extended with any number of plugins to suit one’s use case. It also allows creating custom plugins for custom implementations. Kong can limit the number of API requests that a user can make based on several parameters, one of them being a response header returned by the upstream API. These limits can be configured per second, minute, hour, and so on. Kong provides plugins to configure the rate limits according to one’s choice of limiting objects and thresholds. Multiple Kong instances can be clustered to take heavy loads, which makes Kong horizontally scalable.
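To make the idea of a per-user request limit concrete, here is a minimal Python sketch of a fixed-window counter. It is an illustration of the general technique only, not Kong's plugin code; the class name, the `consumer` key and the injectable `clock` are all assumptions made for this example.

```python
import time
from collections import defaultdict


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per consumer in each time window.

    Hypothetical helper for illustration; Kong's own rate-limiting
    plugins are configured through the Admin API, not written like this.
    """

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.counters = defaultdict(int)  # (consumer, window index) -> request count

    def allow(self, consumer):
        # All requests inside the same window share one counter.
        window = int(self.clock() // self.window_seconds)
        key = (consumer, window)
        if self.counters[key] >= self.limit:
            return False  # over the threshold; a gateway would typically answer HTTP 429
        self.counters[key] += 1
        return True
```

A real limiter would also expire old windows and, in a cluster, keep the counters in a store shared by every instance; this sketch keeps them in local memory for brevity.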
Kong acts as both a RESTful interface and a plugin-oriented application, so custom functionality can sit behind Kong and be used without having to worry about scaling and performance.
Kong makes clustering easier by storing the admin information in a database (Cassandra/Postgres), which can be externalized so that multiple instances running Kong share it.
Kong-Our Use Case: As said earlier, we required a scalable, cost-effective, secure and robust API Gateway that could consume the POST request from the user end and send it to Kinesis Streams without exposing the credentials to the outside world.
Kong suited our requirements in every sense:
-Multiple Kong instances can share the same database so that they have the same configuration for APIs, plugins, authentication, etc.
-Kong allows NGINX Lua scripts to perform custom functionality before the request is passed on.
-Kong is open source, so you pay only for the instance on which Kong runs.
-It is easily customizable and gives satisfactory performance even on the smallest EC2 instance available.
How and What We Did: Kong can easily be configured using YAML-based configuration files. The Kong config contains the configuration for NGINX as well, in which we provided filters for server and ports.
Kong's .yml file lets one list the servers, ports and plugins, and include custom scripts that run before the request is passed on to Kinesis.
As said earlier, we wanted to post data to Kinesis, which required custom Lua scripts to generate signed requests for Kinesis using Signature Version 4. Along with that, we do some intermediate manipulation of the data before posting it to Kinesis.
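The core of Signature Version 4 is a chain of HMAC-SHA256 operations that turns the secret key into a signing key scoped to a date, region and service. The sketch below shows that chain in Python for illustration; our gateway does this in Lua, and this is not that script. The function names, the placeholder secret, the region and the abbreviated string to sign are assumptions; a real request also needs the canonical request and headers defined by the AWS specification.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """One HMAC-SHA256 step of the key derivation chain."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key from the secret key and the credential scope.

    date_stamp is YYYYMMDD; service would be "kinesis" for Kinesis Streams.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder values only; never hard-code real credentials.
example_key = derive_signing_key("EXAMPLE-SECRET", "20160823", "us-east-1", "kinesis")
example_signature = sign(example_key, "AWS4-HMAC-SHA256\n<rest of the string to sign>")
```

Because the signing key depends only on the secret, date, region and service, it can be cached for the day and reused across requests, which keeps the per-request cost to a single HMAC.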
The intermediate manipulation involves extracting the user's IP address from the header data, using the remote address variable of NGINX, and adding it to the body, where the system needs it later in the event's journey.
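As a rough illustration of this manipulation, the Python sketch below adds the client address to an event body. It assumes the body is a JSON object and uses a hypothetical field name, `ip`; neither detail comes from our actual Lua script, and `remote_addr` simply stands in for the value NGINX exposes as its remote address variable.

```python
import json


def add_client_ip(raw_body: str, remote_addr: str, field: str = "ip") -> str:
    """Return the event body with the client's address added under `field`.

    Assumes the body is a JSON object; an empty body becomes a new object.
    """
    event = json.loads(raw_body) if raw_body else {}
    event[field] = remote_addr  # value the gateway sees as the remote address
    return json.dumps(event)


# Example with a documentation-range address (illustrative values only).
enriched = add_client_ip('{"event": "click"}', "203.0.113.7")
```

Doing this at the gateway means the browser library never has to know or report its own address, and downstream consumers can rely on the field being present.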
Extracting the IP and headers and setting them in the body
Calling the Lua script to set headers for Kinesis
Lua script to fetch credentials and set headers
Headers have to be signed using AWS4 and HMAC
We configured Kong on our systems with Postgres installed on the local machine itself. Since we needed to accept 600 requests per second with a response time of less than 200ms, we configured our systems for high network throughput.
Kong runs so efficiently that, even under high traffic, utilization on the instance remains low and the response time stays below 150ms.
Checking whether enhanced networking is enabled in the AMI
Enabling enhanced networking
Our Tests: We ran load and performance tests against the Kong instance. We load tested a single Kong instance (a micro EC2 instance) with approximately 10,000 POST requests in 20 seconds. The results were as follows:
-None of the requests failed or timed out
-The average latency was below 600ms
-The minimum latency was 200ms and the maximum was 7 seconds
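For readers who want to reproduce this kind of summary from their own load test, the Python sketch below computes the request rate and the minimum, average and maximum latency. The helper names are hypothetical, and the sample latencies are made-up values chosen only to show the calculation; they are not our measurements.

```python
def throughput(requests: int, seconds: float) -> float:
    """Average request rate over the test, in requests per second."""
    return requests / seconds


def summarize_latencies(latencies_ms):
    """Return the minimum, average and maximum of a list of latencies in ms."""
    return {
        "min": min(latencies_ms),
        "avg": sum(latencies_ms) / len(latencies_ms),
        "max": max(latencies_ms),
    }


# The test above sent about 10,000 requests in 20 seconds.
rate = throughput(10000, 20)  # 500.0 requests per second

# Illustrative sample only, not measured data.
sample = summarize_latencies([200, 350, 450, 7000])
```

A single slow request pulls the average up sharply, as the sample shows, so percentiles are often a useful addition to these three figures.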
This was a detailed view of one of the components of piStats. In the next blog, we will take up another interesting component of the piStats architecture.