KEDA (https://github.com/kedacore) is an open-source Kubernetes controller that acts as an intermediary between a data or event store, such as an Azure Event Hub, a Storage queue or a Service Bus queue (but also AWS, Kafka, etc.), and an event handler such as Azure Functions. Its scalers ensure that the appropriate number of handlers is started according to the load observed at the source. KEDA comes with its own Kubernetes Custom Resource Definition called ScaledObject. Here is an example of such a resource:
#this one is optional. KEDA scales out automatically but we can still limit the number of pods
- type: azure-blob
In the above example, we can see that this resource is linked to a deployment called kedablob (not shown here), which is the handler. The source in this case is Azure Blob Storage, and the connection to use is specified in the metadata. When you deploy this, an HPA (Kubernetes's built-in Horizontal Pod Autoscaler) is created for you:
The important part is highlighted in red. The HPA takes the number of blobs present in the Blob Storage into account to scale out the related deployment accordingly. Note that if the handler does not delete the incoming blob, the blob count never drops and the HPA will never scale down. Note also that you can influence the HPA by specifying the target metric yourself. KEDA defaults to 5 for various event stores.
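To make the role of the target metric more concrete, here is a minimal Python sketch of how an HPA roughly derives a replica count from such a metric: the metric value divided by the per-replica target, rounded up, then clamped. The function name desired_replicas and the default minimum of 1 are assumptions made for illustration, and the sketch ignores the HPA's tolerance band and stabilization behaviour.

```python
import math


def desired_replicas(metric_value, target_per_replica=5,
                     min_replicas=1, max_replicas=None):
    """Approximate the replica count an HPA would request.

    metric_value: e.g. the number of blobs waiting in the container.
    target_per_replica: the target metric (5 is the default mentioned above).
    min_replicas / max_replicas: illustrative bounds, not taken from the article.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(metric_value / target_per_replica)
    wanted = max(wanted, min_replicas)
    if max_replicas is not None:
        wanted = min(wanted, max_replicas)
    return wanted


if __name__ == "__main__":
    for blobs in (0, 4, 23, 1000):
        print(blobs, desired_replicas(blobs),
              desired_replicas(blobs, max_replicas=100))
```

With this model, 23 waiting blobs and a target of 5 give 5 replicas. It also shows why undeleted blobs prevent scaling down: as long as metric_value stays high, so does the result.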
The diagram below shows the high-level interactions and components:
To test it with a realistic scenario, I deployed a QueueWriter pod that writes 5,000 messages every 2 seconds to a Storage Queue. I scaled the QueueWriter out to 15 instances, meaning 37,500 messages/s. I let KEDA scale out automatically and ended up with ~90 pods scheduled through the Virtual Kubelet (meaning ~90 ACIs) handling the load. I let it run for an hour to process about 135 million messages. Whenever I checked the queue, it was either empty or held the occasional message, meaning that the handlers had no trouble keeping up. KEDA can be used with regular worker nodes, but it is wiser to use it together with the Virtual Kubelet, which translates to ACIs, and a dedicated agent pool with the following characteristics:
This requires a serious amount of resources. However, and this is not related to KEDA, you must pay attention to several things. Despite the nice figures above, some limitations apply, such as the quota on the total number of concurrent ACIs. By default, you can't exceed 100 per region, and since your cluster is bound to a region, you simply can't exceed 100. This means that letting KEDA scale your handlers without any cap can easily hit this limit, and you'll end up with containers in the ProviderFailed state. You can open a ticket with Microsoft support to have the default limit raised.
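The numbers from the test can be checked against that quota with a bit of arithmetic. The Python sketch below only uses the figures quoted in this post (15 writers, 5,000 messages every 2 seconds, one hour, ~90 ACIs, a quota of 100). The function and key names are made up for illustration.

```python
def load_summary(writers=15, messages_per_batch=5000, batch_interval_s=2,
                 duration_s=3600, observed_acis=90, aci_quota=100):
    """Recompute the test figures and the remaining ACI quota headroom."""
    rate = writers * messages_per_batch / batch_interval_s  # messages per second
    return {
        "messages_per_second": rate,
        "total_messages": rate * duration_s,
        "messages_per_second_per_aci": rate / observed_acis,
        "quota_headroom": aci_quota - observed_acis,
    }


if __name__ == "__main__":
    for key, value in load_summary().items():
        print(f"{key}: {value:,.1f}")
```

This confirms the 37,500 messages/s and 135 million messages per hour. It also shows that each ACI handled a little over 400 messages/s on average, and that the test ran only about 10 instances below the default quota.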
Also, admittedly, ACIs do not start within a few seconds, meaning that under high load, KEDA will attempt to scale even more. In messaging and event-driven scenarios, which are asynchronous by nature, waiting a bit is not an issue, but as long as no handler is ready to process queue messages, KEDA will keep scaling unless a maxReplicaCount is specified for the ScaledObject. Last but not least, some ACIs sometimes hang in a pending state. Here again, although this looks a bit scary, the scheduler will terminate them once the HPA is back to its lower targets.
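The effect of slow container start-up can be illustrated with a deliberately simplified Python model. It assumes that no handler consumes anything during the start-up delay, that the autoscaler re-evaluates every 15 seconds, and that the requested replica count is the backlog divided by the target of 5. The arrival rate, the delay and the function name are hypothetical, and the HPA's real scale-up rate limits are ignored.

```python
import math


def replicas_requested(arrival_rate, startup_delay_s, target_per_replica=5,
                       max_replicas=None, step_s=15):
    """Return (second, backlog, requested replicas) while no handler is ready.

    arrival_rate: messages per second written to the queue (hypothetical).
    startup_delay_s: time before the first handler starts consuming (assumed).
    max_replicas: stands in for the ScaledObject's maxReplicaCount.
    """
    timeline = []
    for t in range(step_s, startup_delay_s + 1, step_s):
        backlog = arrival_rate * t  # nothing is consumed yet
        wanted = math.ceil(backlog / target_per_replica)
        if max_replicas is not None:
            wanted = min(wanted, max_replicas)
        timeline.append((t, backlog, wanted))
    return timeline


if __name__ == "__main__":
    print("uncapped:", replicas_requested(100, 60))
    print("capped:  ", replicas_requested(100, 60, max_replicas=90))
```

Even at a modest 100 messages/s, the uncapped request grows at every evaluation while the containers are still starting, whereas the capped run stays at 90, below the default ACI quota.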
Overall, this produced rather good results, and sure, KEDA is something to keep an eye on!