DeployR uses Grid nodes to scale the execution of R scripts. A single DeployR instance can have multiple Grid nodes, which lets it scale R script execution horizontally. A grid node is installed by default when you install DeployR, and you can add more grid nodes on separate machines.
Each Grid node executes script requests in what are called "slots". You can think of a slot as roughly an R session like one you would run from an R terminal or an IDE, except that it is not interactive.
You can configure how many slots each Grid node has, and DeployR limits the number of R sessions that can be launched simultaneously on the node to this number.
A DeployR project is the abstraction for an R session in DeployR. When DeployR creates an R session for script execution, it needs to maintain some information about the session, for example the logged-in user who owns the session (for authenticated projects), the count of slots used on a Grid node, and which node's slot the project uses (with host and port).
Different types of slots
Criteria for configuring slot limits
The main factors that determine how many slots you should configure for a given Grid node are the following:
The type of load you want to run: authenticated, anonymous, mixed etc.
Capacity of the machines running the grid nodes (number of cores, amount of RAM etc.)
How many concurrent executions you need
What your R scripts do during execution, e.g., mostly computing on the node vs doing distributed computation using one of the remote compute contexts (e.g. Hadoop, Spark, Teradata or SQL Server). If most of the execution happens on the node, you should configure the slot limit to be close to the number of cores available. If most of the execution happens in a remote compute context, you can raise the slot limit above the number of cores, since each R session will most likely be waiting on the remote results anyway.
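The last factor can be turned into a rough starting point for testing. The sketch below is only an illustration, not a DeployR formula: the function name, the `remote_wait_fraction` input (the share of time a session spends waiting on a remote compute context) and the oversubscription cap are all assumptions introduced here. It returns the core count when all work happens on the node and a larger number as more time is spent waiting remotely.

```python
def suggest_slot_limit(cores, remote_wait_fraction, max_oversubscription=4.0):
    """Suggest a starting slot limit for one grid node (hypothetical heuristic).

    cores: number of CPU cores on the machine running the grid node.
    remote_wait_fraction: assumed share of each execution spent waiting on a
        remote compute context, from 0.0 (all local) up to, but excluding, 1.0.
    max_oversubscription: assumed safety cap on slots per core.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= remote_wait_fraction < 1.0:
        raise ValueError("remote_wait_fraction must be in [0.0, 1.0)")

    # If a session is busy locally only (1 - wait) of the time, roughly
    # 1 / (1 - wait) sessions can share one core before it saturates.
    factor = min(1.0 / (1.0 - remote_wait_fraction), max_oversubscription)
    return max(1, round(cores * factor))


local_only = suggest_slot_limit(8, 0.0)    # all work runs on the node
half_remote = suggest_slot_limit(8, 0.5)   # half the time waiting remotely
mostly_remote = suggest_slot_limit(8, 0.9) # capped by max_oversubscription
print(local_only, half_remote, mostly_remote)
```

Treat the result only as a first value to try. Available RAM and the load type are not modeled here, so the number still needs to be checked with performance testing.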
Concurrent operation policies
One other factor that can influence slot usage is the concurrent operation policies setting under DeployR Server policies. These policies let you limit slot usage per user or per session instead of applying the server-wide limit, which is the default.
Load Balancing across Grid nodes
The way DeployR balances load across Grid nodes is quite simple. It looks at the following factors:
Type of requested execution i.e., authenticated or anonymous
Total number of slots configured for the grid nodes
Number of used slots
Detailed walk through example of Grid Load Balancing
Let us take a look at a system with two grid nodes, A and B, which have slot limits of 30 and 20 respectively. Assume for simplicity that both allow mixed mode execution, i.e., they can execute either anonymous or authenticated scripts.
When a request for a slot is made during project creation, DeployR checks each grid node in turn for available slots. Suppose grid node A has 5 slots in use and grid node B has none. DeployR calculates how many slots are available on each node (25 on A and 20 on B) and chooses the node with the greater number of available slots. Hence node A is chosen, and the number of slots used on it becomes 6. Because the configured slot limits differ between the two nodes, node A keeps getting selected until it has 10 slots in use, at which point the number of available slots is equal on both nodes, i.e., 20 on each. The next request could go to either A or B. Let's say it goes to A, so that the number of used slots on A is now 11. The subsequent request will go to B, because at this point B has more slots available: 20, whereas A has 19.
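The walk through above can be replayed with a small simulation. This is a minimal sketch of the "most available slots wins" rule as described here, not DeployR's actual implementation. The `GridNode` class and both function names are hypothetical, and breaking ties in favor of the first node in the list is an assumption made so that the run matches the example.

```python
from dataclasses import dataclass


@dataclass
class GridNode:
    """Hypothetical model of a grid node: a name, a slot limit, slots in use."""
    name: str
    slot_limit: int
    used: int = 0

    @property
    def available(self) -> int:
        return self.slot_limit - self.used


def pick_node(nodes):
    """Return the node with the most available slots, or None if all are full.

    Ties go to the earliest node in the list (an assumption; the text only
    says the request could go to either node).
    """
    best = max(nodes, key=lambda node: node.available)
    return best if best.available > 0 else None


def simulate(nodes, requests):
    """Assign each request to a node and return the node names in order."""
    order = []
    for _ in range(requests):
        node = pick_node(nodes)
        if node is None:
            break  # every slot on every node is in use
        node.used += 1
        order.append(node.name)
    return order


# Node A: limit 30 with 5 slots in use. Node B: limit 20 with none in use.
nodes = [GridNode("A", 30, used=5), GridNode("B", 20)]
order = simulate(nodes, 8)
print(order)
print([(node.name, node.used, node.available) for node in nodes])
```

With these inputs the first five requests go to A, the sixth is the tie that is also sent to A, and the seventh goes to B, as in the example. From that point on the two nodes alternate, because their available slot counts stay within one of each other.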
As you can see, if you want to balance the number of used slots across grid nodes, you should configure an equal number of slots for each grid node. On the other hand, if your nodes run on machines with unequal capacity, you should configure your slots accordingly.
Understanding how grid nodes operate with slot limits is important for getting optimal performance and scale from your DeployR installation. Hopefully this gives some insight into the issue. It is still important to do some performance testing to find the optimal configuration for your scenario.