When you’re looking to embed a Power BI dashboard or report in a web application for your customers, i.e. the “app owns data” scenario, how should you go about choosing the tenancy model?
In this blog, I want to examine a useful way of navigating this question. The main tenancy models are:
- A single workspace with one dataset and report shared by all customers, using row-level security (RLS) to separate each customer’s data.
- A workspace per tenant (customer), each with its own dataset and report, typically managed with service principal profiles.
Our documentation suggests that the workspace-per-tenant approach is suitable for large-scale ISVs with thousands of customers, while small and medium ISVs should choose the RLS model. This makes sense, as the workspace-per-tenant approach requires a bit more work up front. However, the key to any solution serving a large number of customers is automation, and this is no exception: creating the workspace, report, and dataset for each new customer should be part of your new-tenant setup automation, as sketched below.
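Here is a minimal Python sketch of what that provisioning automation could look like with the Power BI REST API, using the Groups and Imports endpoints. It assumes you have already acquired a service principal access token with the necessary API permissions; token acquisition, error handling, parameter binding to the tenant’s data source, and the refresh trigger are left out, and the workspace/dataset names are placeholders.

```python
# Minimal sketch of per-tenant provisioning via the Power BI REST API.
# Assumes an Azure AD access token for a service principal is already available;
# acquiring it (e.g. with MSAL) is out of scope for this sketch.
import requests

API = "https://api.powerbi.com/v1.0/myorg"

def provision_tenant(access_token: str, tenant_name: str, pbix_path: str) -> str:
    """Create a workspace for a new customer and import the template .pbix into it."""
    headers = {"Authorization": f"Bearer {access_token}"}

    # 1. Create a dedicated workspace (group) for the tenant.
    resp = requests.post(f"{API}/groups", headers=headers,
                         json={"name": f"ws-{tenant_name}"})
    resp.raise_for_status()
    workspace_id = resp.json()["id"]

    # 2. Import the template report + dataset into the new workspace.
    with open(pbix_path, "rb") as pbix:
        resp = requests.post(
            f"{API}/groups/{workspace_id}/imports",
            headers=headers,
            params={"datasetDisplayName": f"{tenant_name}-dataset"},
            files={"file": pbix},
        )
    resp.raise_for_status()

    # 3. (Not shown) update the dataset parameters/connection to point at the
    #    tenant's data source and trigger a refresh.
    return workspace_id
```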
I would like to offer a way of approaching this decision that considers cost from the very beginning. When deciding between one dataset and report for all customers with RLS, and a workspace per tenant with service principal profiles, the first important step is to determine what the dataset size will be in each of the two scenarios.
As an oversimplified example, if for 100 customers a single shared dataset is 100 GB, while in the workspace-per-tenant model the largest individual customer dataset is 10 GB, the pricing is very different. Please check the pricing here.
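To make that cost angle concrete, here is a small sketch that maps the largest single dataset in each model to the smallest Embedded (A) SKU that could hold it. The memory limits in the table are illustrative assumptions only; confirm them against the current SKU documentation, and remember to leave headroom for refresh and queries (covered next).

```python
# Rough comparison of which Embedded (A) SKU each tenancy model would need,
# based only on the largest single dataset that has to fit in memory.
# These memory limits are illustrative; always confirm against the current
# Power BI Embedded / Premium documentation and pricing page.
A_SKU_MEMORY_GB = {"A1": 3, "A2": 5, "A3": 10, "A4": 25, "A5": 50, "A6": 100}

def smallest_sku(max_single_dataset_gb: float) -> str:
    """Return the smallest SKU whose per-dataset memory limit covers the dataset."""
    for sku, limit_gb in A_SKU_MEMORY_GB.items():
        if max_single_dataset_gb <= limit_gb:
            return sku
    raise ValueError("Dataset exceeds the largest SKU in this table")

# Single shared dataset with RLS: the whole 100 GB model must fit on one node.
print("RLS, one 100 GB dataset       ->", smallest_sku(100))  # A6
# Workspace per tenant: only the biggest individual dataset (10 GB) matters.
print("Per-tenant, max 10 GB dataset ->", smallest_sku(10))   # A3, before refresh/query headroom
```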
This is important because with Premium Gen2 and Embedded Gen2, the amount of memory available on each node size is applied as a limit to the memory footprint of a single dataset, not to the cumulative memory consumption of all datasets. For example, on a Premium Gen2 P1 capacity, each individual dataset is limited to 25 GB, whereas in the original Premium the 25 GB limit applied to the total memory footprint of all datasets being handled at the same time.
It’s oversimplified because the memory requirement depends on the dataset type. For an import-mode dataset, for example, the total need could be: loading the dataset into memory + refreshing the dataset + interacting with the report. Please see here a more detailed explanation of dataset memory allocation.
Real-world example
Let's say you have 500 customers/tenants with separate datasets, an average dataset size of 200 MB, and a maximum dataset size of 10 GB (100 GB in total). Not all customers are active at the same time, but most of them will be, and their datasets need to be ready for queries most of the time. A P1/A4 with its 25 GB memory limit leaves enough room in import mode for the 10 GB dataset + refresh + user queries, even though your total size is much larger; there is no limit on the total memory across datasets.
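The sketch below runs this check for the example above, under hedged rule-of-thumb assumptions (a full import-mode refresh can temporarily need roughly double the dataset size, plus some overhead for queries); these factors are not official figures, just a starting point for your own estimate.

```python
# Back-of-the-envelope check of whether the largest tenant dataset fits a SKU
# with headroom for refresh and queries. The "refresh roughly doubles memory"
# factor and the flat query overhead are rule-of-thumb assumptions, not
# official numbers.
def fits_with_headroom(max_dataset_gb: float,
                       sku_memory_gb: float,
                       refresh_factor: float = 2.0,
                       query_overhead_gb: float = 1.0) -> bool:
    estimated_peak_gb = max_dataset_gb * refresh_factor + query_overhead_gb
    print(f"Estimated peak: {estimated_peak_gb:.1f} GB vs SKU limit {sku_memory_gb} GB")
    return estimated_peak_gb <= sku_memory_gb

# The 500-tenant example: largest single dataset is 10 GB, P1/A4 limit is 25 GB.
# The 100 GB total across all tenants is irrelevant to this check.
fits_with_headroom(max_dataset_gb=10, sku_memory_gb=25)  # ~21 GB peak -> fits
```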
Other considerations
You should of course look at all the SKU limits to choose the right one. The number of parallel DirectQuery/Live connections per second is also very important, and here you need to know your end users' usage patterns: how many concurrent users do you expect?
What is important here is that these DirectQuery/Live connections per second follow a leaky bucket design. For P1/A4 the limit is 30 requests per second, so if 40 concurrent requests reach the Power BI service in the same second, 30 will be served in that second and the remaining 10 in the next second, rather than being rejected.
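Here is a toy simulation of that leaky-bucket behaviour, assuming the 30-per-second P1/A4 limit (confirm the current value for your SKU in the documentation): excess requests are not dropped, they simply spill into later seconds, which shows up as added latency for those users.

```python
from collections import deque

# Toy simulation of the leaky-bucket behaviour: requests above the per-second
# limit are queued into later seconds instead of being rejected.
def drain(requests_arrived: int, per_second_limit: int = 30) -> None:
    pending = deque(range(requests_arrived))
    second = 1
    while pending:
        served = min(per_second_limit, len(pending))
        for _ in range(served):
            pending.popleft()
        print(f"second {second}: served {served}, still queued {len(pending)}")
        second += 1

drain(40)
# second 1: served 30, still queued 10
# second 2: served 10, still queued 0
```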
Sizing is challenging. I would say start by determining the dataset size; you can begin by measuring the memory footprint of your dataset in Power BI Desktop.
Conclusion
Consider both the number of customers you have (and the expected growth over the next 2-3 years) and the dataset size under each of the tenancy models to understand which is the most suitable choice for your specific scenario.