Classifications
2 Topics

Lineage Limitation on Wide Power BI Semantic Models & Built-in Classification Rule Sets
Hello everyone,

I'm evaluating the new Microsoft Purview Governance Portal for our finance data governance needs. With the classic Azure Purview version we ran into a couple of blocking issues; for example, it failed to scan wide semantic models. We are now migrating to Fabric and would like to try the new Microsoft Purview Governance Portal. I'd appreciate any insights or confirmation from the product team or the community.

1. Lineage Limitation on Wide Power BI Semantic Models

Background: When we first ran the Purview Data Map Scanner against our finance semantic model, it failed once the total column count across all tables exceeded roughly 500 columns. In our case, a single SAP table alone has about 450 columns, so the scan wouldn't complete and we couldn't capture any lineage.

Questions:
- Has Purview removed or raised any "hidden" column-count limits for Power BI semantic models?
- Is there any official documentation on maximum supported column counts (e.g. 200, 500, or otherwise)?
- Are there recommended workarounds for very wide models (such as splitting into sub-datasets or using incremental scans) to get full lineage?

2. Built-in Classification Rule Sets

Background: Purview ships with a set of Microsoft-provided "Sensitive Information Types" that appear in every scan rule set. In many of our scans these defaults aren't needed, and they clutter the results.

Questions:
- Can we delete or permanently disable the built-in classification rules?
- If not, what's the best way to ensure they're not applied during a full scan?
- Are there any APIs or PowerShell commands that let us automate the exclusion of Microsoft's defaults from our scan rule sets?

Thank you in advance for any pointers, documentation links, or best-practice advice!

127 Views | 0 likes | 1 Comment

Future Support for Configurable Sampling in Purview Classification
We have a question regarding the sampling method used in Microsoft Purview for classification. Based on the documentation, we understand that for tabular data sources (e.g., SQL databases), Purview samples only the top 128 rows for classification. However, our client has tables with millions of rows, and this small sample size may not be representative of the actual data.

Is there any plan to allow users to configure the number of sampled rows in future updates? This would greatly improve classification accuracy for large datasets.

Thanks in advance for your insights!

Solved | 93 Views | 0 likes | 1 Comment
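
The sampling concern in the post above can be illustrated with a minimal sketch. It is not Purview code and makes no claim about how Purview samples or classifies internally; the only figure taken from the post is the 128-row sample size. The synthetic column, the `SSN_PATTERN` regex, and all function names are hypothetical, chosen only to show how a top-N sample can miss values that appear later in a large table, while a random sample of the same size does not.

```python
import random
import re

# Hypothetical pattern standing in for a classifier; not a Purview rule.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

def build_column(total_rows=100_000, placeholder_rows=1_000):
    """Synthetic column: the first rows hold placeholders, the rest look like SSNs."""
    column = ["N/A"] * placeholder_rows
    for i in range(total_rows - placeholder_rows):
        column.append(f"{100 + i % 800:03d}-{10 + i % 90:02d}-{1000 + i % 9000:04d}")
    return column

def match_rate(values):
    """Fraction of sampled values that match the pattern."""
    if not values:
        return 0.0
    return sum(1 for v in values if SSN_PATTERN.match(v)) / len(values)

def top_n_sample(column, n=128):
    """Take the first n rows, mirroring the top-128 behaviour described in the post."""
    return column[:n]

def random_sample(column, n=128, seed=0):
    """Take n rows uniformly at random (assumed alternative, for comparison only)."""
    return random.Random(seed).sample(column, n)

column = build_column()
top_rate = match_rate(top_n_sample(column))
rand_rate = match_rate(random_sample(column))

print(f"top-128 match rate:    {top_rate:.2f}")
print(f"random-128 match rate: {rand_rate:.2f}")
```

In this synthetic setup the top-128 sample contains only placeholder rows, so a pattern-based check over that sample sees no matches even though about 99% of the column matches. Real tables will differ; the sketch only shows why row ordering matters when the sample is taken from the top.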