Forum Discussion
madsb1999
Sep 15, 2022Copper Contributor
Filter Power Query
Hi,
I have a database in a SQL-server with 8 million rows and +50 columns. I want to be able to filter what data Excel imports, such that it does not load all 8 million rows. In particular I want to be able to filter in an Excel workbook by clicking the things I want to import. As an example I have tried to first select the columns I want to be able to filter by in a Power Query, taking only distinct rows then loading them into Excel, created slicers on this table, loaded it back into Power Query selecting only the rows left after choosing in the slicers and used this for input in a new Power Query that would be the one to import my main data but it is either a really slow approach and sometimes it do not work - it just says "Connecting to source". Any suggestions?
Thanks in advance!
- LOTR_Nerd99Copper Contributor
madsb1999, you could create your distinct list in Power Query as well.
- To create the distinct list, you will need to create a new query. The final step should be Source[column_name]. This will output a list.
- In the main query, use the following code:
Table.SelectRows(Source, each List.Contains( Distinct_List, [field_name_in_main_table] ) )
This will select the rows in the main table that match the distinct list from step 2.
- madsb1999Copper Contributor
Hi LOTR_Nerd99 , thanks for the answer - I really appreciate any help!
I think what you are saying is exactly what I have tried to do. In details I have:
1. Created a Power Query with 6 of my 50+ columns, selected all distinct rows of this and loaded this table into Excel.
2. Created 6 slicers, one for each of the columns.
3. Connected the table to Power Query, combined my 6 columns into one and made sure to keep only those rows that was left after filtering in the slicers. Thus I have a list.
4. Created a Power Query - combined the same 6 columns into a new column and then done exactly as you suggest with List.Contains.
But for some reason this works really bad. Do you have any suggestions as why this is the case?- LOTR_Nerd99Copper Contributor
madsb1999, you could increase performance by buffering the distinct list.
BufferedList = List.Buffer(Distinct_List), #"Filtered to matching list" = Table.SelectRows(Source, each List.Contains( BufferedList, [field_name_in_main_table] ) )
Also, you could build your distinct list in power query as well:
You will need three tables.
- Main table that will be reference by table 2 and table 3 below.
- Distinct List that reference the main table
- Table to filter that reference the main table
To build the Distinct List, you can use the following code:
let Source = main_table, #"Inserted Merged Column" = Table.AddColumn(Source, "key", each Text.Combine( {[name], [address], [city], Text.From([zip], "en-US")}, //use Text.From if [zip] is a type number ";"), type text ), #"Removed Duplicates" = Table.Distinct(#"Inserted Merged Column", {"key"}) in #"Removed Duplicates"
Also you could try the Power BI forum: https://community.powerbi.com/t5/Power-Query/bd-p/power-bi-services