Forum Discussion

madsb1999's avatar
madsb1999
Copper Contributor
Sep 15, 2022

Filter Power Query

Hi,

I have a database in a SQL-server with 8 million rows and +50 columns. I want to be able to filter what data Excel imports, such that it does not load all 8 million rows. In particular I want to be able to filter in an Excel workbook by clicking the things I want to import. As an example I have tried to first select the columns I want to be able to filter by in a Power Query, taking only distinct rows then loading them into Excel, created slicers on this table, loaded it back into Power Query selecting only the rows left after choosing in the slicers and used this for input in a new Power Query that would be the one to import my main data but it is either a really slow approach and sometimes it do not work - it just says "Connecting to source". Any suggestions?

Thanks in advance! 

  • LOTR_Nerd99's avatar
    LOTR_Nerd99
    Copper Contributor

    madsb1999, you could create your distinct list in Power Query as well.

     

    1. To create the distinct list, you will need to create a new query. The final step should be Source[column_name]. This will output a list.
    2. In the main query, use the following code:

     

    Table.SelectRows(Source, 
                     each List.Contains(
                             Distinct_List, 
                             [field_name_in_main_table]
                         )
    )

     

     

     

    This will select the rows in the main table that match the distinct list from step 2.

     

     

     

    • madsb1999's avatar
      madsb1999
      Copper Contributor

      Hi LOTR_Nerd99 , thanks for the answer - I really appreciate any help!
      I think what you are saying is exactly what I have tried to do. In details I have:
      1. Created a Power Query with 6 of my 50+ columns, selected all distinct rows of this and loaded this table into Excel.
      2. Created 6 slicers, one for each of the columns.
      3. Connected the table to Power Query, combined my 6 columns into one and made sure to keep only those rows that was left after filtering in the slicers. Thus I have a list.
      4. Created a Power Query - combined the same 6 columns into a new column and then done exactly as you suggest with List.Contains.

      But for some reason this works really bad. Do you have any suggestions as why this is the case?

      • LOTR_Nerd99's avatar
        LOTR_Nerd99
        Copper Contributor

        madsb1999, you could increase performance by buffering the distinct list. 

         

         

        BufferedList = List.Buffer(Distinct_List),
        #"Filtered to matching list" = Table.SelectRows(Source, 
                                              each List.Contains(
                                                     BufferedList, 
                                                     [field_name_in_main_table]
                                              )
                                       )

         

         

        Also, you could build your distinct list in power query as well:

        You will need three tables.

        1. Main table that will be reference by table 2 and table 3 below.
        2. Distinct List that reference the main table
        3. Table to filter that reference the main table

        To build the Distinct List, you can use the following code:

         

         

        let
            Source = main_table,
            #"Inserted Merged Column" = Table.AddColumn(Source, "key", each 
                                              Text.Combine(
                                                          {[name], 
                                                           [address], 
                                                           [city],
                                                           Text.From([zip], "en-US")}, //use Text.From if [zip] is a type number                                                
                                                           ";"),
                                              type text
                                        ),
            #"Removed Duplicates" = Table.Distinct(#"Inserted Merged Column", {"key"})
        in
            #"Removed Duplicates"

         

        Also you could try the Power BI forum: https://community.powerbi.com/t5/Power-Query/bd-p/power-bi-services

         

         

Resources