Forum Discussion

Jeffino's avatar
Jeffino
Copper Contributor
May 25, 2024
Solved

Removing duplicates and sorting data in columns instead of rows

Hello, one of our software exports data in the following format. I don't want to touch this file since it will be exported frequently.

I want to create a different Excel file with formulas which is linked to this file and configure it in a way to obtain the results in the following way:

Basically removing the duplicates and sorting the data in columns and leaving blanks when nothing is found.

Thank you!

 

  • Jeffino 

    As variant bit more close to UI.

    First query, SourceRaw, loads the file and declares data types (not sure which exactly locale shall be used)

    let
        pFilepath = "C:\Test\My_Projects_&_Quotations_2024_06_05_22_36.csv",
    
        Source = Csv.Document(
            File.Contents(pFilepath),
            [Delimiter=",", Columns=15, Encoding=65001, QuoteStyle=QuoteStyle.None]
        ),
        PromoteHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        DeclareType = Table.TransformColumnTypes(
            PromoteHeaders,
            {
                  {"Gross Value", Currency.Type}
                , {"Date Created", type datetime}
                , {"Date Modified", type datetime}
                , {"Completion date", type date}
            }, "fr-LU")
    in
        DeclareType

    Referenced SourceRaw query, Projects, keeps columns with repeated values for each project and removes dupicates

    let
        Source = SourceRaw,
        SelectColumns = Table.RemoveColumns(Source,{"Industry Name", "Industry", "Quoted to"}),
        RemoveDuplicates = Table.Distinct(SelectColumns)
    in
        RemoveDuplicates

    Query Industry pivots projects on them

    let
        Source = SourceRaw,
        SelectColumns = Table.SelectColumns(Source,{"Project Name", "Industry", "Industry Name"}),
        ProjectIndustry = Table.Pivot(
            SelectColumns,
            List.Distinct(SelectColumns[Industry]),
            "Industry", "Industry Name", List.Min)
    in
        ProjectIndustry

    Query QutedTo expands quotes to columns

    let
        Source = SourceRaw,
        SelectColumns = Table.SelectColumns(Source,{"Project Name", "Quoted to"}),
        NoBlanks = Table.SelectRows(SelectColumns, each ([Quoted to] <> "")),
        RemoveDuplicates = Table.Distinct(NoBlanks),
        CountRecords = Table.Group(
            RemoveDuplicates,
            {"Project Name"},
            {{"Count", each Table.RowCount(_), Int64.Type}}),
    
        QuoteCount = List.Max(CountRecords[Count]),
        names = List.Transform( {1..QuoteCount}, (q) => "Quote to #" & Text.From(q)),
    
        ListQutedTo = Table.Group(
            RemoveDuplicates,
            {"Project Name"},
            {{"Quoted to", each [#"Quoted to"] }}),
        CombineQuotedTo = Table.TransformColumns(
            ListQutedTo,
            {"Quoted to", each Text.Combine(List.Transform(_, Text.From), "="), type text}),
        ExpandQuotedTo = Table.SplitColumn(
            CombineQuotedTo,
            "Quoted to",
            Splitter.SplitTextByDelimiter("=", QuoteStyle.Csv),
            names)
    in
        ExpandQuotedTo

    Finally merge all togeteher

    let
        MergeWithIndustry = Table.NestedJoin(
            Projects, {"Project Name"},
            Industry, {"Project Name"},
            "Industry",
            JoinKind.LeftOuter),
        IndustryNames = List.Skip( Table.ColumnNames(MergeWithIndustry{0}[Industry]) ),
        ExpandIndustry = Table.ExpandTableColumn(
            MergeWithIndustry,
            "Industry",
            IndustryNames),
    
        MergeQuotedTo = Table.NestedJoin(
            ExpandIndustry, {"Project Name"},
            QuotedTo, {"Project Name"},
            "QuotedTo",
            JoinKind.LeftOuter),
        QuotedToNames = List.Skip( Table.ColumnNames(MergeQuotedTo{0}[QuotedTo]) ),
        ExpandQuotedTo = Table.ExpandTableColumn(
            MergeQuotedTo,
            "QuotedTo",
            QuotedToNames)
    in
        ExpandQuotedTo

15 Replies

  • Jeffino 

    Using aggressively modern formulae

    = LET(
        crosstab,  PIVOTBY(project, category, item, CONCAT,,0,,0),
        totalCost, DROP(SUMIFS(cost, project, TAKE(crosstab,,1)),1),
        HSTACK(crosstab, VSTACK("Total cost", totalCost))
      )

     

    • Jeffino's avatar
      Jeffino
      Copper Contributor

      Riny_van_Eekelen you opened a new world to me with Power Query! Thank you

      I partially solved my problem

       

      Could you please have a look at the attached and help me solving the below issue?

      CSV file is the database

      EXCEL File has the Query and my Target on a separate worksheet

      You might need to link them again

       

      Basically I need to transpose COLUMN: "Quoted to" and not succeeding

      • rachel's avatar
        rachel
        Steel Contributor

        Jeffino 

         

        let
          Source = Csv.Document(File.Contents("XXXX/My_Projects_&_Quotations_2024_06_05_22_36.csv"), [Delimiter = ",", Columns = 15, Encoding = 65001, QuoteStyle = QuoteStyle.None]),
          PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
          groupingKey = List.Difference(Table.ColumnNames(PromotedHeaders), {"Industry Name", "Industry", "Quoted to"}),
          grouped = Table.Group(PromotedHeaders, groupingKey, 
                                {"Records", each 
                                  let 
                                    quotedToList = List.Select(List.Distinct([Quoted to]), each _ <> ""),
                                    quotedToHeaders = List.Transform({0..List.Count(quotedToList)-1}, each "Quoted to" & " " & Number.ToText(_+1)),
                                    quotedToRecord = Record.FromList({[Industry Name]{0}, [Industry]{0}} & quotedToList, {"Industry Name", "Industry"} & quotedToHeaders)
                                  in 
                                    quotedToRecord
                                }),
          headers = List.Union(List.Transform(grouped[Records], each Record.FieldNames(_))),
          ExpandedRecords = Table.ExpandRecordColumn(grouped, "Records", headers, headers)
        in
          ExpandedRecords

         

         

Resources