Forum Discussion
Jeffino
May 25, 2024Copper Contributor
Removing duplicates and sorting data in columns instead of rows
Hello, one of our software exports data in the following format. I don't want to touch this file since it will be exported frequently. I want to create a different Excel file with formulas whi...
- Jun 08, 2024
As variant bit more close to UI.
First query, SourceRaw, loads the file and declares data types (not sure which exactly locale shall be used)
let pFilepath = "C:\Test\My_Projects_&_Quotations_2024_06_05_22_36.csv", Source = Csv.Document( File.Contents(pFilepath), [Delimiter=",", Columns=15, Encoding=65001, QuoteStyle=QuoteStyle.None] ), PromoteHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars=true]), DeclareType = Table.TransformColumnTypes( PromoteHeaders, { {"Gross Value", Currency.Type} , {"Date Created", type datetime} , {"Date Modified", type datetime} , {"Completion date", type date} }, "fr-LU") in DeclareType
Referenced SourceRaw query, Projects, keeps columns with repeated values for each project and removes dupicates
let Source = SourceRaw, SelectColumns = Table.RemoveColumns(Source,{"Industry Name", "Industry", "Quoted to"}), RemoveDuplicates = Table.Distinct(SelectColumns) in RemoveDuplicates
Query Industry pivots projects on them
let Source = SourceRaw, SelectColumns = Table.SelectColumns(Source,{"Project Name", "Industry", "Industry Name"}), ProjectIndustry = Table.Pivot( SelectColumns, List.Distinct(SelectColumns[Industry]), "Industry", "Industry Name", List.Min) in ProjectIndustry
Query QutedTo expands quotes to columns
let Source = SourceRaw, SelectColumns = Table.SelectColumns(Source,{"Project Name", "Quoted to"}), NoBlanks = Table.SelectRows(SelectColumns, each ([Quoted to] <> "")), RemoveDuplicates = Table.Distinct(NoBlanks), CountRecords = Table.Group( RemoveDuplicates, {"Project Name"}, {{"Count", each Table.RowCount(_), Int64.Type}}), QuoteCount = List.Max(CountRecords[Count]), names = List.Transform( {1..QuoteCount}, (q) => "Quote to #" & Text.From(q)), ListQutedTo = Table.Group( RemoveDuplicates, {"Project Name"}, {{"Quoted to", each [#"Quoted to"] }}), CombineQuotedTo = Table.TransformColumns( ListQutedTo, {"Quoted to", each Text.Combine(List.Transform(_, Text.From), "="), type text}), ExpandQuotedTo = Table.SplitColumn( CombineQuotedTo, "Quoted to", Splitter.SplitTextByDelimiter("=", QuoteStyle.Csv), names) in ExpandQuotedTo
Finally merge all togeteher
let MergeWithIndustry = Table.NestedJoin( Projects, {"Project Name"}, Industry, {"Project Name"}, "Industry", JoinKind.LeftOuter), IndustryNames = List.Skip( Table.ColumnNames(MergeWithIndustry{0}[Industry]) ), ExpandIndustry = Table.ExpandTableColumn( MergeWithIndustry, "Industry", IndustryNames), MergeQuotedTo = Table.NestedJoin( ExpandIndustry, {"Project Name"}, QuotedTo, {"Project Name"}, "QuotedTo", JoinKind.LeftOuter), QuotedToNames = List.Skip( Table.ColumnNames(MergeQuotedTo{0}[QuotedTo]) ), ExpandQuotedTo = Table.ExpandTableColumn( MergeQuotedTo, "QuotedTo", QuotedToNames) in ExpandQuotedTo
Jeffino
Jun 07, 2024Copper Contributor
Riny_van_Eekelen you opened a new world to me with Power Query! Thank you
I partially solved my problem
Could you please have a look at the attached and help me solving the below issue?
CSV file is the database
EXCEL File has the Query and my Target on a separate worksheet
You might need to link them again
Basically I need to transpose COLUMN: "Quoted to" and not succeeding
rachel
Jun 08, 2024Steel Contributor
let
Source = Csv.Document(File.Contents("XXXX/My_Projects_&_Quotations_2024_06_05_22_36.csv"), [Delimiter = ",", Columns = 15, Encoding = 65001, QuoteStyle = QuoteStyle.None]),
PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
groupingKey = List.Difference(Table.ColumnNames(PromotedHeaders), {"Industry Name", "Industry", "Quoted to"}),
grouped = Table.Group(PromotedHeaders, groupingKey,
{"Records", each
let
quotedToList = List.Select(List.Distinct([Quoted to]), each _ <> ""),
quotedToHeaders = List.Transform({0..List.Count(quotedToList)-1}, each "Quoted to" & " " & Number.ToText(_+1)),
quotedToRecord = Record.FromList({[Industry Name]{0}, [Industry]{0}} & quotedToList, {"Industry Name", "Industry"} & quotedToHeaders)
in
quotedToRecord
}),
headers = List.Union(List.Transform(grouped[Records], each Record.FieldNames(_))),
ExpandedRecords = Table.ExpandRecordColumn(grouped, "Records", headers, headers)
in
ExpandedRecords
- JeffinoJun 08, 2024Copper Contributor
rachel Thank you so much for this but i dont really know how to apply these steps...could you please elaborate it more or share with me the file? Much appreciate it
- rachelJun 08, 2024Steel ContributorYoutube video:
https://www.youtube.com/watch?v=EZXOQylIjVo
Start watching from 1:30, it shows where to paste the M code.
You will need to change the file path of the csv in "Source" too... - SergeiBaklanJun 08, 2024MVP
As variant bit more close to UI.
First query, SourceRaw, loads the file and declares data types (not sure which exactly locale shall be used)
let pFilepath = "C:\Test\My_Projects_&_Quotations_2024_06_05_22_36.csv", Source = Csv.Document( File.Contents(pFilepath), [Delimiter=",", Columns=15, Encoding=65001, QuoteStyle=QuoteStyle.None] ), PromoteHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars=true]), DeclareType = Table.TransformColumnTypes( PromoteHeaders, { {"Gross Value", Currency.Type} , {"Date Created", type datetime} , {"Date Modified", type datetime} , {"Completion date", type date} }, "fr-LU") in DeclareType
Referenced SourceRaw query, Projects, keeps columns with repeated values for each project and removes dupicates
let Source = SourceRaw, SelectColumns = Table.RemoveColumns(Source,{"Industry Name", "Industry", "Quoted to"}), RemoveDuplicates = Table.Distinct(SelectColumns) in RemoveDuplicates
Query Industry pivots projects on them
let Source = SourceRaw, SelectColumns = Table.SelectColumns(Source,{"Project Name", "Industry", "Industry Name"}), ProjectIndustry = Table.Pivot( SelectColumns, List.Distinct(SelectColumns[Industry]), "Industry", "Industry Name", List.Min) in ProjectIndustry
Query QutedTo expands quotes to columns
let Source = SourceRaw, SelectColumns = Table.SelectColumns(Source,{"Project Name", "Quoted to"}), NoBlanks = Table.SelectRows(SelectColumns, each ([Quoted to] <> "")), RemoveDuplicates = Table.Distinct(NoBlanks), CountRecords = Table.Group( RemoveDuplicates, {"Project Name"}, {{"Count", each Table.RowCount(_), Int64.Type}}), QuoteCount = List.Max(CountRecords[Count]), names = List.Transform( {1..QuoteCount}, (q) => "Quote to #" & Text.From(q)), ListQutedTo = Table.Group( RemoveDuplicates, {"Project Name"}, {{"Quoted to", each [#"Quoted to"] }}), CombineQuotedTo = Table.TransformColumns( ListQutedTo, {"Quoted to", each Text.Combine(List.Transform(_, Text.From), "="), type text}), ExpandQuotedTo = Table.SplitColumn( CombineQuotedTo, "Quoted to", Splitter.SplitTextByDelimiter("=", QuoteStyle.Csv), names) in ExpandQuotedTo
Finally merge all togeteher
let MergeWithIndustry = Table.NestedJoin( Projects, {"Project Name"}, Industry, {"Project Name"}, "Industry", JoinKind.LeftOuter), IndustryNames = List.Skip( Table.ColumnNames(MergeWithIndustry{0}[Industry]) ), ExpandIndustry = Table.ExpandTableColumn( MergeWithIndustry, "Industry", IndustryNames), MergeQuotedTo = Table.NestedJoin( ExpandIndustry, {"Project Name"}, QuotedTo, {"Project Name"}, "QuotedTo", JoinKind.LeftOuter), QuotedToNames = List.Skip( Table.ColumnNames(MergeQuotedTo{0}[QuotedTo]) ), ExpandQuotedTo = Table.ExpandTableColumn( MergeQuotedTo, "QuotedTo", QuotedToNames) in ExpandQuotedTo
- JeffinoJun 11, 2024Copper Contributor
Thank you all very much! Really appreciate it!
I have used SergeiBaklan response since it is more in line with my capabilities and knowledge
Great community!