Forum Discussion
Power Query - removing identical generic text from the start and end of multiple columns
- Nov 08, 2024
Hi Insert_Key
Thanks for sharing all what you shared and TBH I wouldn't have bet on GenAI regarding data types. What they came up with is very similar to what I've seen so far and what I referred to in my previous reply. Though, AI got it wrong when it comes to numbers vs. whole numbers... In the workbook you shared - just for testing purpose - enter a decimal number in column [Parent id] then refresh the query... 😒
In attached v1.2 I fixed it as follow:
DetectDataType = (column as list) as type => if List.IsEmpty(column) then type nullable text // Default to text if empty else let sampleValue = List.First(List.RemoveNulls(column)), // Get the first non-null value valueType = Value.Type(sampleValue) in if valueType = type text then type nullable text else if valueType = type number and (sampleValue = Number.Round(sampleValue)) then type nullable Int64.Type // Check for whole numbers else if valueType = type number then type nullable number // Decimal numbers else if valueType = type logical then type nullable logical else if valueType = type date or valueType = type datetime then type nullable datetime // Handle Date/Time else type nullable text, // Fallback to textand IMHO the last else should be type nullable any and not text but that doesn't matter in your context as any will end as text once the query loads to the Data Model
A comment, just in case... I see you added a Sort step at the end of your query. I guess that's for checks on the query output within Excel. Otherwise, as far as I know, the Data Model doesn't mind about ordered data
Last thing. Now that I understand your solution has no external connection, you can slightly speed up the time to refresh your query by disabling the PQ Firewall. Excel data (tab) > Get Data > Query Options... > CURRENT WORKBOOK > Privacy > Choose Ignore the Privacy Levels...
Hope this helps and makes sense
Attached Query3 merges the Label & Custom fields in 1 step with List.Accumulate:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
SourceLabelColumnNames = Table.ColumnNames( Source ),
TransformedList = List.Transform(SourceLabelColumnNames, each
if Text.Contains(_, "(" ) then Text.BetweenDelimiters(_, "(", ")" )
else try Text.Range(_, 0, Text.PositionOfAny(_, {"0".."9"} ) ) otherwise _
),
DistinctList = List.Distinct( TransformedList ),
ColumnsToMerge = List.Distinct( List.Difference( TransformedList, DistinctList ) ),
MergedColumns = List.Accumulate( ColumnsToMerge, Source,
(accum,name) as table =>
Table.CombineColumns(
accum,
List.Select( SourceLabelColumnNames, each Text.Contains(_, name ) ),
each Combiner.CombineTextByDelimiter(" ; ", QuoteStyle.None)(List.RemoveNulls(_) ),
name
)
),
Result = if List.IsEmpty( ColumnsToMerge ) then Source else MergedColumns
in
Result- Insert_KeyNov 04, 2024Brass Contributor
Thanks for your help again, Lorenzo
Unfortunately, when I open the files, the renaming does not appear to have been successful:
If I didn't do a good job of explaining myself in my post, please see this image that Riny_van_Eekelen posted above as the end result of his solution:
The column names above are exactly what I am after in this sample table 🙂- LorenzoNov 04, 2024Silver Contributor
A better option (in Query5):
TransformedHeaders = Table.TransformColumnNames( MergedColumns, (name) => if Text.Contains( name, "(" ) then Text.BeforeDelimiter( Text.AfterDelimiter( name , "(" ), ")", {0, RelativePosition.FromEnd} ) else name ), Result = if List.IsEmpty( ColumnsToMerge ) then Source else TransformedHeaders in Result- Insert_KeyNov 04, 2024Brass Contributor
This (Query 5) looks awesome, Lorenzo 👏😊 I'll get to plugging it into my actual workbook and will return to mark it as the best response/leave feedback once I've got it up and running! Thanks again.
- LorenzoNov 04, 2024Silver Contributor
Apologies for missing the point re. the other headers. This can be addressed as follow (in attached Query4):
TransformedHeaders = Table.TransformColumnNames( MergedColumns, (name) => if Text.Contains( name, "(" ) then if Text.Contains( name, "))" ) then Text.BetweenDelimiters( name, "(", ")" , 0, 1 ) else Text.BetweenDelimiters( name, "(", ")" ) else name ), Result = if List.IsEmpty( ColumnsToMerge ) then Source else TransformedHeadersIn Query4 I also took another approach to identify the fields to merge. Won't work if you have fields where digits are in the middle of the field name:
ColumnsToMerge = List.Distinct( List.RemoveNulls( List.Transform( SourceLabelColumnNames, each let split = Splitter.SplitTextByCharacterTransition( (c) => not List.Contains({"0".."9"}, c), {"0".."9"})(_) in if List.Count( split ) > 1 then List.First( split ) else null ) ) )