Forum Discussion
Re: Power Query - removing identical generic text from the start and end of multiple columns
In attached file the query appears to do it...
On sheet 'ExpectedHeaders' there's currently a query that pulls the QueryOutput headers to compare them with the expected one (I used XMATCH assuming you run Excel >/= 2021 otherwise let me know)
10 Replies
- LorenzoSilver Contributor
There was a little mistake in QueryResult line4:
SourceHeaders = Table.FromColumns( { Table.ColumnNames( JiraExport ) }, {"JiraHeader"} ), // Must be replaced with: SourceHeaders = Table.FromColumns( { Table.ColumnNames( Source ) }, {"JiraHeader"} ),Fixed in attached file with my apologies
- LorenzoSilver Contributor
I did something in attached v2 for:
- Created a mapping table (MapCustomFieldNumber) in the PQ Editor. Probably you'll want to have it as a Table in your workbook for easier expansion/maintenance
- Updated the NewColumnNames step to merge (after converting it as table) the list of ColumnsToRename with the mapping table. Then we Expand the [Replacer] field from the mapping table and recombine the appropriate strings. Any question about this let me know
- Insert_KeyBrass Contributor
Hi Lorenzo
Thank you for all of your hard work to find a solution for me.
I must confess that I've become a bit confused by the many versions and their different approaches. I inserted your V1 with corrected source string (immediately previous comment) and it worked... but then I saw that there was another post beneath it, so grabbed that code and updated my workbook.
They both seemed to work fine, so I'm unsure which - if either - is superior. When I "Refreshed All" within V1, it took a little over three minutes to complete updating all queries, but I'd made the mistake of unnecessarily loading the JiraExport to Data Model, and the Results as a table to the workbook. When I wrote V2 over the top, I corrected this - JiraExport is Connection Only and Results are loaded to the Data Model so that I can write some measures and create some Power Pivot Tables and Charts.
With the queries loaded as described, it takes two minutes to refresh all from a data source of 3202 rows. I did modify JiraExport in its own query (i.e. so the source is updated before it's handled by your solution) by removing the Issue id and Parent id columns, and add a couple of steps to QueryResult to address blanks (converting them to nulls) and applying an autodetect of data type:
RenamedMergedColumns = Table.RenameColumns( MergedColumns, List.Zip({ColumnsToRename, NewColumnNames}) ), ReplacedEmptyWithNull = Table.ReplaceValue(RenamedMergedColumns,"",null,Replacer.ReplaceValue,{"Issue Type", "Priority", "Issue key", "Status", "Resolution", "Created", "Updated", "Resolved", "Summary", "Change type", "Impacted Business System (BS)", "Environment (BS)", "Downstream / Upstream Impacted Business System/s (Secondary)", "Impacted Agency / Agencies (BS)", "Environment.", "Number of Users", "Expected Outage", "Change Category", "Labels"}), ChangedType = Table.TransformColumnTypes(ReplacedEmptyWithNull,{{"Issue Type", type text}, {"Priority", type text}, {"Issue key", type text}, {"Status", type text}, {"Resolution", type text}, {"Created", type datetime}, {"Updated", type datetime}, {"Resolved", type datetime}, {"Summary", type text}, {"Change type", type text}, {"Impacted Business System (BS)", type text}, {"Environment (BS)", type text}, {"Downstream / Upstream Impacted Business System/s (Secondary)", type text}, {"Impacted Agency / Agencies (BS)", type text}, {"Environment.", type text}, {"Number of Users", type text}, {"Expected Outage", type text}, {"Change Category", type text}, {"Labels", type text}}), Result = if Table.IsEmpty(FilteredCount) then Source else ChangedType in ResultThere is no maintenance required for me with this mapping table approach is there... is it dynamic in response to the count of how many headers are being merged, or may I need to manually intervene sometime? I don't really understand how it functions.
EDIT: Actually, I've just has a look at it, and I'm not sure that it serves a function for my requirement? The "(BS)", I believe is annotation to differentiate those fields from others that are utilised by other projects/teams. Not a great approach, but not my Jira project and not my problem 😁 Even the "." after "Environment" has been placed there intentionally for the same reason.
"Downstream / Upstream Impacted Business System/s (Secondary)" is a Jira field name should always read exactly that and never be transformed/translate to "Downstream / Upstream Impacted Business System/s (Third)" or anything else. I hope that clarifies 🙂
The transformed data that has come out of the back looks great; I would be very pleased to select a final version and get cracking with my analysis 🤓
Is the latest the greatest, as the saying goes? Which version would you recommend I adopt and stick with?
Thanks again, so much!
For reference, I am in Darwin, NT, Australia. I just wanted to make you aware, as I believe that there is a difference in our time zones and I'm normally not active when you are. I am meant to knock off work in about 30 minutes but do tend to stay back a bit on most nights; fighting the urge to do so tonight, as I feel like a lovely colleague has shared their cold with me 🙄😩