Sep 21 2022 05:24 AM
Hello everybody,
I have a question regarding a data table imported from a PDF file.
I have a product catalogue where a range of codes are 'grouped' in a single cell, expressed as 'Lowest number' '-' 'highest number'. This is not true for all rows in the column.
As the list is over 1100 rows long, is there a way to automatically detect the cells containing ranges and expand them so that every code is individually listed in the range order?
thank you
Sep 21 2022 05:42 AM
@Costan08 If you are confident with Power Query, please see the attached (very simple) example. I took the liberty to assume that the table might contain more than just one column.
Sep 21 2022 06:14 AM
I have never used power query nor the microsoft techcommunity.
Therefore it's difficult for me to understand how you achieved the result you posted, which is the exact answer I am looking for. However, I cannot seem to post neither pictures or the list I am trying to parse.
Could you please explain the process you went through?
Thank you
Sep 21 2022 06:20 AM - edited Sep 21 2022 06:21 AM
@Costan08 Try sharing the file on OneDrive or similar. Otherwise, click on my "name tag" and send it to me via private message.
Power Query take a bit of an effort to learn, but once you do, you "can't live without it".
Sep 21 2022 07:14 AM
Sep 21 2022 07:26 AM
@Costan08 Aha! Alpha-numeric is a different thing. Am attaching the PDF you sent to me. Don't have time right now to look into it. Perhaps someone else.
Sep 21 2022 10:42 PM
@Costan08 Please find a partial PQ solution. I call it partial because I didn't manage to get the alpha-numeric ranges to perfection. Although the number sequences are in place I couldn't think of an easy way to re-insert the characters. You'll find one column containing the original range followed by one which just numbers but in a correct sequence.
See if this is something you can live with. If not, someone else has to step in and fix it. Or perhaps think of a better way all-together.
Sep 22 2022 01:02 AM
Your solution taught me a lot. I appreciate it. To resolve the very last bit of house cleaning (re-inserting the characters) I offer this:
DG_ListDiff = Table.AddColumn(#"Expanded Custom.2", "List Difference", each List.Difference(Text.ToList([#"Range colori.1"]),Text.ToList(Text.From([#"Custom - Copy"])))),
DG2_ExtractList = Table.TransformColumns(DG_ListDiff, {"List Difference", each Text.Combine(List.Transform(_, Text.From)), type text}),
DG3_PositionOf = Table.AddColumn(DG2_ExtractList, "PositionOf", each List.PositionOfAny(Text.ToList([#"Range colori.1"]),Text.ToList([List Difference]))),
DG4_ListInsert = Table.AddColumn(DG3_PositionOf, "Rebuild Alpha", each List.InsertRange(Text.ToList(Text.From([#"Custom.2"])),[PositionOf],Text.ToList([List Difference]))),
DG5_RebuildAlpha = Table.TransformColumns(DG4_ListInsert, {"Rebuild Alpha", each Text.Combine(List.Transform(_, Text.From)), type text}),
#"Removed Columns3" = Table.RemoveColumns(DG5_RebuildAlpha,{"Range colori.2", "Custom", "Custom.1", "Range colori.1", "List Difference", "PositionOf","Custom.2","Custom - Copy"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns3",{{"Rebuild Alpha", "Colori"}}),
The end result (refer to "Rebuild Alpha" vs "Range colori" to see the text characters have been inserted into the numerical sequence. This requires contiguous text characters when there are multiple - I looked briefly and didn't notice non-contiguous text characters - but admittedly did not do a formal validation to ensure such.
Sep 22 2022 01:32 AM
@DexterG_III Nice, but unfortunately, the color codes do have non-continues character sequences. I noticed 5 different alpha-numeric structures:
n, nCn, Cn, CnC and CnCn where n stands for a number (or number sequence) and C stands for a character (or character sequence).
So, you will find codes like A90000 and AA229A just to give an example.
Sep 22 2022 01:35 AM
Sep 22 2022 01:48 AM
@Riny_van_Eekelen So all this time my hoping the data would be simple enough to support my formula has been a failed tactic. Reality sucks. lol.
Sincerely though, thanks for confirming. Is there a max number of non-contiguous possibilities? Is it possible to have CnCnC or more?
Dexter
Sep 22 2022 01:59 AM
@DexterG_III Not as far I could discover.
Sep 22 2022 02:41 AM
@Riny_van_Eekelen I unfortunately closed that workbook without option to recover and need to sleep soon, but I do believe it's possible to
I admit, it's not clean and there's likely a much more elegant solution but I do believe that would work in a pinch.
Sorry I'm not able to provide the actual code now.
I'll check in tomorrow and provide if this thread remains unresolved by then.
Sep 22 2022 04:54 AM
Solution@Costan08 Perhaps the attached file is what you want. It allows for the five code structures that I mentioned earlier. It seems a rather clumsy approach, though.
The combined tables from the PDF contained 2238 line-items. The query produces 14648 line-items (216 more than in my previous attempt). Difficult to be sure that this one is complete now. Check it thoroughly please!
Sep 22 2022 04:54 AM
Solution@Costan08 Perhaps the attached file is what you want. It allows for the five code structures that I mentioned earlier. It seems a rather clumsy approach, though.
The combined tables from the PDF contained 2238 line-items. The query produces 14648 line-items (216 more than in my previous attempt). Difficult to be sure that this one is complete now. Check it thoroughly please!