SOLVED

Expanding range stated in same cell.

Copper Contributor

Hello everybody,

 

I have a question regarding a data table imported from a PDF file. 

 

I have a product catalogue where a range of codes are 'grouped' in a single cell, expressed as 'Lowest number' '-' 'highest number'. This is not true for all rows in the column.

 

As the list is over 1100 rows long, is there a way to automatically detect the cells containing ranges and expand them so that every code is individually listed in the range order? 

 

thank you

 

13 Replies

@Costan08 If you are confident with Power Query, please see the attached (very simple) example. I took the liberty to assume that the table might contain more than just one column.

Riny_van_Eekelen_0-1663764054801.png

 

@Riny_van_Eekelen 

 

I have never used power query nor the microsoft techcommunity. 

 

Therefore it's difficult for me to understand how you achieved the result you posted, which is the exact answer I am looking for. However, I cannot seem to post neither pictures or the list I am trying to parse.

 

Could you please explain the process you went through? 

 

Thank you 

@Costan08 Try sharing the file on OneDrive or similar. Otherwise, click on my "name tag" and send it to me via private message.

 

Power Query take a bit of an effort to learn, but once you do, you "can't live without it".

I have tried to reverse-engineer your power query, however I cannot seem to automatically fill the values in between the range extremes. I am not sure if this has to do to the fact that some of the ranges include alphanumeric values. Anyhow, I have sent you a copy of the PDF list I am trying to expand.

@Costan08 Aha! Alpha-numeric is a different thing. Am attaching the PDF you sent to me. Don't have time right now to look into it. Perhaps someone else.

@Costan08 Please find a partial PQ solution. I call it partial because I didn't manage to get the alpha-numeric ranges to perfection. Although the number sequences are in place I couldn't think of an easy way to re-insert the characters. You'll find one column containing the original range followed by one which just numbers but in a correct sequence.

 

See if this is something you can live with. If not, someone else has to step in and fix it. Or perhaps think of a better way all-together. 

@Riny_van_Eekelen 

 

Your solution taught me a lot.  I appreciate it.  To resolve the very last bit of house cleaning (re-inserting the characters) I offer this: 

 

DexterG_III_3-1663833157831.png

 

    DG_ListDiff = Table.AddColumn(#"Expanded Custom.2", "List Difference", each List.Difference(Text.ToList([#"Range colori.1"]),Text.ToList(Text.From([#"Custom - Copy"])))),
    DG2_ExtractList = Table.TransformColumns(DG_ListDiff, {"List Difference", each Text.Combine(List.Transform(_, Text.From)), type text}),
    DG3_PositionOf = Table.AddColumn(DG2_ExtractList, "PositionOf", each List.PositionOfAny(Text.ToList([#"Range colori.1"]),Text.ToList([List Difference]))),
    DG4_ListInsert = Table.AddColumn(DG3_PositionOf, "Rebuild Alpha", each List.InsertRange(Text.ToList(Text.From([#"Custom.2"])),[PositionOf],Text.ToList([List Difference]))),
    DG5_RebuildAlpha = Table.TransformColumns(DG4_ListInsert, {"Rebuild Alpha", each Text.Combine(List.Transform(_, Text.From)), type text}),
    #"Removed Columns3" = Table.RemoveColumns(DG5_RebuildAlpha,{"Range colori.2", "Custom", "Custom.1", "Range colori.1", "List Difference", "PositionOf","Custom.2","Custom - Copy"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns3",{{"Rebuild Alpha", "Colori"}}),

 

The end result (refer to "Rebuild Alpha" vs "Range colori" to see the text characters have been inserted into the numerical sequence.   This requires contiguous text characters when there are multiple - I looked briefly and didn't notice non-contiguous text characters - but admittedly did not do a formal validation to ensure such.  

 

DexterG_III_4-1663833207298.png

 

 

@DexterG_III Nice, but unfortunately, the color codes do have non-continues character sequences. I noticed 5 different alpha-numeric structures:

n, nCn, Cn, CnC and CnCn where n stands for a number (or number sequence) and C stands for a character (or character sequence).

So, you will find codes like A90000 and AA229A just to give an example.

I was trying to re-insert the numbers by splitting columns an re-combining them based off the knowledge I have about how the codes are structured. This is a tedious and error prone process though. Again, though, I am a power query novice so I am unsure if there is a better process to achieve the same result.

@Riny_van_Eekelen So all this time my hoping the data would be simple enough to support my formula has been a failed tactic.  Reality sucks.  lol.  

 

Sincerely though, thanks for confirming.  Is there a max number of non-contiguous possibilities?  Is it possible to have CnCnC or more?  

 

Dexter

 

 

@DexterG_III Not as far I could discover.

@Riny_van_Eekelen I unfortunately closed that workbook without option to recover and need to sleep soon, but I do believe it's possible to

  1. Add two more columns directly after line 3 of the code above and use Text.Range in combination with the value from DG3_PositionOf to parse the remaining portion (after the first text character(s) were recognized) of both Range colori.1 & Custom - Copy .
  2. Add another column & use List Difference comparing the two new columns to return the second set of non contiguous characters
  3. Insert another column to return the position of the 2nd set of non-contiguous characters and add that to the original DG3_PositionOf value
  4.  After the DG5RebuildAplpha step, use the same logic to insert the 2nd set of non-contiguous characters. 

I admit, it's not clean and there's likely a much more elegant solution but I do believe that would work in a pinch.  

 

Sorry I'm not able to provide the actual code now.  

 

I'll check in tomorrow and provide if this thread remains unresolved by then.  

best response confirmed by Costan08 (Copper Contributor)
Solution

@Costan08 Perhaps the attached file is what you want. It allows for the five code structures that I mentioned earlier. It seems a rather clumsy approach, though.

 

The combined tables from the PDF contained 2238 line-items. The query produces 14648 line-items (216 more than in my previous attempt). Difficult to be sure that this one is complete now. Check it thoroughly please!

 

1 best response

Accepted Solutions
best response confirmed by Costan08 (Copper Contributor)
Solution

@Costan08 Perhaps the attached file is what you want. It allows for the five code structures that I mentioned earlier. It seems a rather clumsy approach, though.

 

The combined tables from the PDF contained 2238 line-items. The query produces 14648 line-items (216 more than in my previous attempt). Difficult to be sure that this one is complete now. Check it thoroughly please!

 

View solution in original post