Forum Discussion

DGuzmanG's avatar
DGuzmanG
Copper Contributor
Mar 06, 2020
Solved

How to number each occurrence of a substring in a cell in Power Query?

Hello,

I'm fairly new to Power Query and have hit a hiccup that's been bothering me all day. I've read multiple threads here and on the Power BI community and none has really cleared my question, and my logic suggests a few different options to achieve what I want, but my lack of experience blocks any solution I attempt.

Context: I'm building a database for product import/export into WooCommerce, eBay and other channels; which takes some inputs by the (non tech savyy) users in Excel and develops several of the required fields. One of those is the image file names for each product.

I have this columns (in a much larger query table):

Image Base NameImageQtyImageIDs
product-name.jpg3product-name.jpg product-name.jpg product-name.jpg
other-product.jpg5other-product.jpg other-product.jpg...other-product.jpg

And my desired output would be:

 

ImageBaseNameImageQtyImageIDs
product-name.jpg3product-name.jpg product-name.jpg product-name.jpg
other-product.jpg5other-product-1.jpg other-product-2.jpg...other-product-5.jpg

In fact I don't need the two first columns if I get the ImageIDs like that. The ImageBaseName column is generated from the input product name. The ImageQTY column is direct input by the user. The ImageIDs column I got so far is from using:

= Table.AddColumn(#"previous step", "ImageIDs", each Text.Trim(Text.Repeat ([ImageBaseName]&" ", [ImageQty]))) 

And these are the options I've considered thus far:

Option 1: Text.Combine(Text.Split ImageIDs and (somehow) count and number each item in the list) and concatenate it all back... Which would probably start like this: Text.Combine(Text.Split,,,

Option 2 Using the UI, splitting the ImageIDs by each space and by a high number of columns (as I don't know how many images each product will have, but probably no more than 12) and then assign a number suffix to each of those columns and then putting it all back together, but it feels messy as hell.

Option 3 Probably theres a clean calculated way to generate the numbered image base names based on the number in the second column, and then attach the .jpg at the end of each, but honestly I don't know how.

I'd like it to be on the same table as I am already dealing with different queries...

Any help would be gladly accepted.

  • SergeiBaklan's avatar
    SergeiBaklan
    Mar 06, 2020

    ChrisMendoza 

    I'd simplify a bit

    let
        Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WKijKTylNLtHNS8xN1csqSFfSUTJWitWJVsovyUgt0oVKQ2VMlWJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Image Base Name" = _t, ImageQty = _t]),
        AddImageIDs = Table.AddColumn(
            Source,
            "ImageIDs",
            each
                Text.Combine(
                    List.Transform(
                        {1..Number.FromText([ImageQty])},
                        (x)=>
                            Text.BeforeDelimiter([Image Base Name],".") &
                            "-" &
                            Text.From(x) &
                            "." &
                            Text.AfterDelimiter([Image Base Name],".")
                    ),
                    " "
                )
        )
    in
        AddImageIDs

     

3 Replies

  • ChrisMendoza's avatar
    ChrisMendoza
    Iron Contributor

    DGuzmanG -

    Will this work for you?

    let
        Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WKijKTylNLtHNS8xN1csqSFfSUTJWitWJVsovyUgt0oVKQ2VMlWJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Image Base Name" = _t, ImageQty = _t]),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Image Base Name", type text}, {"ImageQty", Int64.Type}}),
        #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each List.Transform({1 ..Number.From([ImageQty])}, each Number.From(_))),
        #"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
        #"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Custom.1", each Text.Insert([Image Base Name],Text.PositionOf([Image Base Name],".",Occurrence.Last),"-"&Text.From([Custom]))),
        #"Grouped Rows" = Table.Group(#"Added Custom1", {"Image Base Name", "ImageQty"}, {{"Data", each _, type table [Image Base Name=text, ImageQty=number, Custom=number, Custom.1=text]}}),
        #"Added Custom2" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.Column([Data],"Custom.1")),
        #"Extracted Values" = Table.TransformColumns(#"Added Custom2", {"Custom", each Text.Combine(List.Transform(_, Text.From), " "), type text}),
        #"Removed Columns" = Table.RemoveColumns(#"Extracted Values",{"Data"})
    in
        #"Removed Columns"
    • SergeiBaklan's avatar
      SergeiBaklan
      Diamond Contributor

      ChrisMendoza 

      I'd simplify a bit

      let
          Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WKijKTylNLtHNS8xN1csqSFfSUTJWitWJVsovyUgt0oVKQ2VMlWJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Image Base Name" = _t, ImageQty = _t]),
          AddImageIDs = Table.AddColumn(
              Source,
              "ImageIDs",
              each
                  Text.Combine(
                      List.Transform(
                          {1..Number.FromText([ImageQty])},
                          (x)=>
                              Text.BeforeDelimiter([Image Base Name],".") &
                              "-" &
                              Text.From(x) &
                              "." &
                              Text.AfterDelimiter([Image Base Name],".")
                      ),
                      " "
                  )
          )
      in
          AddImageIDs

       

      • DGuzmanG's avatar
        DGuzmanG
        Copper Contributor

        SergeiBaklan @ChrisMendoza 

        Thanks for your answers guys!!

        Like Sergei mentioned, the solution could be simplified and I used a similar solution to what he suggested, which came up on Stack Overflow:

         

        What I did, was to remove the ".jpg" suffix in a previous step where it generated the ImageBaseName, so that it would just output what should be the filename. And it allowed me to generate the names based on the two first columns of my example (no "ImageIDs" required).

        It looked like this:

         

        #"ImageIDsFromList" = Table.AddColumn (
           #"PreviousStep", 
           "ImageIDsList", 
           each 
            Text.Combine(
              List.Transform(
                 List.Zip(
                    {
                    List.Repeat(
                       {[ImageBaseName]},
                       [ImageQTY]),
                    List.Transform(
                          {1..[ImageQTY]}, each "-" & Number.ToText(_) & ".jpg"
                       )
                    }
                 )
               , each Text.Combine(_)
              )
              , ", "
             )
           )
        IN
        ImageIDsFromList

         

         

Resources