Forum Discussion

mohamedsaleem18's avatar
mohamedsaleem18
Copper Contributor
Jul 06, 2021
Solved

Text (address) formatting in Excel

I'm new to Excel text formatting.  Can you please help me to do this below text formatting

Requirement:

 Address column is a free text.  This should be split in to three column. 

  1.  1st column should contain house number.
  2. 2nd column should contain 40 characters after house number in the first column.  If the 40th character ends in between a text then the substring should end space before that text.
  3. 3rd column - If any text is left after 40 characters then the remaining 40 should be in this column. 
  4.  If there is any comma in the text then it should be removed.
  5. “No” or “No:” in the address are not required.

Example

Original text:

Address

No: 303 Exte 7/23e Taiua Daau Keota

18 Exte 7/23e Taiua Daau Keota

No 10-3-23 Exte 2/1A

No 851 External jasa pssang thnsgal 080000 Subngai Keota

No 5 Exte sri rssrawak 30 tman ram spaalas klang

No:303 Exten 17 Jenjarom 426000, Bihar

After formatting:

Column 1

Column 2

Column 3

303

Exte 7/23e Taiua Daau Keota

 

18

Exte 7/23e Taiua Daau Keota

 

10-3-23

Exte 2/1A

 

851

External jasa pssang thnsgal 080000

Subngai Keota

5

Exte sri rssrawak 30 tman ram spaalas

klang

303

Exten 17 Jenjarom 426000, Bihar

 

 

  • SergeiBaklan's avatar
    SergeiBaklan
    Jul 06, 2021

    HansVogelaar 

    Another mockup with lambdas

    prefix = {"No", "No:"};
    removeFirst = lambda(str, txt, SUBSTITUTE(str, txt, "",1) );
    
    ////
    noPrefix = lambda(str, k,
        IF(k=0, TRIM(str),
            LET(t, removeFirst(str, INDEX(prefix,1,k)),
                noPrefix(t, k-1) )
        ));
    cleanText = lambda(str, noPrefix(str, COLUMNS(prefix)));
    nFirstSpace = lambda( str, FIND(" ", str)-1);
    textStart = lambda(str, n, TRIM(LEFT(str, n)));
    textEnd = lambda(str, n, TRIM(RIGHT(str, LEN(str)-n)));
    lastPos = lambda(str, chr, n,
        IF(RIGHT(str)=chr, n,
         lastPos(LEFT(str, n-1), chr, n-1)));
    posRightSpace = lambda(str, lastPos(str, " ", LEN(str)));
    
    ////
    splitParts = lambda( str,
        LET(
            txt, cleanText(str),
            nA, nFirstSpace(txt),
            partA, textStart(txt, nA),
            partBC, textEnd(txt, nA),
            first40, textStart(partBC, 40),
            space40, posRightSpace(first40), 
            partB, IF(LEN(partBC)<=40,
                    partBC,
                    textStart(partBC, space40)
            ),
            partC, IF(LEN(partBC)<=40,
                    "",
                    textEnd(partBC, space40)
            ),
            IFERROR(CHOOSE({1,2,3}, partA, partB, partC),"")
        )
        );
    

10 Replies

  • mohamedsaleem18 

    Copy the following function into a module in the Visual Basic Editor:

    Function SplitAddress(ByVal s As String)
        Dim p As Long
        Dim v(1 To 3) As String
        If Left(s, 3) = "No " Then
            s = Mid(s, 4)
        ElseIf Left(s, 3) = "No:" Then
            s = Trim(Mid(s, 4))
        End If
        p = InStr(s, " ")
        v(1) = Left(s, p - 1)
        s = Mid(s, p + 1)
        If Len(s) <= 40 Then
            v(2) = s
        Else
            If Mid(s, 41, 1) = " " Then
                p = 41
            Else
                p = InStrRev(s, " ", 40)
            End If
            v(2) = Left(s, p - 1)
            v(3) = Mid(s, p + 1)
        End If
        SplitAddress = v
    End Function

    Let's say the addresses start in A2.

    Select B2:D2.

    Enter the following formula and if you don't have Excel in Microsoft 365, confirm it with Ctrl+Shift+Enter.

    =SplitAddress(A2)

    Fill down.

    Save the workbook as a macro-enabled workbook (.xlsm) and make sure that you allow macros when you open it.

     

    Others will probably post a solution using Lambda functions or PowerQuery.

    • SergeiBaklan's avatar
      SergeiBaklan
      Diamond Contributor

      HansVogelaar 

      Another mockup with lambdas

      prefix = {"No", "No:"};
      removeFirst = lambda(str, txt, SUBSTITUTE(str, txt, "",1) );
      
      ////
      noPrefix = lambda(str, k,
          IF(k=0, TRIM(str),
              LET(t, removeFirst(str, INDEX(prefix,1,k)),
                  noPrefix(t, k-1) )
          ));
      cleanText = lambda(str, noPrefix(str, COLUMNS(prefix)));
      nFirstSpace = lambda( str, FIND(" ", str)-1);
      textStart = lambda(str, n, TRIM(LEFT(str, n)));
      textEnd = lambda(str, n, TRIM(RIGHT(str, LEN(str)-n)));
      lastPos = lambda(str, chr, n,
          IF(RIGHT(str)=chr, n,
           lastPos(LEFT(str, n-1), chr, n-1)));
      posRightSpace = lambda(str, lastPos(str, " ", LEN(str)));
      
      ////
      splitParts = lambda( str,
          LET(
              txt, cleanText(str),
              nA, nFirstSpace(txt),
              partA, textStart(txt, nA),
              partBC, textEnd(txt, nA),
              first40, textStart(partBC, 40),
              space40, posRightSpace(first40), 
              partB, IF(LEN(partBC)<=40,
                      partBC,
                      textStart(partBC, space40)
              ),
              partC, IF(LEN(partBC)<=40,
                      "",
                      textEnd(partBC, space40)
              ),
              IFERROR(CHOOSE({1,2,3}, partA, partB, partC),"")
          )
          );
      
    • SergeiBaklan's avatar
      SergeiBaklan
      Diamond Contributor

      HansVogelaar 

      Tried with Power Query. Not optimal, at least from performance point of view.

      with

      let
          Source = Excel.CurrentWorkbook(){[Name="range"]}[Content],
          filterEmptyRows = Table.SelectRows(Source, each ([Column1] <> null)),
          promotHeader = Table.PromoteHeaders(filterEmptyRows, [PromoteAllScalars=true]),
          removeNo = Table.AddColumn(
              promotHeader,
              "Custom",
              each
                  if Text.Start(Text.Lower([Address]), 2) = "no"
                  then Text.TrimStart(Text.Range([Address], 3))
                  else
                      if Text.Start(Text.Lower([Address]), 3) = "no:"
                      then Text.TrimStart(Text.Range([Address], 4))
                      else Text.TrimStart([Address])
          ),
          keepResult = Table.SelectColumns(removeNo,{"Custom"}),
          extractLeftNumber = Table.SplitColumn(
              keepResult,
              "Custom",
              Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false),
              {"Part.1", "Parts"}
          ),
          rightSpacePosition = Table.AddColumn(
              extractLeftNumber,
              "Custom",
              each Text.PositionOf(Text.Start([Parts], 40), " ", Occurrence.Last)
          ),
          first40 = Table.AddColumn(
              rightSpacePosition,
              "Part.2",
              each
                  if Text.Length([Parts]) > 40
                  then Text.Start([Parts],[Custom])
                  else [Parts]
          ),
          after40 = Table.AddColumn(
              first40,
              "Part.3",
              each
                  if Text.Length([Parts]) > 40
                  then Text.Trim(Text.Range([Parts],[Custom]))
                  else ""
          ),
          final = Table.SelectColumns(after40,{"Part.1", "Part.2", "Part.3"})
      in
          final
    • mohamedsaleem18's avatar
      mohamedsaleem18
      Copper Contributor

      HansVogelaar Thank you for your response.  I am getting "unrecognized text in formula error".  Can you please help me to resolve this

       

Resources