Forum Discussion

RecycleBin_Rob's avatar
RecycleBin_Rob
Copper Contributor
Oct 09, 2024

Extract Address from Cell Text

Background:

I'm working with address data that uses the quadrant system (ends with NW, NE, SW, or SE). There are only very few instances where it ends with N or S.

 

I have a column of data that starts with an address and is combined with descriptive data. Fortunately, all address data is at the beginning of the cell which makes things easier. With the help of co-pilot, I was able to get a formula that extracts the address (looks for the any of the quadrants and for most instances).

 

Co-pilot query (couldn't find history, so the search was along the lines of):

1) in excel, how do I extract text if there is "e & " or "w & " from a cell

2) adjust the formula so that it extracts information to end at NE, NW, SE, or SW after the "e & " or "w & "

 

=IF(OR(ISNUMBER(FIND(" NW", C2)), ISNUMBER(FIND(" NE", C2)), ISNUMBER(FIND(" SW", C2)), ISNUMBER(FIND(" SE", C2))),
IF(ISNUMBER(FIND("&", C2)),
LEFT(C2, MIN(IFERROR(FIND(" NW", C2, FIND(" NW", C2) + 1), LEN(C2) + 1), IFERROR(FIND(" NE", C2, FIND(" NE", C2) + 1), LEN(C2) + 1), IFERROR(FIND(" SW", C2, FIND(" SW", C2) + 1), LEN(C2) + 1), IFERROR(FIND(" SE", C2, FIND(" SE", C2) + 1), LEN(C2) + 1)) + 2),
LEFT(C2, MIN(IFERROR(FIND(" NW", C2), LEN(C2) + 1), IFERROR(FIND(" NE", C2), LEN(C2) + 1), IFERROR(FIND(" SW", C2), LEN(C2) + 1), IFERROR(FIND(" SE", C2), LEN(C2) + 1)) + 2)
),"")

 

The formula will look extract the string of data from the beginning of the cell to the first instance of the quadrant.

 

Issues

As mentioned, this will omit any instances where the quadrant happens to fall on N or S. A few times this formula fails is due to bad data entry (case sensitive). I'm not sure if there is anything can be done about it.

 

The main issue is some addresses are entered as intersections, so it will be ABC St NW & XYZ Ave NW. The formula looks for the first instance of the quadrant so the returning values only gives the first half of the intersection.

 

AddressFormula ReturnsResults
123 Fake St NE123 Fake St NEWorks
ABC St NW & XYZ Ave NWABC St NWMissing full intersection

 

Request:

Can this be done without VBA? Looking for formulas.

 

Is there a formula that can pull the intersection and the address? I thought of several ways to approach this like using wild cards (*) if there is an & in the text. This cell has the address and description, so sometimes in the description, the quadrant will be referenced later in the cell or you have a word that happens to end in NE. This is what caused me to focus on trying to use some logic when asking co-pilot. For example,
- look for the "E & " or "W & " (NE, NW, SE, or SW)

- look for & plus pairs of quadrants (intersections will usually be int he same quadrant)

 

Ideally, I have one formula that address both cases that might exist in dataset. If I need to "split" my data up and apply formula's separately, so be it... it's better than all manual (although it would be nice to fix the data at the data entry level).

 

Next Steps:

If a formula can be suggested along with the Data Transformation in Power Query, that would be great.  If not, hopefully someone can help me with the Excel formula first. Thanks!

  • RecycleBin_Rob 

    Playtime!  OK, I confess this is not my forte, but here is a start at a regular expression but with the regex hidden.

    all and compassPts are just defined names that return textPosLookAhd is a Lambda function.

     

    posLookAhd = LAMBDA(x,"(?=" & x & ")" )

     

    Such a function is not exactly in the spirit of REGEX!  ... but kind of fun.

     

  • RecycleBin_Rob 

    It's better to have samples of possible text variants from which to extract address. With some assumptions with Power Query it could be

    query is

    let
        Source = Excel.CurrentWorkbook(){[Name="source"]}[Content],
        quadrants = {" NW", " NE", " SW", " SE"},
        GetAddress = Table.AddColumn(
            Source,
            "Address",
            each Text.Start( [Text] ,
                List.Max(
                    List.Transform( quadrants, (q) => Text.PositionOf([Text], q, Occurrence.Last, Comparer.OrdinalIgnoreCase) )
                    ) + 3
                )
        )
    in
        GetAddress
  • JKPieterse's avatar
    JKPieterse
    Silver Contributor

    RecycleBin_Rob If you prefer a formula, this one seems to work for the two-character quadrants and for the two examples you gave:

    =LET(Q,{" NW"," SE"," NE"," SW"},countOfQ,(LEN(A2)-MIN(LEN(SUBSTITUTE(A2,Q,""))))/3,TEXTBEFORE(A2,Q,countOfQ))
    • RecycleBin_Rob's avatar
      RecycleBin_Rob
      Copper Contributor
      Thanks! Unfortunately, I couldn't get it to work. I'll play around with it though. The number of intersections I need to work with are only a handful, so I won't prioritize this portion yet. Thanks!
  • RecycleBin_Rob 

    Updated a bit

    as

     

    let
        Source = Excel.CurrentWorkbook(){[Name="source"]}[Content],
        quadrants = {" NW", " NE", " SW", " SE", " N", " E", " S", " W"},
    
        fn = (str as text, add as text) =>
            [
                position = List.Max(
                    List.Transform(
                        quadrants,
                        (q) => [ p=Text.PositionOf(
                            str,
                            q & add,
                            Occurrence.Last,
                            Comparer.OrdinalIgnoreCase
                        ),
                        e = if p < 0 then p else p + Text.Length(q)
                        ][e]
                    )
                ),
                length =    if      position < 0
                            then    Text.Length(str)
                            else    position
        ][length],
    
    
        GetAddress = Table.AddColumn(
            Source,
            "Address",
            each [
                sepMid  = fn([Text], " "),
                sep     = fn([Text], "" ),
                sepEnd  = List.Min({ sepMid, sep }),
                address = Text.Start( [Text], sepEnd)
            ][address]
        )
    in
        GetAddress

     

    • RecycleBin_Rob's avatar
      RecycleBin_Rob
      Copper Contributor
      Thanks so much for making the adjustments! Next time I will know to provide more dummy data to help illustrate my cases. I'm still developing my Power BI skills so I'm hoping you can offer some more guidance to a rookie when using this formula.

      The Excel file isn't actually the source data and allowed me an easier way to see the data as of right now.
      Source: Oracle
      Navigation: SYNERGY{[Name="WORK_ORDER"]}
      Column (with address): [WORK_DESC]

      I am not strong at querying and am mistyping something that is leading to errors:
      Source = Oracle.Database("Oracle",SYNERGY{[Name="WORK_ORDER"]}[WORK_DESC]),

      Any further assistance is greatly appreciated, and if you need more information, please let me know.

Resources