Forum Discussion
Extract Address from Cell Text
Background:
I'm working with address data that uses the quadrant system (ends with NW, NE, SW, or SE). There are only very few instances where it ends with N or S.
I have a column of data that starts with an address and is combined with descriptive data. Fortunately, all address data is at the beginning of the cell which makes things easier. With the help of co-pilot, I was able to get a formula that extracts the address (looks for the any of the quadrants and for most instances).
Co-pilot query (couldn't find history, so the search was along the lines of):
1) in excel, how do I extract text if there is "e & " or "w & " from a cell
2) adjust the formula so that it extracts information to end at NE, NW, SE, or SW after the "e & " or "w & "
=IF(OR(ISNUMBER(FIND(" NW", C2)), ISNUMBER(FIND(" NE", C2)), ISNUMBER(FIND(" SW", C2)), ISNUMBER(FIND(" SE", C2))),
IF(ISNUMBER(FIND("&", C2)),
LEFT(C2, MIN(IFERROR(FIND(" NW", C2, FIND(" NW", C2) + 1), LEN(C2) + 1), IFERROR(FIND(" NE", C2, FIND(" NE", C2) + 1), LEN(C2) + 1), IFERROR(FIND(" SW", C2, FIND(" SW", C2) + 1), LEN(C2) + 1), IFERROR(FIND(" SE", C2, FIND(" SE", C2) + 1), LEN(C2) + 1)) + 2),
LEFT(C2, MIN(IFERROR(FIND(" NW", C2), LEN(C2) + 1), IFERROR(FIND(" NE", C2), LEN(C2) + 1), IFERROR(FIND(" SW", C2), LEN(C2) + 1), IFERROR(FIND(" SE", C2), LEN(C2) + 1)) + 2)
),"")
The formula will look extract the string of data from the beginning of the cell to the first instance of the quadrant.
Issues
As mentioned, this will omit any instances where the quadrant happens to fall on N or S. A few times this formula fails is due to bad data entry (case sensitive). I'm not sure if there is anything can be done about it.
The main issue is some addresses are entered as intersections, so it will be ABC St NW & XYZ Ave NW. The formula looks for the first instance of the quadrant so the returning values only gives the first half of the intersection.
Address | Formula Returns | Results |
123 Fake St NE | 123 Fake St NE | Works |
ABC St NW & XYZ Ave NW | ABC St NW | Missing full intersection |
Request:
Can this be done without VBA? Looking for formulas.
Is there a formula that can pull the intersection and the address? I thought of several ways to approach this like using wild cards (*) if there is an & in the text. This cell has the address and description, so sometimes in the description, the quadrant will be referenced later in the cell or you have a word that happens to end in NE. This is what caused me to focus on trying to use some logic when asking co-pilot. For example,
- look for the "E & " or "W & " (NE, NW, SE, or SW)
- look for & plus pairs of quadrants (intersections will usually be int he same quadrant)
Ideally, I have one formula that address both cases that might exist in dataset. If I need to "split" my data up and apply formula's separately, so be it... it's better than all manual (although it would be nice to fix the data at the data entry level).
Next Steps:
If a formula can be suggested along with the Data Transformation in Power Query, that would be great. If not, hopefully someone can help me with the Excel formula first. Thanks!
- PeterBartholomew1Silver Contributor
Playtime! OK, I confess this is not my forte, but here is a start at a regular expression but with the regex hidden.
all and compassPts are just defined names that return text. PosLookAhd is a Lambda function.
posLookAhd = LAMBDA(x,"(?=" & x & ")" )
Such a function is not exactly in the spirit of REGEX! ... but kind of fun.
- RecycleBin_RobCopper ContributorThanks so much! You introduced me to topics (Parameter Query and Regular Expression) I didn't even know existed, and once I develop a better understanding, I can hopefully modify them for my uses.
Perhaps we may simplify to
= REGEXEXTRACT( source[Text], "(?xi) (.*\b( (n|s)?(w|e) | (n|s) )\b) | (.*)" )
- PeterBartholomew1Silver Contributor
I hadn't come across the case-insensitive modifier but then, there is much else that I haven't come across!
It's better to have samples of possible text variants from which to extract address. With some assumptions with Power Query it could be
query is
let Source = Excel.CurrentWorkbook(){[Name="source"]}[Content], quadrants = {" NW", " NE", " SW", " SE"}, GetAddress = Table.AddColumn( Source, "Address", each Text.Start( [Text] , List.Max( List.Transform( quadrants, (q) => Text.PositionOf([Text], q, Occurrence.Last, Comparer.OrdinalIgnoreCase) ) ) + 3 ) ) in GetAddress
- JKPieterseSilver Contributor
RecycleBin_Rob If you prefer a formula, this one seems to work for the two-character quadrants and for the two examples you gave:
=LET(Q,{" NW"," SE"," NE"," SW"},countOfQ,(LEN(A2)-MIN(LEN(SUBSTITUTE(A2,Q,""))))/3,TEXTBEFORE(A2,Q,countOfQ))
- RecycleBin_RobCopper ContributorThanks! Unfortunately, I couldn't get it to work. I'll play around with it though. The number of intersections I need to work with are only a handful, so I won't prioritize this portion yet. Thanks!
Updated a bit
as
let Source = Excel.CurrentWorkbook(){[Name="source"]}[Content], quadrants = {" NW", " NE", " SW", " SE", " N", " E", " S", " W"}, fn = (str as text, add as text) => [ position = List.Max( List.Transform( quadrants, (q) => [ p=Text.PositionOf( str, q & add, Occurrence.Last, Comparer.OrdinalIgnoreCase ), e = if p < 0 then p else p + Text.Length(q) ][e] ) ), length = if position < 0 then Text.Length(str) else position ][length], GetAddress = Table.AddColumn( Source, "Address", each [ sepMid = fn([Text], " "), sep = fn([Text], "" ), sepEnd = List.Min({ sepMid, sep }), address = Text.Start( [Text], sepEnd) ][address] ) in GetAddress
- RecycleBin_RobCopper ContributorThanks so much for making the adjustments! Next time I will know to provide more dummy data to help illustrate my cases. I'm still developing my Power BI skills so I'm hoping you can offer some more guidance to a rookie when using this formula.
The Excel file isn't actually the source data and allowed me an easier way to see the data as of right now.
Source: Oracle
Navigation: SYNERGY{[Name="WORK_ORDER"]}
Column (with address): [WORK_DESC]
I am not strong at querying and am mistyping something that is leading to errors:
Source = Oracle.Database("Oracle",SYNERGY{[Name="WORK_ORDER"]}[WORK_DESC]),
Any further assistance is greatly appreciated, and if you need more information, please let me know.- peiyezhuBronze Contributorhttps://www.oracletutorial.com/oracle-string-functions/oracle-regexp_like/
oracle support regular expression.
You can try it