lookup

Copper Contributor
Below is a sample of my data.  For each customer there is a 6 digit NAICS Code and an Industry Description  
       
       
NAICS CodeIndustry Description     
317910Abrasive Product Manufacturing     
423430Computer and Computer Peripheral Equipment and Software Merchant Wholesalers   
423510Metal Service Centers and Other Metal Merchant Wholesalers    
       
       
       
Below is a sample of NAICS Codes sourced from SBA.     
They are either 3, 4, 5, or 6 digits long.     
It is considered a match if the first 3, 4, 5, or 6 digits in my NAICS sourced data (below)    
match my sample customer data.     
       
External NAICS CodeExternal Industry Description     
211oll and gass estraction     
42311Automobile and other motor vehicle Merchant Wholesalers     
4235Metal and Mineral Merchant Wholesalers     
       
       
I would expect the result to be:     
       
       
NAICS CodeIndustry DescriptionExternal NAICS CodeExternal Industry Description 
317910Abrasive Product Manufacturing     
423430Computer and Computer Peripheral Equipment and Software Merchant Wholesalers42311Automobile and other motor vehicle Merchant Wholesalers
423510Metal Service Centers and Other Metal Merchant Wholesalers4235Metal and Mineral Merchant Wholesalers
       
 Can you help automate this logic?     
3 Replies

@redridgie  In the attached I copied your message text/example and my suggested formula:

=FILTER($A$17:$B$19,IFERROR(SEARCH($A$17:$A$19,A26),0)=1,"na")

my suggested formula does NOT agree with you sample results as I do NOT agree with your sample results.  the category 423430 is not a  subset of 42311 and the descriptions are not compatible.  Note that this formula may pull more than 1 result so you may want a variation like:

=INDEX(SORT(FILTER($A$17:$B$19,IFERROR(SEARCH($A$17:$A$19,A26),0)=1,"na"),,-1),1,)

to get the longest matching code.

 

Thank you very much! Can you walk me through the formula as I don't understand what it is doing.

sure. As with all formulas you start in the middle most () so:
FILTER($A$17:$B$19,IFERROR(SEARCH($A$17:$A$19,A26),0)=1,"na")
this FILTER will filter the rows in A17:B19 if the condition is true and that condition is:

IFERROR(SEARCH($A$17:$A$19,A26),0)=1

which is checking if it can find (i.e. search) the code from column A (A17:A19) inside this row's code value (A26) and it must be found starting at position 1 because a code like 123 we don't want to "match" inside a code 45123.  the IFERROR is to catch when the code is NOT found and returns a 0 and hence a false.
So that filter will return all the rows from the code table that have codes that "match" and then that list is SORT from biggest to smallest
and finally it takes the 1st row (i.e. INDEX(..., 1, ) ). note we could also have used TAKE(... , 1) instead