Jan 27 2023 12:03 PM
Hello,
I need to replace various extraneous characters found in invoice numbers so I can compare based only on the letters or numbers.
I need to replace the following with nothing:
How can I best do this? I am not sure if there is a new function in excel that handles this or is it Replace or Substitute? Or?
thank you
I have attached a sample
Jan 27 2023 12:54 PM
SUBSTITUTE is the correct function, but you shouldn't repeat B2:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,"-",""),"/",""),"\",""),".",""),",",""),"(",""),")",""),"#","")," ","")
Jan 27 2023 01:06 PM - edited Jan 27 2023 01:11 PM
@Tony2021 Alternatively if you have Excel 365 try this:
=LET(in,A2,exclude,{",","/","\","-",".","(",")","#"," "},
CONCAT(TEXTSPLIT(in,exclude)))
Jan 27 2023 01:21 PM
A recursive SUBSTITUTE called 'Remove'
'Remove
=LAMBDA(cel,chars,IF(chars = "", cel, Remove(SUBSTITUTE(cel, LEFT(chars), ""), RIGHT(chars, LEN(chars) - 1))))
Sheet level formula:
=Remove(B2,"$%()/\|#-. ")
Jan 27 2023 04:39 PM - edited Jan 27 2023 04:40 PM
Hello Experts, thank you for the help! I like all the formulas. Remove is interesting indeed and I try to use it in my excel file and I get a #name? I dont see any code so not sure why it wont work in my file. I imagine the lamba is being used but I dont see where it is.
Also, I need to use this formula in a much larger formula and grateful for your assistance.
In the formula below, C11 is the invoice number that I need to strip I need to replace C11 with the formula.
I thought I could simply replace C11 with the formula but I dont think I can do that?
original formula:
FYI: D11 is the invoice amount.
=IF(TRIM(CLEAN(C11))="","",IF(ISNA(VLOOKUP(C11&D11,Project_Costs[[#All],[Invoice No Stripped]]&Project_Costs[[#Headers],[Invoice amount]],1,FALSE)),"NOT IN a DD","In Previous DD"))
I tried as follows:
using Remove:
=IF(TRIM(CLEAN(C11))="","",IF(ISNA(VLOOKUP(Remove(C11,"$%()/\|#-. ")&D11,Project_Costs[[#All],[Invoice No Stripped]]&Project_Costs[[#Headers],[Invoice amount]],1,FALSE)),"NOT IN a DD","In Previous DD"))
using substitute:
=IF(TRIM(CLEAN(C12))="","",IF(ISNA(VLOOKUP(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(C12,"-",""),"/",""),"\",""),".",""),",",""),"(",""),")",""),"#","")," ","")&D12,ProjectCostsPQ_FE.xlsx!Project_Costs[[#All],[Invoice '#]]&ProjectCostsPQ_FE.xlsx!Project_Costs[[#All],[Invoice amount]],1,FALSE)),"NOT IN a DD","In Previous DD"))
thank you very much.
I would post the file but there is a power query in it so I hope by giving the formula is enough information.
Jan 27 2023 08:17 PM
Jan 27 2023 11:19 PM
@Tony2021 Since you mentioned that you already use Power Query, I wonder if you perhaps can do the "lookups" within PQ as well. The code for removing the unwanted characters from the invoice numbers is quite easy. Though, you would not specify which characters to remove. In stead, you would explicitly tell PQ to keep only letters and numbers.
Add a custom column with the following code:
Text.Select([#"Invoice #"], {"a" .. "z", "A" .. "Z", "0" .. "9"})
It will look like this:
Jan 28 2023 07:03 AM - edited Jan 28 2023 07:15 AM
Hi Mtarler, thank you for the follow up. It seems to work but I noticed in my testing that I need to remove leading Zeros since my [Invoice No Stripped] strips out the leading zero's and I compare the 2 data sets (old data w 1000's of rows vs new data) and one has leading zeros and the other doesnt.
here is how I have it:
=IF(TRIM(CLEAN(C11))="","",IF(ISNA(VLOOKUP(
CONCAT(TEXTSPLIT(C11,{",","/","\","-",".","(",")","#"," "}))&D11,Project_Costs[[#All],[Invoice No Stripped]]&Project_Costs[[#All],[Invoice amount]],1,FALSE)),"NOT IN a DD","In Previous DD"))
Would you happen to know how to add that part to the above? thanks again for the expert help. Much appreciated.
Jan 28 2023 07:12 AM
Jan 28 2023 11:37 AM
Solution@Tony2021 If you know there will be at most x number you could use nested IF() statements to remove them but a recursive LAMBDA works really well. Go to Formulas -> Name Manager and add a NEW name called strip0 and give it this definition:
=lambda(n, if(left(n,1)="0",strip0(mid(n,2,len(n))),n))
As long as you are starting down this path why not go all the way and make a custom function to do it all for you. As shown in the next image is the name manager with 4 names:
strip0 as noted above will strip the preceding 0s off
punctuation is just a name for the array of characters to get removed
RemoveChars is the function I gave above that uses CONCAT and TEXTSPLIT to remove the characters
and
CleanUp puts them all together:
so now you can use:
=IF(TRIM(CLEAN(C11))="","",IF(ISNA(VLOOKUP(
CleanUp(C11)&D11,Project_Costs[[#All],[Invoice No Stripped]]&Project_Costs[[#All],[Invoice amount]],1,FALSE)),"NOT IN a DD","In Previous DD"))
Jan 28 2023 03:43 PM
mtarler, that is amazing. I have done my testing and it works! I have never used Lambda. Didnt even know it existed. You are very talented. Thank you very much for the help. I am very grateful. Have a good night.
Jan 29 2023 12:13 PM
Jan 29 2023 01:32 PM
Another formula but without the elegance of @mtarler 's recursive formula. It uses a side effect of TEXTSPLIT, namely it removes the separators.
= MAP(invoiceNo, LAMBDA(inv, CONCAT(TEXTSPLIT(inv, unwanted))))
where 'unwanted' refers to
= {"-";"\";"/";".";",";"(";")";"%";"$";"|";"#";" "}
The MAP/LAMBDA calculates cell by cell but returns the entire array of results.
Jan 29 2023 03:26 PM