SOLVED

Extracting data out of a String

Copper Contributor

Hi All, I'm after a simpler way of extracting data out of a string:

 

My dataset has a column that looks like this for example (1,000's of rows)

Expense
15AUG23 63.74L DSL FORD PICKUP
15FEB24 HAYES/C 55.57L UNL
15FEB24 LEE/L 43.39L UNL
15JAN24 HAYES/C 55.56L UNL
15JAN24 WILLCOX/R 19.173GAL DSL
15JUL23 100,84L UNL BENKE/S
15JUL23 32,10L UNL GAJDICS/M
15JUL23 32,46L UNL LINES/J SD
15JUL23 40,05L DSL SZEPESI/V
15JUL23 40,39 UNL VIRAG/B

 

I need to extract the fuel type (UNL = Unleaded & DSL = Diesel) and the amount (63.74L for example) into two extra columns so it looks like the below:

ExpenseFuel TypeAmount
15AUG23 63.74L DSL FORD PICKUPDSL63.74L
15FEB24 HAYES/C 55.57L UNLUNL55.57L

 

I did the below for Fuel Type but I couldn't work out how to incorporate the UNL, I just ran it, sorted the data set and then changed DSL to UNL after I pasted special values DSL.

 

=IF(ISNUMBER(SEARCH("DSL",N599)),"DSL","")

 

For the second part I copied the column to a new sheet split text to columns on space then deleted everything irrelevant from the rows, reformed using CONCATENATE and ran a VLOOKUP to return the data to the original sheet but it took me about 20 mins to check manually and there is a chance I might have missed something. 

 

Description12345Final Value
02FEB23 LUPI/D 40.19L DSL  40.19L  40.19L
17FEB23 LUPI/D 40.33L DSL  40.33L  40.33L
16MAR23 LUPI/D 43.17L UNL  43.17L  43.17L
12APR23 LUPI/D 7.607GAL UNL FUEL  7.607GAL  7.607GAL
22APR23 LUPI/D 13.223GAL UNL FUEL  13.223GAL  13.223GAL
27APR-08MAY23 LUPI/D FUEL 17.094GAL UNL   17.094GAL 17.094GAL
13MAY23 LUPI/D FUEL 51.35L UNL   51.35L 51.35L
02JUL23 LUPI/D FUEL 31.12L UNL   31.12L 31.12L



Is there a better way to do this?

 

7 Replies
best response confirmed by JamesDLG (Copper Contributor)
Solution

@JamesDLG 

 

dscheikey_0-1718720857366.png

=IFS(ISNUMBER(SEARCH("DSL",A2)),"DSL",ISNUMBER(SEARCH("UNL",A2)),"UNL",TRUE,"")
and
=TAKE(TEXTSPLIT(TEXTBEFORE(A2,{" DSL";" UNL"})," "),1,-1)

or
=IFS(ISNUMBER(SEARCH("DSL",A2)),"DSL",ISNUMBER(SEARCH("UNL",A2)),"UNL",TRUE,"")
and
=VALUE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TAKE(TEXTSPLIT(TEXTBEFORE(A2,{" DSL";" UNL"})," "),1,-1),"GAL",""),"L",""),",","."))
and
=IFS(ISNUMBER(SEARCH("GAL",TAKE(TEXTSPLIT(TEXTBEFORE(A2,{" DSL";" UNL"})," "),1,-1))),"GAL",ISNUMBER(SEARCH("L",TAKE(TEXTSPLIT(TEXTBEFORE(A2,{" DSL";" UNL"})," "),1,-1))),"L",TRUE,"")
O wow, thank you very much for this! What software did you write that in please?

@JamesDLG 

I don't understand the question.

EXCEL

BrainSoft

 

Do I just put that into the formula bar?

@JamesDLG 

 

Take a look at the example file. All the formulae can be found there. If you have a different language environment, these will be reformatted and rewritten for you.

If you do not use the "." for decimal places, you must change the formula.

 

Ok perfect, thanks so much for your help!

@JamesDLG 

To play with Excel Beta functions

image.png

=REGEXEXTRACT(B3:B12,
   {"(?<=\s)(DSL|UNL)(\s|\b)","(?<=\s)\d(.*)?(?=\s(DSL|UNL))"} )
1 best response

Accepted Solutions
best response confirmed by JamesDLG (Copper Contributor)
Solution

@JamesDLG 

 

dscheikey_0-1718720857366.png

=IFS(ISNUMBER(SEARCH("DSL",A2)),"DSL",ISNUMBER(SEARCH("UNL",A2)),"UNL",TRUE,"")
and
=TAKE(TEXTSPLIT(TEXTBEFORE(A2,{" DSL";" UNL"})," "),1,-1)

or
=IFS(ISNUMBER(SEARCH("DSL",A2)),"DSL",ISNUMBER(SEARCH("UNL",A2)),"UNL",TRUE,"")
and
=VALUE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TAKE(TEXTSPLIT(TEXTBEFORE(A2,{" DSL";" UNL"})," "),1,-1),"GAL",""),"L",""),",","."))
and
=IFS(ISNUMBER(SEARCH("GAL",TAKE(TEXTSPLIT(TEXTBEFORE(A2,{" DSL";" UNL"})," "),1,-1))),"GAL",ISNUMBER(SEARCH("L",TAKE(TEXTSPLIT(TEXTBEFORE(A2,{" DSL";" UNL"})," "),1,-1))),"L",TRUE,"")

View solution in original post