Jul 09 2020 10:26 AM - edited Jul 10 2020 04:38 AM
Hey all,
I Have records with a string that can have 1 ore more substring in surrounded by brackets.
I just want those substrings, the rest i want to toss.
Any handy tips or functions i can use?
In vba it's simple:
[Quick and dirty looping through the whole string]
Function Return_PartialString(ByVal ls_string As String) As String
'simple loop through string and get what we want from it.
'decl vars
Dim lb_wanted As Boolean
Dim ls_newstring As String
'init vars
lb_wanted = False
ls_newstring = ""
For i = 1 To Len(ls_string)
If lb_wanted = True Then
ls_newstring = ls_newstring & Mid(ls_string, i, 1)
End If
If Mid(ls_string, i, 1) = "{" Then
lb_wanted = True
End If
If Mid(ls_string, i, 1) = "}" Then
lb_wanted = False
End If
Next i
ls_newstring = Replace(ls_newstring, "}", " ")
Return_PartialString = ls_newstring
End Function
[Slightly more neat using a split function]
Function Return_PartialString_Strict(ByVal ls_string As String) As String
'simple loop through string and get what we want from it.
'decl vars
Dim ls_newstring As String
Dim las_string() As String
'init vars
ls_newstring = ""
las_string = Split(ls_string, "}")
For i = LBound(las_string) To UBound(las_string)
If InStr(1, las_string(i), "{", vbTextCompare) > 0 Then
ls_newstring = Replace(ls_newstring & Mid(las_string(i), InStr(1, las_string(i), "{", vbTextCompare) + 1), " ", "_") & "#"
End If
Next i
ls_newstring = Replace(ls_newstring, "#", " ")
Return_PartialString_Strict = ls_newstring
End Function
I am hoping there is something like this i can do in power query? Other then dozens of nested if you escape when there are none left or something.
Edit: Put both into a code block for more readability.
Jul 09 2020 05:05 PM
It could be done from user interface without coding:
- split column by right most opening bracket
- split second column by left most opening bracket
- remove other but middle column
Generate script will be like
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Split Column by Delimiter" = Table.SplitColumn(Source, "A", Splitter.SplitTextByEachDelimiter({"{"}, QuoteStyle.Csv, true), {"A.1", "A.2"}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Split Column by Delimiter", "A.2", Splitter.SplitTextByEachDelimiter({"}"}, QuoteStyle.Csv, false), {"A.2.1", "A.2.2"}),
#"Removed Other Columns" = Table.SelectColumns(#"Split Column by Delimiter1",{"A.2.1"})
in
#"Removed Other Columns"
Jul 10 2020 04:34 AM
Thanks for your answer. I'm going to try that as well.
I found a function to do this, but if your method works it's probably easier and better.
I'll implement it when I have the time. Thank you!
let func =
(InputText as text, StartDelimiter as text, EndDelimiter as text, optional Alternative_Dummy as text) =>
let
// If the end delimiter contains the same characters than the start delimiter then that portion has to be replaced in the end parameter by a dummy. Otherwise the query fails.
AlternativeDummy = Alternative_Dummy,
Dummy = if AlternativeDummy is null then "*+*" else AlternativeDummy,
String = InputText,
ParaStart = StartDelimiter,
ParaEnd = EndDelimiter,
// If the delimiters are identical, a different logic has to be applied.
IdenticalDelimiters = ParaEnd = ParaStart,
// Start of logic for identical delimiters
SplitText = Text.Split(String, ParaStart),
ResultIdenticalDelimiters = List.Alternate(SplitText,1,1),
// Start of logic for different start- and end-delimiters: Create dummy end-delimiter.
DummyParaEnd = Text.Replace(ParaEnd, ParaStart, Dummy),
DummyString = Text.Replace(String, ParaEnd, DummyParaEnd),
SplitByParaStart = List.Skip(Text.Split(DummyString, ParaStart),1),
ConvertToTable = Table.FromList(SplitByParaStart, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
ExtractBeforeParaEnd = Table.AddColumn(ConvertToTable, "Result", each Text.BeforeDelimiter([Column1], DummyParaEnd)),
Cleanup = Table.RemoveColumns(ExtractBeforeParaEnd,{"Column1"})[Result],
Result = if Text.Contains(String, ParaStart) and Text.Contains(String, ParaEnd)
then (if IdenticalDelimiters
then ResultIdenticalDelimiters
else Cleanup)
else error Error.Record( "At least one of the delimiters is not found in the string" )
in
Result ,
documentation = [
Documentation.Name = " Text.BetweenDelimitersOccAll ",
Documentation.Description = " Returns all occurrances of text strings between a delimiter pair. ",
Documentation.LongDescription = " Returns all occurrances of text strings between a delimiter pair. Optional parameter Alternative_Dummy can be used if Dummy shall not be *+*. ",
Documentation.Category = " Text ",
Documentation.Source = " https://wp.me/p6lgsG-Y5 ",
Documentation.Version = " 1.0 ",
Documentation.Author = " Imke Feldmann: www.TheBIccountant.com: https://wp.me/p6lgsG-Y5 . ",
Documentation.Examples = {[Description = " See this blogpost: https://wp.me/p6lgsG-Y5 ",
Code = " TextBetweenDelimitersOccAll(""Meanwhile I prefer *Python!* to *R!*"", ""*"", ""!*"") ",
Result = " {""Python"", ""R""} "]}]
in
Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation))
Jul 10 2020 04:42 AM
I responded, but i don't see it appear.. so another response than... :)
I found a function in M to do it, if your solution does the same it's probably easier and maybe performs better. I will check it out when i have the time and i'll let you know. Thanks!
let func =
(InputText as text, StartDelimiter as text, EndDelimiter as text, optional Alternative_Dummy as text) =>
let
// If the end delimiter contains the same characters than the start delimiter then that portion has to be replaced in the end parameter by a dummy. Otherwise the query fails.
AlternativeDummy = Alternative_Dummy,
Dummy = if AlternativeDummy is null then "*+*" else AlternativeDummy,
String = InputText,
ParaStart = StartDelimiter,
ParaEnd = EndDelimiter,
// If the delimiters are identical, a different logic has to be applied.
IdenticalDelimiters = ParaEnd = ParaStart,
// Start of logic for identical delimiters
SplitText = Text.Split(String, ParaStart),
ResultIdenticalDelimiters = List.Alternate(SplitText,1,1),
// Start of logic for different start- and end-delimiters: Create dummy end-delimiter.
DummyParaEnd = Text.Replace(ParaEnd, ParaStart, Dummy),
DummyString = Text.Replace(String, ParaEnd, DummyParaEnd),
SplitByParaStart = List.Skip(Text.Split(DummyString, ParaStart),1),
ConvertToTable = Table.FromList(SplitByParaStart, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
ExtractBeforeParaEnd = Table.AddColumn(ConvertToTable, "Result", each Text.BeforeDelimiter([Column1], DummyParaEnd)),
Cleanup = Table.RemoveColumns(ExtractBeforeParaEnd,{"Column1"})[Result],
Result = if Text.Contains(String, ParaStart) and Text.Contains(String, ParaEnd)
then (if IdenticalDelimiters
then ResultIdenticalDelimiters
else Cleanup)
else error Error.Record( "At least one of the delimiters is not found in the string" )
in
Result ,
documentation = [
Documentation.Name = " Text.BetweenDelimitersOccAll ",
Documentation.Description = " Returns all occurrances of text strings between a delimiter pair. ",
Documentation.LongDescription = " Returns all occurrances of text strings between a delimiter pair. Optional parameter Alternative_Dummy can be used if Dummy shall not be *+*. ",
Documentation.Category = " Text ",
Documentation.Source = " https://wp.me/p6lgsG-Y5 ",
Documentation.Version = " 1.0 ",
Documentation.Author = " Imke Feldmann: www.TheBIccountant.com: https://wp.me/p6lgsG-Y5 . ",
Documentation.Examples = {[Description = " See this blogpost: https://wp.me/p6lgsG-Y5 ",
Code = " TextBetweenDelimitersOccAll(""Meanwhile I prefer *Python!* to *R!*"", ""*"", ""!*"") ",
Result = " {""Python"", ""R""} "]}]
in
Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation))
Jul 11 2020 01:51 PM
SolutionPerhaps I misunderstood your requirements, initial code returns only very internal string.
If all substrings are to be returned, you may
1) Use almost UI only solutions - split using variant of splitter with Any, list delimiters {"{","}"} and with some gap on number of columns; unpivot all columns of the result
2) Use formula which splits all texts into the lists and expand them
let
Source = Table1,
splitTexts = Table.AddColumn(
Source,
"Custom",
each List.RemoveMatchingItems(Text.SplitAny([A],"{}"),{""})
),
expandSubstrings = Table.ExpandListColumn(
splitTexts,
"Custom")
in
expandSubstrings
3) Use Imke's function which do practically the same, but is more universal solution
All 3 variants are in attached file.
Jul 11 2020 01:55 PM
Jul 11 2020 01:51 PM
SolutionPerhaps I misunderstood your requirements, initial code returns only very internal string.
If all substrings are to be returned, you may
1) Use almost UI only solutions - split using variant of splitter with Any, list delimiters {"{","}"} and with some gap on number of columns; unpivot all columns of the result
2) Use formula which splits all texts into the lists and expand them
let
Source = Table1,
splitTexts = Table.AddColumn(
Source,
"Custom",
each List.RemoveMatchingItems(Text.SplitAny([A],"{}"),{""})
),
expandSubstrings = Table.ExpandListColumn(
splitTexts,
"Custom")
in
expandSubstrings
3) Use Imke's function which do practically the same, but is more universal solution
All 3 variants are in attached file.