JakeArmstrong
Here is the link to the data which is a web page containing delimited text file (";" as delimiter). I converted that to excel using power query.
Link to text file:
https://portal.amfiindia.com/DownloadNAVHistoryReport_Po.aspx?frmdt=21-May-2024
I copied and pasted text to excel and converted to a table named Table25.
Below is the text of my power query:
let
Source = Excel.CurrentWorkbook(){[Name="Table25"]}[Content],
#"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByDelimiter(";", QuoteStyle.None), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5", "Column1.6", "Column1.7", "Column1.8"}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type text}, {"Column1.3", type text}, {"Column1.4", type text}, {"Column1.5", type text}, {"Column1.6", type text}, {"Column1.7", type text}, {"Column1.8", type text}}),
#"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"Scheme Code", type text}, {"Scheme Name", type text}, {"ISIN Div Payout/ISIN Growth", type text}, {"ISIN Div Reinvestment", type text}, {"Net Asset Value", type number}, {"Repurchase Price", type text}, {"Sale Price", type text}, {"Date", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type1", "Scheme Type", each if [Scheme Code] is null then [Scheme Code] else if (Text.Contains([Scheme Code],"Open Ended") or Text.Contains([Scheme Code],"Close Ended")) then [Scheme Code] else null),
#"Filled Down" = Table.FillDown(#"Added Custom",{"Scheme Type"}),
#"Reordered Columns" = Table.ReorderColumns(#"Filled Down",{"Scheme Code", "Scheme Type", "Scheme Name", "ISIN Div Payout/ISIN Growth", "ISIN Div Reinvestment", "Net Asset Value", "Repurchase Price", "Sale Price", "Date"}),
#"Removed Columns" = Table.RemoveColumns(#"Reordered Columns",{"Scheme Code"}),
#"Removed Top Rows" = Table.Skip(#"Removed Columns",1),
#"Filtered Rows" = Table.SelectRows(#"Removed Top Rows", each ([Scheme Name] <> null))
in
#"Filtered Rows"
As an additional information, after data is split into columns with above PQ, I tried the new function on the column named "Scheme Type"
Repeating my function again here:
=REGEXEXTRACT(PQ___Python_Combo___2021[Scheme Type],AJ6,1,1) [AJ6 contains the pattern: ((?!(equity|hybrid|Solution Oriented|FOF|elss)).)*$]
Pattern used for excel function does not seem to cover "^" meaning beginning of the string in regular expression. So I had to delete "^", else the function produced error.
The comparative Python pattern is:
(?i)^((?!(equity|hybrid|Solution Oriented|FOF|elss)).)*$ - where (?i) is for ignoring case sensitivity and ^ for start of the string. This pattern correctly returns data using python library re and that UDF was called via xlwings. Other details about that UDF can be found in my above responses.
Seems, excel is yet to support lookaheads / lookbehinds - Positive and Negative - (?=, ?!, ?<=, ?<!, etc.) or am I missing something?