Help with Formula/ Power query to get the number of matching row counts of codes

Contributor
Hi All,

Hope you are doing well!... I have the input data at a ctextid and a vbillid level...I also have the list of codes input by a human (agentcode) and the number of codes input by an automated algorithm (predicted code)... Now for all the individual codes in the agent code column , I need to find the count of rows in the agentcode column that have the specific code and the count of rows in the predict code column that have matching codes for the specific set of ctextid and vbillid ....Please find attached with the input tab and the needed output tab..Can you please help me here..

Thanks,
Arun
 
11 Replies

 

Capture.JPG

@Arun Chandramouli 

 

I did it with power query, but for some reason it doesn´t allow me to attach an excel file.

I can copy and paste the steps in the advance editor if it works for you or if you have an email I can send you the file.

 

@Arun Chandramouli 

 

Sorry, I was able to attach the excel file. 

 

It´s weird, if I have it in my desktop it showed me an error but now it was inside a folder Downloads and it works.

@Arun Chandramouli 

One more PQ variant is in attached file

Hi Sergei, Really appreciate your response...The only problem with this is that the percentages are going beyond 100 when the number of times the code is present in predictedcode is greater than the number of times the code is present in the agentcode...So the logic that I am looking to build is that for a unique code in the agentcode (Number of encounters that have the matching code in predictcode for the agent code)/ total number of encounters with code in the agent code column.. An encounter is a combination of ctext id and visitbillid... For example consider a agentcode C102 for an encounter (ctextid:3421 vbillid : 231) if the same code is present in predictcode for the same encounter(ctextid:3421 vbillid : 231 then the matching count is 1 ..(Numerator -> Number of matching counts for the code/ Denominator -> Number of times the code appears in the agentcode column)..Basically I am trying to calculate the precision for a particular agent code
Hi @alannavarro, Really appreciate your response...The only problem with this is that the percentages are going beyond 100 when the number of times the code is present in predictedcode is greater than the number of times the code is present in the agentcode...So the logic that I am looking to build is that for a unique code in the agentcode (Number of encounters that have the matching code in predictcode for the agent code)/ total number of encounters with code in the agent code column.. An encounter is a combination of ctext id and visitbillid... For example consider a agentcode C102 for an encounter (ctextid:3421 vbillid : 231) if the same code is present in predictcode for the same encounter(ctextid:3421 vbillid : 231 then the matching count is 1 ..(Numerator -> Number of matching counts for the code/ Denominator -> Number of times the code appears in the agentcode column)..Basically I am trying to calculate the precision for a particular agent code

@Arun Chandramouli 

Could you please generate sample file for testing of such behaviour. In current one I see nothing > 100% with current PQ logic as well as no agentcode C102 and ctextid:3421 you mentioned.

@Sergei Baklan :Please find attached the input file where I have modified the input data and also updated the expected output data ; along with the output data from the existing code...Can you please help me here..

 

Thanks,

Arun

@Arun Chandramouli 

Thank you for the sample.

 

Perhaps like this

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    agentList = Table.AddColumn(
        Source,
        "agentlist",
        each List.Transform(
            Text.Split([agentcode], "," ),
            each Text.Trim(_) )),
    predictList = Table.AddColumn(
        agentList,
        "predictlist",
        each List.Transform(
            Text.Split([predictcode], "," ),
            each Text.Trim(_) )),
    #"Expanded agentlist" = Table.ExpandListColumn(
        predictList,
        "agentlist"),
    matching = Table.AddColumn(#"Expanded agentlist",
        "Total Matching",
        each fnCount( [predictlist], [agentlist] )),
    #"Removed Other Columns" = Table.SelectColumns(
        matching,
        {"agentlist", "Total Matching"}),
    #"Grouped Rows" = Table.Group(
        #"Removed Other Columns",
        {"agentlist"},
        {
            {"total number", each Table.RowCount(_), Int64.Type},
            {"total matching", each List.Max([Total Matching]), type number}}),
    percent = Table.AddColumn(
        #"Grouped Rows",
        "% matching",
        each [total matching] / [total number], Percentage.Type)
in
    percent

with function

(lst as list, v as any) =>
let
    Source = List.Count( List.Select( lst, each _ = v) )
in
    Source

Hi @Sergei Baklan....Really appreciate you getting back...Please find attached the example in which it is failing...I have highlighted the same in the output tab...

In the file of Sergei Baklan, I think that you just need to change the step "Grouped Rows" in the "Table" query. It should work.
Right now
{"total matching", each List.Max([Total Matching]), type number}})
Change it for
{"total matching", each List.Sum([Total Matching]), type number}})

@alannavarro , @Arun Chandramouli 

Perhaps, thank you. But when U071 gives 57%, however 14% is marked as correct result.