Forum Discussion

Kralin's avatar
Kralin
Copper Contributor
Dec 07, 2023

Assistance with a formula please

Hello - I would be very grateful for help with a formula please.

Please see sample dataset (image below sorry couldn't see how to attach it!)

I need the formula to look at each individual ID and then to return a 'Y' in the row of the Alert with the most recent date (column D) if the ID has had:

Either two or more successive  instances of 'Did not attend'

Or two or more successive instances of 'did not attend'  and  'cancelled' (or variations)

Or two or more successive instances of 'cancelled' and 'cancelled' (or variations)

Or any variation of 'did not attend' or 'cancelled' as long as there are two or more

 

following an instance of 'attended on time' - and if there is no 'to be attended' set at a future date.

 

ID 1 has not attended but because they have a future date they don't need an alert

 

ID 2 needs an alert as they have had two instances of 'did not attend' or 'cancelled' since attending.

 

ID 3 has had two cancellations but has attended since then and so doesn't need an alert.

 

ID4 has only had cancellations so needs an alert

 

ID 5 has had a variety of cancellations since attending so would need an alert.

 

Please note this is sample data in real life the spreadsheet is many 1000s of rows long, so any formula would need to be robust enough to handle a lot of data.

 

I would only need to see 'Y' or 'N' on the row with the most recent date.

 

 

 

6 Replies

  • djclements's avatar
    djclements
    Bronze Contributor

    Kralin This is a pretty tall order for a dataset consisting of "many 1000s of rows". To help maintain an acceptable level of performance, the route I chose was to first generate a unique list of alerts by ID. I went through a few formula variations, the most efficient of which happened to be the longest:

     

    =LET(
    data, SORT(A2:C10001, {1,2}, {1,-1}),
    id, CHOOSECOLS(data, 1),
    next, id=DROP(VSTACK("", id), -1),
    num, SCAN(0, next, LAMBDA(v,n, IF(n, v+1, 1))),
    two, FILTER(data, num<3),
    status, CHOOSECOLS(two, 3),
    arr, FILTER(CHOOSECOLS(two, 1), (status="Did not attend")+(status="Cancelled")),
    unq, UNIQUE(arr),
    incl, BYROW(unq, LAMBDA(r, SUM(N(r=arr))=2)),
    FILTER(unq, incl))

     

    When tested with 10,000 rows of randomized data, there was a very slight hesitation (split second), but tolerable. With other methods I attempted, the lag was noticeably longer.

     

    With the above-mentioned formula entered in cell F2, for example, the formula to generate "Y" or "N" in the "Alert" column could be:

     

    =IF(A2=PreviousID, "", IF(COUNTIF($F$2#, A2), "Y", "N"))

     

    ...where the data is sorted by ID in ascending order, then by Date in descending order. Also, PreviousID is a defined name (in Name Manager) referring to =Sheet1!$A1 (relative row reference).

     

    Please see the attached workbook, if desired...

    • Kralin's avatar
      Kralin
      Copper Contributor

      djclementsthank you for taking the time and trouble to get back to me - much appreciated.

       

      I have tried your formula and I get a 'function not valid' error.

       

      I have a terrible feeling it may be because I'm using Excel Professional 2019, not a more recent version, and I should have mentioned that in the first instance. I'm terribly sorry.

       

      Given this restriction, can the formula you (kindly) suggested be amended in any way to fit this earlier version of Excel?

       

      Many thanks.

      • djclements's avatar
        djclements
        Bronze Contributor

        Kralin Yes, the formulas used in my first reply will only work with Excel for MS365. Please see the newly attached version, which should be compatible with Excel 2019. It uses similar logic, but with helper columns in a structured table in order to minimize the number of calculations necessary to get the desired result. It's not as robust as the MS365 version, but still performs alright with up to 5000 rows of data. If your dataset is larger than that, you may want to set Calculation Options to Manual while working with your data, then switch back to Automatic (or Calculate Now) to refresh the results.

         

        Again, the formulas will only work properly if the table has been sorted by ID from Smallest to Largest, then by Date from Newest to Oldest:

         

        Home > Sort & Filter > Custom Sort

         

        I hope that works out for you. Cheers!

  • Kralin The following formula should correctly set the alert flag based on the attendance. Took quite a while to come up with, I recorded a custom video on how to build and implement this formula: https://www.youtube.com/watch?v=jBog2DTuAIg

     

     

    =LET(a,$A$2:$A$21=A3,
    b,FILTER($B$2:$B$21,a),
    c,SORTBY(FILTER($C$2:$C$21,a),b,-1),
    d,INDEX(c,1),
    e,INDEX(c,2),
    IF(MAX(b)=B3,IF(AND(OR(d="Did not attend",d="Cancelled"),
    OR(e="Did not attend",e="Cancelled")),"Y","N"),""))

     

     

    Optional tip if this was helpful: [link removed by admin] 

Resources