A LAMBDA Exercise

Valued Contributor

For this exercise I stripped down a calendar template and put in some sample data.  The goal is to obtain gross pay for July for 3 employees.

 

The data arrangement:

Patrick2788_0-1656185077295.png

I believe there are several solutions and I welcome them all.  My first take was using REDUCE but I thought I could be more efficient with another approach. My solution is included in the workbook.  I may revisit the REDUCE solution after I've put some more thought into it.

 

Formulas only please (No fill handle). No VBA.

I'll mention a few who people might be interested in this exercise. All are welcome.

@Peter Bartholomew @mtarler @L z. @Quadruple_Pawn 

 

 

20 Replies

@Patrick2788 

As variant

weekdayMultiplier=
LAMBDA( weekday, CHOOSE(weekday, 1.5, 1, 1, 1, 1, 1, 1.5) );

weekdayHours= 
LAMBDA( vector, weekday,
    SCAN(0, vector,
        LAMBDA(a,v,
            IF( v= 4, 16, // with simplified holidays
            IF( v = "", 0,
            IF( ISTEXT(v), a, 8*weekdayMultiplier(weekday)) ) ) ) )
);
weekdayPersonHours=
LAMBDA( vector, weekday, name,
    SUM( weekdayHours(vector, weekday)*(vector=name) ));

monthPersonHours=
LAMBDA( month, name,
        REDUCE(0, SEQUENCE(7), LAMBDA( a,v,
            a + weekdayPersonHours( CHOOSECOLS( month,v), v, name) ))
);

GrossPay=
LAMBDA( month, names, rates,
    SCAN(0, names, LAMBDA(a,v, monthPersonHours(month, v) ) )*
    rates
);

@Patrick2788 Maybe I'm missing some of the challenge but I didn't use Lambda:

=(1.5*COUNTIFS(july,staff)-0.5*COUNTIFS(WkDays,staff)+COUNTIFS(holiday,staff))*8*rate

I added 2 Names: WkDays = the range of the month M-F and holiday = July 4

 

@Patrick2788 

I was undecided whether to delicately unwrap the calendar format into meaningful arrays or to simply hit it with a blunt instrument, COUNTIFS in the present case.  I came to the conclusion that the latter was sufficient, provided I was allowed to mark the holiday as a named range.

Worksheet formula
= rate * MAP(staff, Hoursλ);

Hoursλ = LAMBDA(s,
    8 * SUM(COUNTIFS(holiday, s), dayRate * BYCOL(july, LAMBDA(day, COUNTIFS(day, s))))
);

dayRate = {1.5, 1, 1, 1, 1, 1, 1.5};

One thing that I would observe about Patrick's original solution, is that 'hours()' is a Lambda function and therefore will accept parameters directly from the helper functions without the use of an additional Lambda to pass parameters.  I only realised that once I had reduced my own formula by naming its Lambda function to be "Hoursλ".

 

@Patrick2788 

I think I made this exercise too easy.  @mtarler struck it with a blunt instrument (COUNTIFS) and got the job done.  @Peter Bartholomew - I thought I made this one a pain with the holiday pay but you seem to have resolved this one easily working COUNTIFS into your LAMBDA.

I believe @Sergei Baklan has a solution that can be scaled to include 3D referencing.

 

I had to do it. I've updated the workbook, so it now includes 3 months.  I've made some design changes: the number for the day of the week is now a true date and a list of holidays is included in the summary sheet.

@Patrick2788 

Yes, there are minor adjustments in previous solution to work with few months. The only, I have a bit different result with you. Calculated weekend hours for Bev

image.png

Didn't dig what and where exactly is wrong.

@Sergei Baklan 

Thanks for catching that one.

I feel silly:

(I've replaced the workbook previously uploaded)

Patrick2788_0-1656346821056.png

 

@Patrick2788 

Now we are in sync

@Patrick2788  without those new beta functions it seems harder (at least to me) so I added the sheet names and made a 2 level BYROWS() call:

=BYROW(staff,LAMBDA(e,
   SUM(BYROW(SheetNames, LAMBDA(mm, 
      LET(sm, INDIRECT(mm&"!a3:g20"), 
         swd, INDIRECT(mm&"!b3:f20"),
         smb, INDIRECT(mm&"!a4:g21"),
         smc,INDIRECT(mm&"!a5:g22"),
         1.5*COUNTIF(sm,e)-0.5*COUNTIF(swd,e)
            +SUM(COUNTIFS(sm,holidays,smb,e)+COUNTIFS(sm,holidays,smc,e)) 
         )
      )))
   ))*8*rate

It's certainly a good solution with Beta functions not available. Use of INDIRECT is not ideal but the workbook is not large enough where there might be a calculation crunch.

@Patrick2788  I agree using INDIRECT() is not ideal and I tried to avoid it but couldn't find anything that I could get to work on the 3d (i.e. multi-sheet) reference range.  I have used CONCAT/TEXTJOIN type of functions on 3d but for example using TEXTJOIN with "," delaminated created a string >1500 long so the 'traditional' trick of replacing with spaces and then using MID() to then break it up errors out due to the length.  That said I came up with a cute variant that seems to work:

=LAMBDA(ref,
    LET(
        mm, "," & TEXTJOIN(",", FALSE, ref) & ",",
        ll, LEN(mm),
        cc, ll - LEN(SUBSTITUTE(mm, ",", "")) - 1,
        s, SEQUENCE(cc),
        sp, REPT(" ", ll),
        dd, TRIM(
            MID(
                SUBSTITUTE(
                    LEFT(SUBSTITUTE(mm, ",", sp, s + 1), ll),
                    ",",
                    sp,
                    s
                ),
                ll,
                ll
            )
        ),
        dd
    )
)

 I then wrapped that with a calculation:

=LET(mm, ToCol(AllMonthsRef),
     holidayloc,XMATCH(holidays,--mm,0,1),
     BYROW(staff,LAMBDA(n,LET(
             ws,SEQUENCE(ROWS(mm)/7,1,1,7),
             alldays,SUM(--(mm=n)),
             wkends,0.5*SUM(--(INDEX(mm,ws)=n),--(INDEX(mm,ws+6)=n)),
             holly,SUM(--(INDEX(mm,holidayloc+7)=n),--(INDEX(mm,holidayloc+14)=n)),
             alldays+wkends+holly)))
     *8*rate)

@Patrick2788 

This is not an independent solution since I started of by examining @Sergei Baklan's solution and only then started to play.  The initial manipulation of the 3D Range is truly amazing, I am only beginning to get to grips with some of the new array shaping functions.

 

I liked Sergei's use of nested Lambda functions; I am moving towards using such functions wherever possible rather than wherever necessary to improve the structure and self-documenting character of solutions, but I have not got there yet!  I did think the use of SCAN to multiply the hourly rates was something of overkill though.

 

My formulas were

 

 

Worksheet formula
= rate * MAP(staff, BillableHoursλ);

BillableHoursλ=
LAMBDA(name,
    LET(
        stack,      SORT(WRAPROWS(TOCOL(calendar, , 1), 3)),
        dates,      TAKE(stack,,1),
        holiday?,   COUNTIFS(holidays,dates),
        weekend?,   WEEKDAY(dates,2)>5,
        time,       1 + 0.5*weekend? + holiday?,
        staff,      TAKE(stack,,-2),
        onshift?,   BYROW(staff,LAMBDA(s,OR(s=name))),
        SUM( IF(onshift?, 8*time) )
    )
);

 

 

@Patrick2788 

I have used your challenge as a further exercise.  I note that the formula that Sergei introduced to restructure the 3D Range, that contains the calendar, is recalculated in its entirety once for each staff member.  Provided the calculation is fast that is not a problem, but out of interest I set out to limit the calculation step to a single occurrence.  That meant I could not use a defined Name or a Thunk to pass the restructured array 'stack'.

 

I could create a monolithic block of code and place the calculation of 'stack' before the MAP that determines the eligible hours for each staff member.  To move away from such monolithic code, I wanted to use a further nested Lambda function.  This could either be defined within the outer Lambda function by using LET, in which case it would be within scope of the formula-local name, or 'stack' itself could be passed as a parameter.  I implemented each, but chose the second option.

GrossPayλ 
= LAMBDA(calendar, names, rate,
    LET(
        stack, SORT(WRAPROWS(TOCOL(calendar, , 1), 3)),
        rate * MAP(staff, EquivalentHoursλ(stack))
    )
);

EquivalentHoursλ 
= LAMBDA(stack,
    LAMBDA(name,
        LET(
            dates,    TAKE(stack, , 1),
            holiday?, COUNTIFS(holidays, dates),
            weekend?, WEEKDAY(dates, 2) > 5,
            time,     1 + 0.5 * weekend? + holiday?,
            assigned, TAKE(stack, , -2),
            onshift?, BYROW(assigned, LAMBDA(s, OR(s = name))),
            SUM(IF(onshift?, 8 * time))
        )
    )
);

In order to pass the additional parameter into the MAP construct I needed to define the function 'EquivalentHoursλ' in a Curried form, passing one parameter at a time by using multiple nested LAMBDAs.

 

I am sorry if this reads like complete gobbledegook; I have tried my best to express the ideas clearly, but I am not convinced I have succeeded!  

The more I study stacking (and unstacking), the more I believe there's a need for a function capable of determining the position of an element in an array (Not a range but an array). The other day someone had requested a formula to 'un-stack' a range. Worked on it bit (Explored pulling row and column with MAP from a mirrored range) but it kept coming back to finding the position within in the array. I saw L Z's tidy solution and moved on.
https://techcommunity.microsoft.com/t5/excel/how-to-build-a-destacked-table/m-p/3558426

ROW and COLUMN do not work on arrays and there's only so much XMATCH can do as arrays get larger and it won't calculate. I believe SCAN can sometimes get me the element position but it seems like a lot of heavy lifting.

Maybe the function would be called ELEMENT and would return position number within a vertical or horizontal array,

@Patrick2788 

Hi Patrick

I am not so sure.  If anything, there is less of a need, given the new array shaping functions, for formulas relating indices in order to generate array transformations.  I followed your link and posted a formula solution.  It required 200ms to generate a 50,000 x 12 array of results.

 

The question of locating an element of an array is a little unusual to my mind.  More often it is a case of mapping an entire array to a new layout, though one could conceivably play 'find the "X"'.  One thing I have done to get a handle on some of the new array shaping functions is to apply them to a data set but, in parallel, to apply the same formula to an equivalent index array. 

 

Then again, maybe this isn't what you are trying to achieve!!

image.png

 

I'd have to step you through my thinking in re: De-stacking.

Essentially, I was working on a solution where I had created an array and then needed to check each element in the array to determine if it would be a 1 or 0. I was using MAP with a dummy range of identical dimensions to obtain Row and Column. The problem was getting the array to know when to stop filling in 1s.

For example. Item 1 shows 3. By the time I get to the 4th position in the array, I need to tell it to stop filling in 1s even though it's still concerned with Item 1.

I hope that makes sense. I may revisit de-stacking with a clearer head.
I think what you are getting at could be explained using an n-point weighted average filter (similar to a recent post I answered). So think of a set of data going up and down but with a lot of noise. You can take the average of the points around each spot as a way to smooth the curve. Let's say you want a 5 point average so starting with data point 3 you average data points 1,2,3,4,5 and get a new value for that location. Then at point 4 you average 2,3,4,5,6 for a new point there and so on. You can also add a weight factor like 0.1, 0.2, 0.4, 0.2, 0.1 so that the new replacement is weighted more by the center points and less by the outside points. So basically as you SCAN or MAP through you want to access 'near-by' points to use in the calculation of the 'present' point (doesn't have to be average, it could be making a replica of that minefield game that inserts a number based on the # of bomb in adjacent cells).
So we CAN use INDEX if we know we want a 3-point average but and n goes up the number of INDEX needed goes up 2x or a recursive call might be possible but either way it is bulky, complicated and inefficient and a native capability could be much better.
That all said, if I totally got your intentions (@Patrick2788) wrong I apologize and ignore this.

@Patrick2788 

I am not sure I fully understand the scenario and, it may well be that I am simply agreeing with Matt.  That said, I see that helper functions like MAP and SCAN may perform calculations element by element without there being an obvious means of addressing adjacent elements of the array.  In such a situation, my suggestion would be to scan an index array rather than directly scanning the target array.  Elements of the target array could then be returned by use of the INDEX function, e.g.

= MAP(indexArray, 
    LAMBDA(k, 
      AVERAGE(
        INDEX( targetArray, k+{0,1,2} )
      )
    )
  )

Am I on the right track?

 

@Peter Bartholomew 

Yes, I think you and @mtarler have it correct.  I have a way (SCAN with XLOOKUP using SEQUENCE for lookup array and return array) to reset sequential numbering (e.g 1,2,3,4,5,6,1,2,3, etc) but I don't believe it can be scaled for large data sets.

 

I still need to study your workbook from the destacking thread.  When I'm at my desk at work I don't have access to an Insider build so I'm missing out during typical working hours!

Thank you for sharing. The way you approach this task is fascinating. The way I was taught to write code in VBA is avoid lengthy modules and split subroutines in smaller tasks and simply call them in another macro. Your LAMBDAs are like a quality Dutch coffee maker. If one part needs to be removed or replaced, it's easy to do so. Additionally, I do like the parameter names like 'vector' instead of single 'x' or 'y', for example.