Counting employees during each hour of the day

Occasional Reader

Hello. I am having a hard time figuring out how to use functions to count the number of staff I have working during each hour of the day. I am a beginner, but I enlisted the help of someone who has more experience and they aren't able to figure it out either.

 

I have 45 employees in a 24/7 department, so there are night shift staff that start on 1 day and finish on the next - that is throwing a huge wrench into things. 

The schedule also has 30 minute lunches built into their day, so there is a 30 minute overlap that I don't want to include those people into.

 

I do my schedule in excel per pay period as shown below. I am willing to adapt the way I currently do the schedule if needed, I just need to save myself time and prevent miscounting errors. 

NicoleI210_0-1723731361089.png

 

Any help would be appreciated.

16 Replies

@NicoleI210 

Could you attach a small sample workbook demonstrating the problem (without sensitive data), or if that is not possible, make it available through OneDrive, Google Drive, Dropbox or similar?

@HansVogelaar Hans, can you help?  so i started to try and tackle this (or at least what I think he wants/needs) in the attached file.  My function first breaks up the text to create a grid of start times for each employee and a complimentary grid of end times.  By making those times actual date-time I could address the overnight shifts.  Then I created a grid of date-times for every hour in those corresponding days (actually shifted to the 1/2 hour).  Finally I thought I could then MAP those date-times but summing up the instances they are both > start time grid and < end time grid.  but my results of that MAP are not what I expect and I think it has to do with how the LAMBDA is handling the arrays.  Hans, maybe you have an idea.

=LET(dates, B1:E1, times,B2:E20,
days,COLUMNS(dates),
timegrid, DROP(REDUCE("",SEQUENCE(days),LAMBDA(p,q,LET(
  tt, CHOOSECOLS(times,q),
  dd, INDEX(dates,q),
  sText,TEXTBEFORE(tt,"-",,,,0),sTime,LEFT(sText,2)/24+RIGHT(sText,2)/60/24,
  eText,TEXTAFTER(tt,"-",,,,0),etimeTemp,LEFT(eText,2)/24+RIGHT(eText,2)/60/24,etime,etimeTemp+((etimeTemp-sTime)<0),
  HSTACK(p,sTime+dd,etime+dd)))),,1),
startgrid,CHOOSECOLS(timegrid,SEQUENCE(days,,,2)),
endgrid, CHOOSECOLS(timegrid,SEQUENCE(days,,2,2)),
TimesInDay, SEQUENCE(24,,0)/24/60,
timeDayGrid, TimesInDay+dates+0.5/24,
oneH, SEQUENCE(1,ROWS(endgrid),1,0),
oneV, SEQUENCE(COLUMNS(endgrid),1,1,0),
counts, MAP(timeDayGrid,LAMBDA(q,MMULT(MMULT(oneH,(q>startgrid)*(q<endgrid)),oneV))),
counts2, MAKEARRAY(ROWS(timeDayGrid), COLUMNS(timeDayGrid),LAMBDA(r,c, SUM((INDEX(timeDayGrid,r,c)>startgrid)*(INDEX(timeDayGrid,r,c)<endgrid)))),
out, VSTACK(dates,startgrid,endgrid),
testout, MMULT(MMULT(oneH,(45516.5>startgrid)*(45516.5<endgrid)),oneV),
counts2)

so in the above I'm relatively confident of the first 1/2 creating the startGrid and endGrid and then creating a timeDayGrid (simple a grid of times for each day).  it is that last part I tried both MAP and MAKEARRAY and tried both SUM and MMULT inside.  The last line 17-19 are just looking at test outputs.

@m_tarler 

Hi Matt,

That formula is way above my pay grade! :cry:

Where is @PeterBartholomew1 when you need him?

@HansVogelaar as long as we are throwing out a life line, maybe we can call on a few other active contributors @SergeiBaklan , @Patrick2788 ,  @JKPieterse , @djclements . can anyone help me understand why my MAP or MAKEARRAY functions don't produce the counts for each time slot in my LAMBDA above?  thank you.

 

@HansVogelaar 

I was on grandchild-minding duty (times 2), picnicking and walking around parkland.  I will take a look after the "Financial Modelling with Modern Excel" bootcamp meeting, unless all is resolved by the time I make it!

@m_tarler 

There's two things that are holding me back for indulging in this adventure:

 

1. The data arrangement is not ideal. Why are the start/end times in the same cell?

2. What is the desired goal/return from the formula?

 

I suspect this could be made a lot simpler by arranging the data vertically in 4 columns: Employee, Start, End, Duration.  Much easier to analyze and a Gantt (Stacked column) chart could be created.

@m_tarler 

No fresh ideas, just to play with what we have

=LET(
 range, $A$1:$E$20,

 data,  DROP(range,1,1),
 dates, TAKE( DROP(range,,1), 1),
 HoursStart, SEQUENCE(24,,0,1/24),
 HoursEnd,   SEQUENCE(24,,1/24,1/24),
 Hours,      TEXT(HoursStart, "hh:mm") &
             "-" &
             TEXT(HoursEnd,"hh:mm"),

 DateTime,      LAMBDA(day,hhmm, day + REPLACE(hhmm, 3, , ":" ) ),
 TimeStart,     LAMBDA(day,times, DateTime(day, TEXTBEFORE(times, "-") ) ),
 TimeEnd,       LAMBDA(day,times, DateTime(day, TEXTAFTER(times, "-") ) ),
 ShiftHours,    LAMBDA(day,times, MOD( TimeEnd(day, times) - TimeStart(day, times), 1) ),

 DayShiftStart, LAMBDA(day,times, TOCOL( TimeStart(day, times), 3 ) ),
 DayShiftEnd,   LAMBDA(day,times, TOCOL( TimeStart(day, times) + ShiftHours(day, times), 3 ) ),

 ShiftRange, LAMBDA(fn,
    DROP( REDUCE(
        "",
        SEQUENCE(COLUMNS(dates)),
        LAMBDA(a,v,
            LET(
                col, CHOOSECOLS(range, v+1),
                VSTACK(a,  fn( TAKE(col,1), DROP(col,1) ) )
            )
        )
    ), 1)
 ),

 ShiftStart,    ShiftRange(DayShiftStart),
 ShiftEnd,      ShiftRange(DayShiftEnd),

 CountEmployees, MAKEARRAY(
    24, COLUMNS(dates),
    LAMBDA(n,m,
        SUM(
          (INDEX(dates, 1, m) + INDEX( HoursStart, n, 1) <= ShiftEnd )*
          (INDEX(dates, 1, m) + INDEX( HoursEnd, n, 1) > ShiftStart )
        )
    )
 ),

 VSTACK(
    HSTACK("Hour/Date", dates),
    HSTACK( Hours, CountEmployees )
 )
)

@NicoleI210 

I started by trying to understand the nature of the problem rather that attempting to evaluate the proposed solution in isolation.  I may have made a bit of a meal of it but conclusions such as 'it is easier to stack the weekly hours as a continuous datetime column probably hold.  I have used thunks to hold the counts for employee before aggregating but I could have nested blocks of formula instead.

=LET(
    timeLine, EXPAND(TOCOL(dates + SEQUENCE(24, , 0) / 24, TRUE), , 2, 0),
    empHrsϑ, BYROW(
        times,
        LAMBDA(emplTimes,
            LET(
                startHour, REGEXEXTRACT(emplTimes, "^\d{2}"),
                startMins, REGEXEXTRACT(emplTimes, "\d{2}(?=-)"),
                departHour, REGEXEXTRACT(emplTimes, "\d{2}(?=..$)"),
                departMins, REGEXEXTRACT(emplTimes, "\d{2}$"),
                startTime, dates + (startHour + startMins / 60) / 24,
                departTime, dates + (departHour + departMins / 60) / 24,
                finishTime, departTime + (departTime < startTime),
                emplStarts, SORT(VSTACK(timeLine, EXPAND(TOCOL(startTime, 3), , 2, 1))),
                emplFinish, SORT(VSTACK(timeLine, EXPAND(TOCOL(finishTime, 3), , 2, 1))),
                emplHours, 24 * TAKE(emplFinish - emplStarts, , 1),
                filter, VSTACK(DROP(emplFinish - emplStarts, 1, 1) <> -1, TRUE),
                TOROW(THUNK(WRAPCOLS(FILTER(emplHours, filter), 24, TRUE)))
            )
        )
    ),
    REDUCE(EXPAND(0, 24, 1, 0), empHrsϑ, LAMBDA(acc, hrsϑ, IFERROR(acc + hrsϑ(), acc)))
)

 

 

@m_tarler OK after all that, I went back and inserted the header date row and the time column on the left and discovered my problem, I did a sequence of 24 minuted instead of hours... ooof.  I fixed that and it worked fine.  My results are a little different than at least Sergei I think because I use the 1/2 hour segments so any time that starts or ends on the 1/2 hour it ignores for that hour, which I hope addresses the OP's original request.

@m_tarler Sorry, I just came across this thread today... not too sure why I didn't get a notification when you tagged me the other day. Looks like you solved it already though! 🙂

 

It's interesting to see the different approaches taken by everyone thus far. I think I would be more inclined to unpivot the data into a separate range first, then work with the new structure. There's also a number of questions I would want answered:

 

  1. What does the "3" mean in the Sunday header "S3"? Is this the week number for the current pay period?
  2. Are there separate worksheets for each pay period? Or does one worksheet contain data for the entire year?
  3. What does "PTO 0", "PTO 8" and "PTO 10" mean? Are there any more variations of this?
  4. What does "LB" mean in the last entry on 8/11 ("1500-2300 LB")?
  5. Why do two of the entries on 8/15 begin with "r/"? What does this mean, and what comes after it?
  6. How many other similar "short codes" are there, what do they all mean, and how are they used? Can they be used as either a prefix or suffix to a scheduled time entry? Do any of them also contain a hyphen (e.g. "EDO-8")?

 

I agree with @Patrick2788 that the current layout is not ideal. They appear to be tracking multiple pieces of information in a single cell, which is a common design flaw when starting with the desired output, then attempting to input data into the same structure. Data entry should be done in a separate table (or tables) with appropriate fields for each piece of information, so it can easily be summarized and output as desired (e.g. pivot tables, power query, dynamic array formulas, etc.).

 

That's just my two cents and is in no way meant to denigrate the solutions presented thus far. I quite enjoy seeing the different methods each person uses, as it's a great way to pick up new tips. Cheers!

@PeterBartholomew1 , it looks like count for the last hour in last date is not correct.

Played a bit with regex as well

=LET(
    time, SEQUENCE(24, , 0) / 24,
    datetime, dates + time,
    timeStart, REGEXREPLACE(
        times,
        "(^\d{2})(\d{2})(.*)",
        "$1:$2"
    ) * 1 + dates,
    timeEndRaw, REGEXREPLACE(
        times,
        "(.{5})(\d{2})(\d{2})",
        "$2:$3"
    ) * 1 + dates,
    timeEnd, timeEndRaw + (timeStart > timeEndRaw),
    shiftStart, TOCOL(timeStart, 3),
    shiftEnd, TOCOL(timeEnd, 3),
    count, MAP(
        datetime,
        LAMBDA(v, SUM((v >= shiftStart) * (v <= shiftEnd)))
    ),
    VSTACK(HSTACK("Time/Date", dates), HSTACK(time, count))
)
And I agree with you and Patrick on all your points, but unfortunately the OP has gone silent. My pings were really to understand where I went wrong and thought it was an issue with how I was using the LAMBDA and hitting myselfself when I realized my mistake was minutes instead of hours because those minutes cost me many hours (lol).
As for @SergeiBaklan latest post I just weep as you flex those new functions I still don't have access to. LOL. just kidding (sort of) but would totally want the OP to chime in on some of those points to figure out which issues need to be addressed in the formula vs the data entry. Either way, I always appreciate seeing your solutions and always learn something new.

@SergeiBaklan 

Yes, I should have checked the result more carefully.  I was trying to understand the structure of the problem and pretty much stopped when I got numbers!  I then side-tracked myself with playing REGEX games and working towards Lambda functions as modules, for example

Durationλ
= LAMBDA(shift,
    LET(
      ExtractTimeλ, LAMBDA(s, SUM({60,1}/60/24 * REGEXEXTRACT(s, "\d{2}", 1))),
      valid?,       REGEXTEST(shift, "\d{4}-\d{4}"),
      start,        REGEXEXTRACT(shift, "^\d{4}"),
      end,          REGEXEXTRACT(shift, "\d{4}$"),
      startTime,    MAP(start, ExtractTimeλ),
      endTime,      MAP(end,   ExtractTimeλ),
      IF(valid?, endTime - startTime + (startTime>endTime), "")
    )
  );

Too many games to play and too little time.

@PeterBartholomew1 

Yes, too many games...

@SergeiBaklan 

I have taken another crack at the problem.  My goal was to reduce the solution to the same form as I use for banded tax problems, sales commission tiers, FIFO calculations and a number of other problems that involve overlapping range calculation.  Namely, I stack all types of events in a single block and sort them into a single table for calculation.  Nowadays GROUPBY and PIVOTBY can be valuable.

 

=LET(
    timeLine,       EXPAND(TOCOL(dates + SEQUENCE(24, , 0) / 24, , TRUE), , 2, 0),
    validTimes,     REGEXEXTRACT(times, "\d{4}-\d{4}"),
    assocDates,     TOCOL(IF(ISERROR(validTimes), NA(), dates), 3, TRUE),
    clockTimes,     TOCOL(validTimes, 3, TRUE),
    startTimes,     MAP(
        TEXTBEFORE(clockTimes, "-"),
        LAMBDA(t, SUM({60, 1} * REGEXEXTRACT(t, "\d{2}", 1) / 60 / 24))
    ),
    endTimes,       MAP(
        TEXTAFTER(clockTimes, "-"),
        LAMBDA(t, SUM({60, 1} * REGEXEXTRACT(t, "\d{2}", 1) / 60 / 24))
    ),
    finishTimes,     endTimes + (startTimes > endTimes),
    startDatetimes,  EXPAND(assocDates + startTimes, , 2, 1),
    finishDatetimes, EXPAND(assocDates + finishTimes, , 2, -1),
    keyDatetimes,    VSTACK(timeLine, startDatetimes, finishDatetimes),
    events,          SORT(keyDatetimes),
    sortedDatetimes, TAKE(events, , 1),
    headcount,       SCAN(0, TAKE(events, , -1), SUM),
    timeslot,        FLOOR.MATH(sortedDatetimes, 1 / 24),
    duration,        DROP(sortedDatetimes, 1) - sortedDatetimes,
    manhours,        headcount * duration,
    day,             INT(timeslot),
    hour,            MOD(timeslot, 1),
    DROP(PIVOTBY(hour, day, 24 * manhours, SUM, , 0, , 0), , -1)
)

 

There might be a case for modularising the formula further to separate the restructuring (using REGEX) from the subsequent calculation.  The calculations for headcount and time durations could also be Lambda function.  I feel I should use more in the way of functional breakdown of problems using a tree of Lambda functions than I do.  The "1500-0330" is still a bit of an unresolved problem since it pushes 3½ hours labour outside the period of the analysis.

@PeterBartholomew1 

That's interesting approach, especially if apply it to more adequate task. This one is not well defined initially, thus results are not very practical.