Forum Discussion
Is Excel capable of this operation?
Thank you for taking the time to upload the sample workbook and notes.
I have some questions. Beginning at Row 50, there's 37 consecutive data points where White is 6 and Orange is 0. I entered "Label?" in column C because this seems to meet the criteria of 24+ rows (or 6 minutes) of stability. Should these be marked with labels?
This might be simplifying things, but sometimes with a complex task it's best to start small and scale it to add nuance and complexity as needed. One way to measure stability is obtaining a rolling Standard Deviation. For the sake of this example, we'll say Standard Deviation must be 0 for there to be Stability.
This function obtains a rolling Standard Deviation:
=SCAN(, White, LAMBDA(a, v, LET(r, ROW(v) - 1, STDEV(TAKE(White, r)))))The same function can be used on the Orange array, too.
This snip illustrates the stability early on in the sample (0 and 0 for those last two columns):
The question I have is: when the stretch of 4-0 ends, would you like to discard the previous numbers for White and Orange and start over in the calculations?
Patrick2788 Wow! I am reeling from your insight! Amazing! "Label ?" question is best answered as "values for Column A are below threshold limit", so no analysis (or label) is required. Values don't meet criteria. Answer for 2nd question regarding 4-0 is YES, deviation should reset to new plateau values, so 5% deviation at plateaus around Column A data of 30, should be re-calc'd when new plateau is around Column A of 50. Rounding up value of deviation is alao acceptable, cuz we only use integers. I hope these answers help to clarify. I am travelling from late day 8Aug until eve of 11 Aug. My delayed responses are not due to lack of interest, only to connectivity and eyesight issues. Best regards
PS: Your skills and knowledge are impressive! Good for you!
- mtarlerAug 07, 2023Silver ContributorIf 6 is below threshold then why are rows 2-34 given Label A since those are only 4 and then 98 to 273 are given Label B also with White value of 6. And then Label E is rows 432 - 479 where White starts at 51 and max of 56 and then very next value is also 51 but not included. So it seems you can have a ramp (sliding threshold?) AND 51 to 56 may/could be 53+/-5% but then that would be 50.35 (round to 50) and 55.65 (round to 56) but then the 50 just before the first 51 isn't included either. and then Label F and G and L and Q and R and T and U are each 0 in column 1 (certainly below threshold). and then Label J does NOT include a 43 just before it starts but DOES include a 40 at the end right after a 43. label K is too short. Label P is also only 23 long but at least that is really close to 24.
So back to my points above. You mention an 'attached graphic' but I can't find it and that may help explain a lot of this. You mention thresholds, minimum deviations, averaging, etc... but I don't see those values anywhere except the 5% deviation you mention. Patrick suggested using StDev instead of a 'flat' deviation and you seem to like that idea? I suspect some of these labels maybe to point out things like areas not to count or plateaus for Orange and not White and hence I suggested the 2 separate columns.
I'd love to sink my teeth into this but feel like I'm grabbing blind from the refrigerator drawer and just as likely to grab an onion as a I am to grab an apple.- asezasAug 08, 2023Copper Contributor
mtarler FIRST, I like your humour! 2nd, you are correct in thinking your are missing MUCH by not having the explanatory graphic! How can I get it to you? Patrick opened it OK, and it was definitely uploaded as a PNG. I will shortly try to post as a PDF. Following the notice that the "Labels" are identified in the graphic using alpha characters, they serve only as landmarks, to identify a particular location on the graphic; their inclusion in the xls was to help connect data values to a location on the graphic. They have no connection to CASE occurences. Without the graphic, explanation is difficult. However, I can say that all values below 23 in Column A are exempt from analysis, and I can also say that Label location F (Column B) is a CASE for Label location E (Column A); Label O is a CASE for Label N; Label T is a CASE for Label R; and, Label W is a CASE for Label U. Until we can get you a copy of the graphic I worry that you are "blindly grabbing in a dark fridge", if you'll pardon the expression! Ha! I hope this helps to clarify matters. - Best regards
- Patrick2788Aug 08, 2023Silver Contributor
The approach I'm taking is to identify the plateaus in the data where White and Orange are stable for at least 24 rows. Essentially, it's identifying the rows that are part of a 'streak'. This draft does not allow for any wiggle room. For White/Orange to be stable, there must be no change for 24+ rows.
=LET( Change, LAMBDA(arr, N(DROP(arr, 1) = DROP(arr, -1))), Stable, DELTA(Change(White), Change(Orange)), Seq, SEQUENCE(ROWS(Stable)), NonStable, 1 + FILTER(Seq, Stable = 0), Streak, NonStable - VSTACK(0, DROP(NonStable, -1)), Seq_2, SEQUENCE(ROWS(White), , 2), Plateau, LAMBDA(e, LET( x, XLOOKUP(e, NonStable, Streak, , 1), IFNA(IF(x >= 24, "Plateau", ""), "") ) ), MAP(Seq_2, Plateau) )- asezasAug 12, 2023Copper ContributorThis is intended for Patrick2788: Trying to grok the fishing pole analogy, what I seek thru my query is "all of", or "the big" picture. Please let me take a step back from the granularity and scratchiness, and try to explain it this way: Column A is an "event". Let's call it a sound, like a bang or a clap of thunder, or a snap of fingers. The louder the sound, the higher the value, and the longer the sound duration, the more data points get lit up. Column B, is a response to that sound event, by persons. The response (of course) always happens after the event, but sometimes the frequency of the event is too high or too low and is therefore inaudible for a person, and sometimes the response is small because the perceived volume of the sound was small. Sometimes, the response is long because event was long. In the end, one thing I'd like to determine is where the threshold of detection is for the person giving the response ... but right now I'm happy just MANUALLY plugging in a number for that threshold value. Similarly, I'm also happy manually plugging in a number that can be used to determine how many sounds the person heard that were longer than x data points. Visually, it's a relatively simple task to look at the graphic of the plotted data, and tell whether there appears to be a correlation between the sound intensity and the response. It's less easy to determine how long it takes to get a response for a sound of a certain volume, or if sounds of higher volume require less duration to get a response. But for now, if I can plug a value into a formula for the Threshold value (nothing below that needs to be analyzed), and/or plug in a value that says that there has to be a specific number or correct/ positive responses to N number of sounds, I will be a happy camper. I offer this reply to all of you following this problem.