Apr 28 2023 09:47 AM
Hello,
I have an excel column filled with 0 and 1.
I would like to count the number of successive 1 and 0
Ex: 011001011100000
The result could be something like 2 columns, I. E the first for 0 and the second for 1.
First result column : 1 2 1 5
Second result column : 2 1 3
Thanks in advance
Apr 28 2023 10:39 AM
@Schuss27 I hope to see other improvements but here is a formula that works. I put the lambda declarations inside the LET but could easily be pulled out into its own function(s). Depends if you just need this in 1 spot or all over the worksheet.
=LET(in,A1,keys, C1:D1, Num, LEN(in), SEQ, SEQUENCE(Num), arr, MID(in,SEQ,1),
ArrayCounts, LAMBDA(array,key, REDUCE(0,array, LAMBDA(p,q, IF(CODE(q)=CODE(key), IFERROR(VSTACK(DROP(p,-1),TAKE(p,-1)+1),p+1),IF(INDEX(TAKE(p,-1),1),VSTACK(p,0),p))))),
out,DROP(REDUCE("",keys,LAMBDA(p,q, HSTACK(p,ArrayCounts(arr,q)))),,1),
IF(ISERROR(1/out),"",out))
Apr 28 2023 12:31 PM - edited Apr 30 2023 06:48 PM
@Schuss27 it is not clear the input format (an array or just a single cell) and the output. If the input is in column-array, you can try the following:
=LET(A,A1:A15, incl, 0,excl,1, SPLIT,LAMBDA(arr,incl,excl,
BYROW(1*TEXTSPLIT(TEXTJOIN(",",,A),",",excl,1),
LAMBDA(x,COUNT(x)))), VSTACK({"Zeros","Ones"},
IFERROR(HSTACK(SPLIT(A,incl, excl),SPLIT(A,excl,incl)),"")))
it relays in a user LAMBDA function SPLIT, which does the magic. Basically it accommodates the input in a way we can build a 2D-array. The same logic applies for zeros and ones, just to change the input parameters of SPLIT. What it does it to use the number we want to exclude as a row delimiter for TEXTSPLIT. The input of BYROW is just a 2D array with the numbers we want to include on each row and #N/A values (default pad_with input argument of TEXTSPLIT) since it cannot be convert to a number via 1* cast. Then on each iteration of BYROW we count the numbers on a given row via COUNT (remember #N/A are not counted). Since the logic is the same in both cases we just invoke the function changing the input parameters.
Here is the output:
If the input is a string in a single cell, the formula can be adjusted as follows:
=LET(x,A1, incl,0,excl,1, SPLIT,LAMBDA(x,incl,excl,
BYROW(1*TEXTSPLIT(SUBSTITUTE(x,incl,incl&","),",",excl,1),
LAMBDA(y,COUNT(y)))), VSTACK({"Zeros","Ones"},
IFERROR(HSTACK(SPLIT(x,incl, excl),SPLIT(x,excl,incl)),"")))
There are other ways to do it, but it will probably result in a larger formula. This is a practical one. For example using SCAN to count repeated items within each group, identify start and end of each group and calculate maximum count for each start/end interval:
=LET(A,A1, incl1, 0,incl2,1, CALC, LAMBDA(x,y, LET(n, ROWS(A), seq, SEQUENCE(n),
cnts,SCAN(0,A, LAMBDA(ac,x, IF(x=y, ac+1,0))),
start, FILTER(seq, 1-ISNA(XMATCH(cnts,1)),1),
end, IF(ROWS(start)=1,n, VSTACK(DROP(start,1)-1, n)),
MAP(SEQUENCE(ROWS(start)), LAMBDA(i, LET(s,INDEX(start,i), e,INDEX(end,i),
MAX(FILTER(cnts, (seq >=s) * (seq<=e) ))))))),
IFERROR(VSTACK({"Zeros","Ones"}, HSTACK(CALC(A,incl1), CALC(A,incl2))),""))
Considering @mtarler simplification from TEXTSPLIT approach, the formula using a single cell as input can be simplified as follows:
=LET(x,G1, excl1,1, excl2,0, SPLIT,LAMBDA(x,y,LEN(TEXTSPLIT(x,,y,1))),
VSTACK({"Zeros","Ones"},
IFERROR(HSTACK(SPLIT(x,excl1),SPLIT(x,excl2)),"")))
For an input array, the previous formula can be used, just using TEXTJOIN to convert the input arrray into a string as follows:
=LET(x,TEXTJOIN("",,A1:A15),excl1,1, excl2,0,
SPLIT,LAMBDA(x,y,LEN(TEXTSPLIT(x,,y,1))),
VSTACK({"Zeros","Ones"},
IFERROR(HSTACK(SPLIT(x,excl1),SPLIT(x,excl2)),"")))
I hope it helps,
David
Apr 28 2023 01:36 PM
@davidleal oh yea, using textsplit is a good idea. I did it different than you and allow more flexible options and error checking (in case of inputs like 111111 or other characters added like 12111)
=LET(in,A2,keys, C1:D1,
arrayCounts, LAMBDA(txt,key,LET(otherTxt,SUBSTITUTE(txt,key,""),nonkeys,IFERROR(UNIQUE(MID(otherTxt,SEQUENCE(LEN(otherTxt)),1)),CHAR(CODE(key)+1)), out, IFERROR(LEN(TEXTSPLIT(in,nonkeys,,1)),""),TRANSPOSE(out))),
out,DROP(REDUCE("",keys,LAMBDA(p,q, HSTACK(p,arrayCounts(in,q)))),,1),
IF(ISERROR(1/out),"",out))
but that said to give @Schuss27 a very very simple way it could be:
=LEN(TEXTSPLIT(A1,,"1",1))
The above will create a column for ZEROs so replace the "1" with "0" to get ONEs
Apr 28 2023 02:44 PM
Apr 28 2023 03:41 PM
Apr 28 2023 11:02 PM - edited Apr 28 2023 11:05 PM
Looking at the replies thus far, for all their ingenuity I can't help but think that sometimes the new dynamic array formulas entice us so much as to make us lose sight of simpler set-ups. What's more, iterating functions such as SCAN, BYROW and REDUCE can be expensive, and text-splitting and text-combining operations seem largely inappropriate for a column of numeric ones and zeroes.
Given that the OP is in any case seeking a fixed number of columns in the output (one for ones and one for zeroes), I see no great need for a single spill formula which fills that static number of columns. I would prefer simply:
=LET(
ζ, A1:A15,
ξ, FREQUENCY(IF(ζ = 0, ROW(ζ)), IF(ζ <> 0, ROW(ζ))),
FILTER(ξ, ξ)
)
which can then be duplicated in the next column, changing the 0 to a 1.
Regards
Apr 29 2023 03:42 AM
As someone that has moved 100% to the new Excel, I can still see some sense in what you say. I tend to look at Excel as two computing environments that share a common function library. I am eager to dispense with everything that characterises the legacy spreadsheet because I never liked it anyway. The question of interest is then one of whether the new methods have matured sufficiently to offer a complete replacement for tried-and-tested wisdom of the ages.
My first approach was to use SCAN as a simple accumulator
= LET(
offset, DROP(VSTACK(-1,data),-1),
start?, data<>offset,
end?, DROP(VSTACK(start?, TRUE),1),
accumulation, SCAN(0, start?, LAMBDA(a,b, IF(b=TRUE, 1, a+1))),
frequencies, FILTER(accumulation, end?),
frequencies
)
The next exploits the FREQUENCY function, though not in exactly the same manner as you.
= LET(
index, SEQUENCE(15),
offset, DROP(VSTACK(data,-1),1),
end?, data<>offset,
bins, FILTER(index, end?),
freq, FREQUENCY(index, bins),
DROP(freq, -1)
)
I will have to consider your use of ξ, η, ζ as variable names, especially since ξ1, η1, ζ1 are valid names that haven't been squandered to support the ludicrous A1 direct referencing notation!
Apr 29 2023 04:09 AM - edited Apr 29 2023 04:49 AM
Ah, so you've decided to combine the two required outputs into a single column, in which the rows represent alternating frequency counts for ones and zeroes corresponding to the order in which those values occur in data?
I like it, especially since you could then, if desired, wrap that in WRAPROWS to achieve the desired two-column output.
Regards
Apr 29 2023 06:42 AM
Agreed; I should have included WRAPROWS in the final statement
WRAPROWS(DROP(freq,-1), 2, "")
Apr 30 2023 04:11 AM
By sql:
cli_add_php~~$GLOBALS['f']=function($origValues){
preg_match_all('{0+}',$origValues,$arrZero);
$arrZeroCount=array_map(function($v){
return strlen($v);
},$arrZero[0]);
$zeroCount=implode(' ',$arrZeroCount);
preg_match_all('{1+}',$origValues,$arrZero);
$arrZeroCount=array_map(function($v){
return strlen($v);
},$arrZero[0]);
$zeroCount.='</td><td>'.implode(' ',$arrZeroCount);
return $zeroCount;
};
~;
select *,udf_run_php(F_A,'$a=$GLOBALS["f"]($origValues);') `0</th><th>1` from Occurrence_in_an_excel_serie;