Forum Discussion
Working with Arrays of Ranges
OK, so strictly speaking an array of ranges is an error in Excel right from the outset. However, I set up an array of tables contained within the named Range 'allocatedRange', here shown with a red outline.
I then set up a function that would return the range corresponding to a country
SelectDataλ
= LAMBDA(country,
LET(
countries, FILTER(TAKE(allocatedRange, , 1), markers = "Country:"),
recordNum, FILTER(SEQUENCE(ROWS(allocatedRange)), markers = "Country:"),
recordEnd, VSTACK(DROP(recordNum - 1, 1), ROWS(allocatedRange)),
rangeRows, recordEnd - recordNum,
countryArrϑ, MAP(
recordNum,
rangeRows,
LAMBDA(start, rows, LAMBDA(TRIMRANGE(TAKE(DROP(allocatedRange, start), rows))))
),
XLOOKUP(country, countries, countryArrϑ)()
)
)
The start and row count of each table is determined by using the string "Country:" as a marker and differencing row numbers. A combination of TRIMRANGE/TAKE/DROP picks out each range in turn and assembles them into an array by turning the range references into thunks (TYPE=128). The function SelectDataλ is used to look up any specific country and return the corresponding range.
To demonstrate that the function indeed returns ranges, the COUNTIFS function is used to count the number of cities within each country that have a population exceeding 100 000.
= LET(
countries, FILTER(TAKE(allocatedRange,,1), markers="Country:"),
countLarge, MAP(countries,
LAMBDA(country,
COUNTIFS(TAKE(SelectDataλ(country),,-1), ">1000000")
)
),
HSTACK(countries, countLarge)
)
The point of this post is to introduce the idea of treating tables as elements of an array, so allowing further tables to be inserted or removed dynamically. TRIMRANGE ensures that each range is sized correctly and MAP is used to analyse each table in turn. Whilst Excel throws hissy fits at being asked to store arrays of arrays, arrays of ranges, or nested arrays, it is perfectly happy to store arrays of functions. When evaluated, each element of the function array is perfectly free to return an array or even a range.
The effect is to permit Excel to process 'arrays of Tables' faultlessly.
16 Replies
- PeterBartholomew1Silver Contributor
The ideas proposed in this discussion appear to be working well. The data dictionary allows arrays of ranges to be stored (as would the multi-area range) but the ranges may be mixed with calculated arrays. It is possible to return a data dictionary from a LAMBDA function as a single array object.
A specific use examined in Dynamic Let Function | Microsoft Community Hub allows the contents of multiple LET variables to be returned and scrolled through as and aid to debugging.
The format introduced in this discussion allows an in-memory data dictionary to be written to a worksheet to persist the data beyond the LAMBDA/LET environment in which it was created. Functions are provided to enable any other Lambda function to reload the dictionary from the worksheet if needed.
Since the data dictionary contains only references to data in memory and not the actual data arrays it should be possible to replicate it or pass it to other functions with relatively low overhead (this is speculation not, as yet, backed by evidence). Due to immutability, it is not possible to append a data item to a dictionary, but multiple dictionaries may be stacked to form a new, more comprehensive, dictionary.
The image shows the result of simply returning a dictionary (here formed from a set of LET variables); the result of creating a persistent copy using a function;
= DB.PERSISTλ(TESTλ(a,b))
and a single item selected from the re-read dictionary by a spinner setting (this final object is included merely to demonstrate that the data has been re-read from the worksheet).
- lori_mIron Contributor
Interesting scenario and method.
I played around a little looking to see if it was possible to avoid the use of thunks by combining references into arrays and had partial success. If the names of the ranges are Table1,..,Table4 we can define a multi-area range: Tables=Table1,Table2,Table3,Table4 and try the following formula to convert this to an array of ranges.
=SUMIF(LAMBDA(T,MAP(SEQUENCE(AREAS(T)),LAMBDA(a,DROP(INDEX(T,,,a),{0}))))(Tables),"<>")
Strangely, this appears to work fine in Monthly Enterprise Channel but not Beta Channel.
An alternative that was more successful was to build a multi-area reference via REDUCE as shown
- SergeiBaklanDiamond Contributor
Nice idea with areas. In general we may avoid explicit markers and don't care about tables location. If only to keep existing layout for each table.
=LET( Tables, (Table1,Table2,Table3,Table4), n, SEQUENCE( AREAS(Tables) ), bar, 1000000, city, LAMBDA(i, OFFSET( TAKE(INDEX(Tables,,,i),1,1),-2, ) ), qty, LAMBDA(i, COUNTIF( DROP(INDEX(Tables,,,i), ,1), ">" & bar ) ), VSTACK( {"Countries","Cities > 1m"}, HSTACK( MAP( n, city ), MAP( n, qty ) ) ) )
and have not thousands of tables to care about performance.
- lori_mIron Contributor
Related to the DICTIONARYλ suggestion, I wonder if it might be worth defining a nestable dictionary such as,
DICT = LAMBDA(array, LAMBDA([key], IF(ISOMITTED(key), array, XLOOKUP(key, VALUETOTEXT(TAKE(array, , 1)), TAKE(array, , -1)) ) ) );
which, like in python, returns a dictionary from a two column array. e.g. one could try
=countries()
=countries("United Kingdom")()
=countries("France")("Paris")where
countries =DICT( WRAPROWS(VSTACK( "United Kingdom", DICT(Table1), "Germany", DICT(Table2), "France", DICT(Table3), "Italy", DICT(Table4) ),2) );
Though I'd still tend to side with the earlier PIVOTBY suggestion.
- PeterBartholomew1Silver Contributor
Hi Lori
I think my experimentation may have more significance in terms of my working than I realised!
The pairing of a text string and a thunk allows me to build a data dictionary that provides a mechanism to pass complex data structures between Lambda function (in particular to return multiple data objects from a sub-function).
So far I have written a DICTIONARYλ constructor function that builds the data structure
/* NAME: DICTIONARYλ REVISIONS: Date Developer Description Dec 19 2024 Peter Bartholomew Original Development */ DICTIONARYλ = LAMBDA( _key01, object01, [_key02], [object02], ... // Check inputs - Omitted required arguments Help?, OR( ISOMITTED(_key01), ISOMITTED(object01), ), NULLOMMITTED, LAMBDA(x, IF(ISOMITTED(x), "", x)), _key02, NULLOMMITTED(_key02), ... // Procedure // Build array of unique keys uniqueKeys, VSTACK(_key01, _key02,_...), // Build array of thunked arrays (also applicable to scalars) objectArray, VSTACK(LAMBDA(object01), LAMBDA(object02), ... ), // Combine unique keys with references to data objects result, FILTER(HSTACK(uniqueKeys, objectArray), uniqueKeys<>""), CHOOSE(Help? + 1, Result, Help) ) );
as well as generic and dictionary-specific functions to return data from dictionaries
GETλ = LAMBDA(dict, LAMBDA(key, LET( uniqueKeys, TAKE(dict,,1), objects, DROP(dict,,1), result, XLOOKUP(key, uniqueKeys, objects, "keyword not found"), (@result)() ) )); GETSERIESDATAλ = GETλ(seriesDataDictionary) ;
and the ability to return a list of keys
KEYSλ = LAMBDA(dict, TAKE(dict,,1));
Within a functional programming environment, each dictionary is write once, read many since all objects are immutable, but I should be able to combine two directories to give a fresh data object.
That might sound extravagant in terms of memory commitment, but since I am only storing thunks (references to memory allocated to a LET variable) the burden might be modest.
The Excel failure to return arrays of arrays makes it a little difficult to dump an image of the dictionary to the worksheet for checking or to provide persistence but I think my MAPλ function should work.
I have addressed these thoughts to you, as a Python programmer, in the hope that you might have additional insight that you could draw upon to educate me🙂! Of course, others are welcome to contribute.
- lori_mIron Contributor
Hi Peter - I think a natural way to define a data dictionary in Excel is as an (n x 2)-array consisting of key/value pairs. Given definitions as in the earlier attachment, a data dictionary could be defined pairwise,
dic = WRAPROWS(VSTACK( "United Kingdom", LAMBDA(Table1), "Germany", LAMBDA(Table2), "France", LAMBDA(Table3), "Italy", LAMBDA(Table4) ),2)
or alternatively as a pair of lists,
dic = LET( keys, FILTER(TAKE(allocatedRange,,1),ISTEXT(markers)), values, StructureDataλ(markers, allocatedRange), HSTACK(keys, values) )
In either case TAKE(dic,,1) returns the keys and (@VLOOKUP("United Kingdom",dic,2,0))() returns the values corresponding to the key. This appears much like the GETλ / KEYSλ functions in your comment. However, I wouldn't advise creating DICTIONARYλ as I don't know of any way to specify an extensible list of arguments within a lambda definition - the first method above seems like a reasonable compromise in my view.
For comparison the equivalent definitions in Python would be,
dic={"United Kingdom":Table1,"Germany":Table2,"France":Table3,"Italy":Table4}
or otherwise utilising tuples,
dic = dict(zip( ("United Kingdom","Germany","France","Italy"), (Table1,Table2,Table3,Table4) ))
dic.keys() and dic.get("United Kingdom") return keys / values. A custom data dictionary could be defined by passing arguments using unpacking operators (*args / **kwargs) - something Excel currently lacks
- djclementsBronze Contributor
Good day Peter,
While I cannot advocate for this type of table/data structure, I do appreciate the TRIMRANGE demonstration as it is still currently unavailable to me on the business channel. The concept of working with an array of ranges is also very useful and is something I would highly recommend to anyone looking to up their game.
There was a SUMIFS riddle posted on another forum a few weeks ago, and this concept worked beautifully for solving it. Even after broadcasting a vector of thunked ranges across multiple rows/columns, they still returned a range object when recalled later. I've attached a copy of the file if anyone's interested, which includes the link to my post explaining how it works. Incidentally, IFNA can also be used as a broadcasting tool instead of IF({1},...).
I tried to play around with your scenario a bit, but without TRIMRANGE at my disposal, I was forced to rewrite it using other methods.
Kind regards.
- Patrick2788Silver Contributor
The way I see the problem in the attached workbook is that it can be solved with PIVOTBY but there are two obstacles:
- The data is already pivoted
- The row labels needed to be swapped
For both obstacles I have generalized Lambda functions: UnPivotMλ and FindReplaceλ. The solution is then:
=LET( row_labels, FindReplaceλ(lstFruits, includes, lstBundles), flat, UnPivotMλ(row_labels, dates, data), r, TAKE(flat, , 1), c, CHOOSECOLS(flat, 2), v, TAKE(flat, , -1), PIVOTBY(r, c, v, SUM, , , , 0) )
UnPivotMλ uses shaping functions and sequencing while FindReplaceλ is essentially a recursive REGEXREPLACE. I'd drop the code here, but I don't mean to derail Peter's post!
- SergeiBaklanDiamond Contributor
Off topic. Not to forget OfficeScript tried to repeat that particular sample on it
function main(workbook: ExcelScript.Workbook) { const tables = workbook.getTables() const target = workbook.getActiveWorksheet().getRange("L5:M5") const bar = 1000000 let i = 0 target.setValues([["Countries", "Cities > 1m"]]) for (let table of tables) { target .getOffsetRange(++i, 0) .setValues( [[ table.getRange() .getCell(-1, 0) .getValue() .toString(), table.getRangeBetweenHeaderAndTotal() .getLastColumn() .getValues() .filter(x => x[0] > bar).length ]] ) } }
- PeterBartholomew1Silver Contributor
It would appear to do the job!
The Excel world is getting more confusing. There always were those for whom the Excel grid was merely a space from which values could be read into VBA for processing and returned for display. Then we were told that was to end, and everything would be OfficeScript. Yet I see less focus on OfficeScript than on Python or PowerQuery M. DAX also appears to be a somewhat niche artform.
I really should broaden my skillset but at the moment it is an interesting challenge to see whether the Excel formula language can do every calculation that can be performed in Turing machines. Python has NumPY and SciPY to support calculation but, for Excel formulas, one has to write one's own library of algorithms. Whether one regards that as service to enhance Excel, or merely a waste of time replicating code that already exists elsewhere, is a matter for debate 🤔
- Patrick2788Silver Contributor
This appears to be a more flexible XLOOKUP/Thunk demo. I recall a workbook from a few years ago where XLOOKUP was being used to unpack thunks and return tables. The catch was the tables had to be VSTACK'd and thunked - every one of them. It made for an exciting demo but would be difficult to find a practical use for the technique. Nevertheless, I studied it and made my own workbook until I could internalize it.
A few observations:
- I think what is fascinating with this workbook is how you've managed to avoid converting ranges to arrays. TAKE/DROP are interesting because the result of each function is accepted by the range-only functions. The part I wouldn't have guessed would hold up is when countryArrϑ is unpacked with XLOOKUP then later used in COUNTIFS!
- There's no declared Thunk function in your module but I see where countryArrϑ is being thunked. Very subtle and smooth.
- I messed around a bit with trying SelectDataλ with 3D references but received errors because I used VSTACK on my references and made arrays out of ranges. Could this function handle 3 references?
- PeterBartholomew1Silver Contributor
In a sense, the array of 2D ranges down the sheet is a representation of a 3D range. If we turn attention to 3D ranges of the form
range3D = Sheet1:Sheet3!$K$3:$K$20
I suspect that life is going to get a whole lot trickier.
It should be possible to return an array of thunked arrays, but that would not allow the use of TRIMRANGE or any of the xIFS functions.
= LET( TRIMARRλ, LAMBDA(column, LAMBDA(FILTER(column, NOT(ISBLANK(column)),""))), stacked, HSTACK(range3D), thunkArr, TOCOL(BYCOL(stacked, TRIMARRλ)), MAP(thunkArr, LAMBDA(ϑ, AVERAGE(ϑ()))) )
would apply the AVERAGE function to each sheet individually. Another option might be to use INDIRECT which can return ranges. For example
=INDEX( MAP( TOCOL(Sheet1:Sheet3!$A$1) & "!K3:K20", LAMBDA(string, LAMBDA(TRIMRANGE(INDIRECT(string))) ) ), 2, 1 )( )
returns the trimmed range on sheet 2.
Note: The range Sheet1:Sheet3!$A$1 contains the sheet names
= TEXTAFTER(CELL("filename",A1), "]")
Nothing too elegant but, as yet, we are simple sounding out the limits of what is possible, though I concede that may be limited by my lack of understanding.
- PeterBartholomew1Silver Contributor
I have updates the file to show the thunk array as #CALC! errors when output to the worksheet and also evaluated to show their content (Edit does not seem to be an option on the original post).