Searching a complex set of data

Copper Contributor

I have an Excel File generated from a debase report. Each "Record" in the database fills 6 rows by 15 columns. I want to sort the "records" by cell Row3 Column N. Any idea how to do that?

10 Replies

@Greg_Taylor 

 

= LET(
    blockHeight,    6,
    blockWidth,    15,
    keyRow,         3,
    keyColumn,     14,
    singleRows,    WRAPROWS(TOCOL(data), blockHeight*blockWidth),
    sortedRecords, SORT(singleRows, (keyRow-1)*blockWidth+keyColumn),
    displayed,     WRAPROWS(TOCOL(sortedRecords), blockWidth),
    displayed
  )

 

Another strategy might be required if you have more than 10,000 records

@Greg_Taylor 

... or a more 'nuts and bolts' approach:

= LET(
    blockHeight,  6,
    blockWidth,  15,
    blockCount,   4,
    keyRow,       3,
    keyColumn,   14,
    keyRows,     SEQUENCE(blockCount, 1, keyRow, blockHeight),
    keyValue,    INDEX(data, keyRows, keyColumn),
    order,       SORTBY(SEQUENCE(blockCount), keyValue),
    rowNumber,   TOCOL(blockHeight*(order-1) + SEQUENCE(1,blockHeight)),
    CHOOSEROWS(data, rowNumber)
  )
Thank you SOOO much Mr. Bartholomew. There are about 6000 records.
This is very encouraging; however, I am really rudimentary with Excel, so I don't know where to put this string :upside_down_face:.... Do I just put this in the first cell of the file?
I'll experiment in the meantime. Is it possible to give me more "Idiot's Guide" directions? Thank you again

@Greg_Taylor 

The key feature of the formulas is that they are written for Excel 365.  Array formulas are used so that the entire table of data is returned in the required sort order from a single cell and spills out down and to the right.

 

This is very different from traditional spreadsheet methods in which a single cell is calculated using scalar arithmetic and the formula is copied down and to the right using relative referencing.

 

image.pngimage.png

This final version uses a Lambda function which collects the references together as a parameter string and means you do not have to look at the formulas, just the results.

= BlockSortλ(data, 6, 3, 14)

= LAMBDA(data, blockHeight, keyRow, keyColumn,
    LET(
      blockCount,  QUOTIENT(ROWS(data), blockHeight),
      keyRows,     SEQUENCE(blockCount, 1, keyRow, blockHeight),
      keyValue,    INDEX(data, keyRows, keyColumn),
      blockedKey,  TOCOL(keyValue + SEQUENCE(1, blockHeight, 0, 0)),
      SORTBY(data, blockedKey)
    )
  )

You may observe that the whole formula looks more like a code module than a traditional spreadsheet formula and this version is a function stored by name in Name Manager.

 

If you cannot access 365, similar results may be available to you using PowerQuery.

Hi @Greg_Taylor 

 

Attached is a Get & Transform aka Power Query option:

 

// Table
let
    Source = Excel.CurrentWorkbook(){[Name="Table"]}[Content],
    TableSplit = Table.Split(Source, 6),
    ToTable = Table.FromList(TableSplit,
        Splitter.SplitByNothing(), {"RECORD"}, null, ExtraValues.Error
    ),
    SortBy = Table.AddColumn(ToTable, "SortBy", each
        Table.First( Table.Skip([RECORD], 2) )[N]
    ),
    SortedRows = Table.Sort(SortBy, {{"SortBy", Order.Ascending}}),
    RemovedSortBy = Table.RemoveColumns(SortedRows, {"SortBy"}),
    CombinedRecords = Table.Combine(RemovedSortBy[RECORD])
in
    CombinedRecords

 

 

- Put your data in Table sheet 'Input'

- Switch to sheet 'Output'

- Right-click in a cell below A-O > Refresh

@L z. 

Interesting formula!  I don't think I could have created that.  Are steps such as 

Table.Split(Source, 6)

available from the UI or do you have to know the commands to enter them?  At least reading your query caused me to dip into the joys of 'selection' and 'projection' once more with a little more sense of purpose.

 

One thing that occurs to me is that, if Excel 365 is now Turing complete, I should be able to replicate the action of any PQ step using a suitable defined Lambda function.   In this case, a list of tables could be replicated as an array of thunks.  Unfortunately replicating the functionality of Table.Combine, though possible, is something of a nightmare.   The sooner Excel allows lists of arrays the better!

 

Talking of nightmares, we should persuade @Greg_Taylor to post a sample of his workbook here.  The data blocks appear to serve the role of a sequence of forms, separated by blank rows and any field can display across several blank columns before the next set of values are encountered.  The set task was to sort the 'forms' but I was sorely tempted to normalise the data format instead.

Thank you Peter and L z....  I'm only an egg with Excel, so I Private Messaged Peter to be less exposed in my ignorance, but am happy to move back to a more public area, and to further the education of those who can follow, with my DEMO file (once I figure out how to do that....).
 I am, as I said, very appreciative of all help.  I have to admit 2 things though: re: Peter's solution, that I am still confused about how to use these wonderful tools you have giving me: it seems to be an elegant scalpel but I seem to be only equipped to use it as a hammer.... it is good to feel humble, I guess.... and second, I can't count well, as the data file looks different to me know, that it has "7" rows (plus a following hidden row, just to throw a curve in.). So I am reluctant to try L z's (much more fit to me, cause I THINK it is meant to be a Copy and Paste solution, much more in keeping with my skill set....). Anyway, here is the DEMO file...

BTW, I am happy if the output is in a single row , especially if the columns are labeled. Even if they are not, it will be easy, I think, to figure the column names out "manually."

@Greg_Taylor 

This version has a copy of the table with single row records and my attempt at implementing @L z. 's Power Query solution.

@Peter Bartholomew 

 

Table.Split(Source, 6)

can't be done with the UI => Advanced Editor