Excel - Sorting hierarchical numbers issue

SergeiBaklan
Diamond Contributor
Aug 01, 2024
PeterBartholomew1

I don't think AI could help if only

- the case is quite simple, and/or

- you have good enough knowledge of RegEx

First, AI answers are not reliable. It could easy recommend Word or Google Sheet settings to apply in Excel, etc.

Second, RegEx behaviour depends on flavour. What works in Perl or JavaScript or something else could not work in Excel.
- m_tarler
  Bronze Contributor
  Aug 01, 2024
  funny enough, just a few days ago I just wrote a Lambda function for this. It is more generic and flexible. It does NOT currently account for alternative hierachy indicators (only uses ".") but does account for multiple levels (e.g. 12.5.6) and lets you set the "pad" character (e.g. 12.005.006 or 12.__5.__6 or 12. 5. 6) and the output uses the pad character so the hierarchy info (where the "." is) is maintained:
  
  /**convert typical outline numbering (1.1, 1.2, … 1.9, 1.10,…,1.99, 1.100) to proper sortable numbering (1.001, 1.002,… 1.009,1.01, … 1.099, 1.1) */ OutlineNumbers = LAMBDA(in, [padCharacter], LET( pad, if(ISOMITTED(padCharacter),"_",padCharacter), levels, MAX(LEN(in)-LEN(SUBSTITUTE(in,".",""))), prePad0, LAMBDA(list, LET(l,MAX(LEN(list)), RIGHT(REPT(pad,l)&list,l))), REDUCE(IFERROR(TEXTBEFORE(in,".",1),in),SEQUENCE(levels),LAMBDA(p,q, p&"."&prePad0(IFERROR(TEXTAFTER(IFERROR(TEXTBEFORE(in,".",q+1),in),".",q),pad)))) ) );
  
  OK now I have question for all of you. My original default was to use "-" but in a sample test case this is how Excel sorts these samples with "-" mixed in:
  
  everything makes sense until the last 2 where the 1.--9 and 1.-ab is AFTER 1.100 while I would expect sorting to be as shown in the 2nd,3rd, and 4th columns where "_", "0", and " " were used.
  
  EDIT ... NVM I found my answer. Excel apparently completely ignores "-" during a sort operation and therefor sorted it as if it isn't there. Notice in the next image how the SORT for both with and without "-" come out the same:
  
  outline_sorting.xlsx14 KB
  - SergeiBaklan
    Diamond Contributor
    Aug 01, 2024
    m_tarler
    
    That's not only during the sorting, dash is ignored when two values are compared, i.e. ="10" < "-9" or
    
    ="ab" < "-z" both return TRUE.
    
    However, ="99" = "-99" returns FALSE but ="99" = CHAR(173) & "99" returns TRUE.
    
    In general there are 2808 unichars which work the same way as soften hyphen, we may list them as
    
    =LET( codesRaw, UNICHAR( SEQUENCE(999999,,1 ) ), codes, FILTER( codesRaw, NOT( ISNA(codesRaw) ) ), test, codes & "abc" = "abc", chars, FILTER( codes, test ), result, HSTACK( chars, UNICODE(chars) ), result )

Forum Discussion

Excel - Sorting hierarchical numbers issue