Forum Discussion

Antman75's avatar
Antman75
Copper Contributor
Feb 07, 2025

Help creating a matrix table

Good day wonderful people,

I am trying to create a matrix table that shows correlations between types of issues people may be worrying about. The end result is to enable me to bring it into Power BI and utilize the chord visualization diagram to show where certain issues are connected and the strength of those connections.

I have an excel sheet with a range (se below). The first column is a unique client ID. As the data may be collected over a period of time, the same client IDs may appear more than once (if they come back to the service provider). The data may include up to a maximum four issues that they are worrying about, and these are in adjacent columns to the client ID.

As an example, in the below table, homelessness has low level interconnectivity with 'Priority Debt'; 'Disabled' & 'Mental Health' as they appear as issues alongside with that string; "Mental health" has a strong connection to "Non-Priority Debt" (and of course vice versa) with it occurring each time the other is mentioned. This is what I am aiming to show.

Due to the way in which this data is laid out (from a database export), I am struggling to bring it into a matrix table (if that is the best method) and then on into Power BI. I've spent quite a bit of time in the forums, but can't find a way to apply the learning in them to the table output, or the right way to transform the data (via power query or otherwise) to help identify the interconnectivity. 

If anyone can point me in the right direction to solve this, I would be most grateful! 

Thank you in advance :-)

  • djclements's avatar
    djclements
    Bronze Contributor

    If you're interested in a dynamic array option, try the following:

    =LET(
       cId, A2:A12,
       txt, BYROW(B2:E12, LAMBDA(r, LET(
            a, TOCOL(r, 1),
            n, ROWS(a),
            v, IF(ISERROR(n), "^", IF(n = 1, a & "^", TOCOL(IFS(SEQUENCE(, n) > SEQUENCE(n), a & "^" & TOROW(a)), 2))),
            TEXTJOIN("|", 0, ROWS(v), v, "")))),
       arr, TEXTBEFORE(TEXTAFTER(txt, "|", SEQUENCE(, MAX(--TEXTBEFORE(txt, "|")))), "|"),
       vec, TOCOL(arr, 2),
       HSTACK(TOCOL(IFS(NOT(ISERROR(arr)), cId), 2), TEXTBEFORE(vec, "^"), TEXTAFTER(vec, "^"))
    )

    Please note, a, TOCOL(r, 1), assumes the blank cells shown in your screenshot are truly blank. If that's not the case with your exported data, and they actually contain zero-length strings (""), change this line to:

    a, TOCOL(IFS(r<>"", r), 2),

    The attached file also contains a couple of options for counting the occurrence of pairs afterwards using either GROUPBY or PIVOTBY...

    • Antman75's avatar
      Antman75
      Copper Contributor

      Hi Sergei.

      Thank you and apologies for my delayed reply. If I am understanding the correct method to format the range / table for Power BI to create a chord diagram, then I believe I need to create 'pairwise' relationships between the issues. If I am correct in this assumption then I think it needs to be a two stage process, firstly identifying all pairings (which I think would look something like this):

      Table A: (Possible output from original table)

      Then another step would require counting the occurrence of pairs. 

      I am not certain this is the correct method, but seems to be the manner suggested in other forums. However, how to get from the original table to Table A (where pairs are combined) and then onto a final table counting the occurrences alludes me!

      Thanks again,

       

       

      • OliverScheurich's avatar
        OliverScheurich
        Gold Contributor
        Sub matrixtable()
        
        Dim i As Long, j As Long, k As Long
        Range("F:H").Clear
        
        k = 2
        For i = 2 To Range("B" & Rows.Count).End(xlUp).Row
        j = Application.WorksheetFunction.CountA(Range(Cells(i, 2), Cells(i, 5)))
        Select Case j
        
        Case Is = 1
        Cells(k, 7).Value = Cells(i, 2).Value
        Cells(k, 6).Value = Cells(i, 1).Value
        k = k + 1
        Case Is = 2
        Cells(k, 7).Value = Cells(i, 2).Value
        Cells(k, 8).Value = Cells(i, 3).Value
        Cells(k, 6).Value = Cells(i, 1).Value
        k = k + 1
        Case Is = 3
        Cells(k, 7).Value = Cells(i, 2).Value
        Cells(k, 8).Value = Cells(i, 3).Value
        Cells(k + 1, 7).Value = Cells(i, 2).Value
        Cells(k + 1, 8).Value = Cells(i, 4).Value
        Cells(k + 2, 7).Value = Cells(i, 3).Value
        Cells(k + 2, 8).Value = Cells(i, 4).Value
        Cells(k, 6).Value = Cells(i, 1).Value
        Cells(k + 1, 6).Value = Cells(i, 1).Value
        Cells(k + 2, 6).Value = Cells(i, 1).Value
        k = k + 3
        
        Case Is = 4
        Cells(k, 7).Value = Cells(i, 2).Value
        Cells(k, 8).Value = Cells(i, 3).Value
        Cells(k + 1, 7).Value = Cells(i, 2).Value
        Cells(k + 1, 8).Value = Cells(i, 4).Value
        Cells(k + 2, 7).Value = Cells(i, 2).Value
        Cells(k + 2, 8).Value = Cells(i, 5).Value
        Cells(k + 3, 7).Value = Cells(i, 3).Value
        Cells(k + 3, 8).Value = Cells(i, 4).Value
        Cells(k + 4, 7).Value = Cells(i, 3).Value
        Cells(k + 4, 8).Value = Cells(i, 5).Value
        Cells(k + 5, 7).Value = Cells(i, 4).Value
        Cells(k + 5, 8).Value = Cells(i, 5).Value
        Range(Cells(k, 6), Cells(k + 5, 6)).Value = Cells(i, 1).Value
        
        k = k + 6
        
        End Select
        
        
        Next i
        
        End Sub

        This code returns the intended output in column F:H in my sheet. For ~1500 rows in cols A-E the code runtime was ~3 seconds in my sheet. If necessary the code can be changed and executed in the main memory which would speed up the runtime significantly. The attached sample file doesn't contain the macro because for some reason i can't attach a macro-enabled Excel file anymore.

        =SUM(
            (MMULT(--($B$2:$E$25=G3), TRANSPOSE(--(COLUMN($A$1:$D$1)>0)))>0 )*
            (MMULT(--($B$2:$E$25=H3), TRANSPOSE(--(COLUMN($A$1:$D$1)>0)))>0 )
        )

        This formula was contributed by SergeiBaklan in an earlier discussion and i adapted it for your task.

         

Resources