Forum Discussion

GibbE155's avatar
GibbE155
Copper Contributor
Jul 18, 2020

Reorganised a long format table to a wide data table

Hi,

I'm struggling to reorganise a data table from long format to a wide format. my data is organised as follows:

Value Sample
34564341
566456771
463561
245562
2354782

This is how the data is organised but for 7 samples but hundreds of thousands of values per sample.

I'd like to organised the table like this:

Sample 1Sample 2Sample 3Sample 4Sample 5Sample 6
2234564664337765437754386537564
6765437643234567334563445673345

Can anyone advise me how to best achieve this in Excel?

  • OwenPrice's avatar
    OwenPrice
    Iron Contributor

    GibbE155 

     

    Since you said "hundreds of thousands", I recommend you don't use formulas for this, but instead use Power Query.

     

    Select any cell in your data and use Data>Get & Transform Data>From Table/Range

     

    This will open the Power Query Editor.

     

    Now, use Add Column>General>Custom Column and use this formula:

     

     

    "Sample " & Number.ToText([Sample])

     

     

    Call the new column ColumnHeader (not essential, you can call it whatever you want).

     

    Now right-click the Sample column and select Remove.

     

    Next, select the ColumnHeader column you created above and use Home>Transform>Group By and configure the dialog like this:

     

     

    Now use Add Column>General>Custom Column with this formula:

     

     

    Table.AddIndexColumn([GroupIndex], "Index", 1, 1)

     

     

    I called this new custom column DataWithGroupIndex.

     

    Now I have three columns:

     

    Right-click GroupIndex and Remove that column, then click the double-arrow in the top right hand corner of the DataWithGroupIndex column to expand the data that currently says "Table".

     

    In the expand field dialog, I've configured it like this:

     

    I know this seems like a lot of steps, but once done, this process will be repeatable and you won't be sat waiting for 7 hundreds of thousands of formulas to recalculate.

     

    The point of the steps leading up to here was to get an index column that repeats when the column header changes, which will be important for the next step.

     

    Now select the ColumnHeader column and use Transform>Any Column>Pivot Column and configure it like this:

     

     

    After clicking OK, you'll see that the data are properly top-loaded into each Sample column.

     

     

    You can right-click the Index and Remove it, then use Home>Close & Load to put the results back into the workbook.

     

    If you want, you can just open the attached workbook, select any cell in the green table, go to the Query Tab, select Edit, then on the Home Tab of the Power Query Editor, click Advanced Editor to see the code for the whole query, which you should be able to put into your own workbook with some minimal editing if you're comfortable with that.

     

     

    • GibbE155's avatar
      GibbE155
      Copper Contributor

      Hi OwenPrice , 

      Thanks, this appears to be the best solution. How do you return the workbook to a normal worksheet after this? The power query editor doesn't appear to allow independent sorting of columns.

      Many thanks  

      • OwenPrice's avatar
        OwenPrice
        Iron Contributor
        Do you want to sort each column with the largest value at the top?
    • Greenfire555's avatar
      Greenfire555
      Copper Contributor
      Hi! Thanks for taking the time to write out this great explanation. I have a similar data issue, however my values are text fields (and I have additional columns of data). Should it still be possible to use this method? When I tried to replicate I still ended up with two columns of data as I started... Thank you so much for your help!

      And if you are able to assist further, I can send more information about the table. Thanks again!
    • Varun_Nair's avatar
      Varun_Nair
      Copper Contributor

      OwenPrice Hey thanks a lot, this helped me resolve my Power Query issue that I was facing because I had Pivoted the 'Year' column which repeated for several row observations and gave the error of data type being "List" which is obvious. But this helped overcome the issue. 
      Thanks and Regards.

  • TheAntony's avatar
    TheAntony
    Iron Contributor

    GibbE155 , since you already have the data in Excel, a pivot table may be a quick solution. Select the data and Insert a Pivot Table. Add Value to the Values section and Sample to the Columns. See attached file.

     

    • GibbE155's avatar
      GibbE155
      Copper Contributor

      Hi TheAntony, this only appears to give me a value which is  the sum of all the values in the new columns, rather than listing each value individually. Would you know of a way around this?

      Many thanks, 

       

  • GibbE155 

    If with formulas

    that could be

    =$C$2 & " " & TRANSPOSE(UNIQUE(C3:C7))

    for headers and

    =FILTER($B$3:$B$7,$C$3:$C$7=INDEX(UNIQUE($C$3:$C$7),COLUMN()-COLUMN($E$2)+1))

    for the column.

    If use named range same formulas could be

    =INDEX(Range,1,2) & " " & TRANSPOSE(UNIQUE(INDEX(Range,2,2):INDEX(Range,ROWS(Range),2)))

    and

    =FILTER(INDEX(Range,2,1):INDEX(Range,ROWS(Range),1),
            INDEX(Range,2,2):INDEX(Range,ROWS(Range),2)=
            INDEX(UNIQUE(INDEX(Range,2,2):INDEX(Range,ROWS(Range),2)),COLUMN()-COLUMN($H$2)+1)
    )

    Formulas could be generated for pre-DA Excel as well.

     

    If with Power Query another variant could be

    let
        Source = Excel.CurrentWorkbook(){[Name="Range"]}[Content],
        #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        #"Grouped Rows" = Table.Group(#"Promoted Headers", {"Sample"}, {{"Count", each _[Value ]}}),
        Custom1 = Table.FromColumns(#"Grouped Rows"[Count],
            List.Transform(#"Grouped Rows"[Sample], each "Sample " & Text.From(_)))
    in
        Custom1

Resources