Jul 18 2020 04:34 AM
Hi,
I'm struggling to reorganise a data table from long format to wide format. My data is organised as follows:
Value | Sample |
3456434 | 1 |
56645677 | 1 |
46356 | 1 |
24556 | 2 |
235478 | 2 |
This is how the data is organised, but for 7 samples, with hundreds of thousands of values per sample.
I'd like to organise the table like this:
Sample 1 | Sample 2 | Sample 3 | Sample 4 | Sample 5 | Sample 6 |
223456 | 466433 | 776543 | 77543 | 8653 | 7564 |
676543 | 76432 | 34567 | 33456 | 344567 | 3345 |
Can anyone advise me how to best achieve this in Excel?
Jul 18 2020 06:36 AM - edited Jul 18 2020 06:42 AM
Since you said "hundreds of thousands", I recommend you don't use formulas for this, but instead use Power Query.
Select any cell in your data and use Data>Get & Transform Data>From Table/Range
This will open the Power Query Editor.
Now, use Add Column>General>Custom Column and use this formula:
"Sample " & Number.ToText([Sample])
Call the new column ColumnHeader (not essential, you can call it whatever you want).
Now right-click the Sample column and select Remove.
Next, select the ColumnHeader column you created above and use Home>Transform>Group By and configure the dialog like this:
Now use Add Column>General>Custom Column with this formula:
Table.AddIndexColumn([GroupIndex], "Index", 1, 1)
I called this new custom column DataWithGroupIndex.
Now I have three columns:
Right-click GroupIndex and Remove that column, then click the double-arrow in the top right hand corner of the DataWithGroupIndex column to expand the data that currently says "Table".
In the expand field dialog, I've configured it like this:
I know this seems like a lot of steps, but once done, this process will be repeatable and you won't be sat waiting for hundreds of thousands of formulas across 7 samples to recalculate.
The point of the steps leading up to here was to get an index column that repeats when the column header changes, which will be important for the next step.
Now select the ColumnHeader column and use Transform>Any Column>Pivot Column and configure it like this:
After clicking OK, you'll see that the data are properly top-loaded into each Sample column.
You can right-click the Index and Remove it, then use Home>Close & Load to put the results back into the workbook.
If you want, you can just open the attached workbook, select any cell in the green table, go to the Query tab and select Edit. Then, on the Home tab of the Power Query Editor, click Advanced Editor to see the code for the whole query, which you should be able to put into your own workbook with minimal editing if you're comfortable with that.
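For readers without the attachment, the steps above can be sketched as a single query to paste into the Advanced Editor. This is a reconstruction from the written steps, not the code from the attached workbook. It assumes the source is an Excel table named "Table1" (a hypothetical name), that the Group By dialog used the "All Rows" operation to produce the GroupIndex column, that only Value and Index were expanded, and that the pivot used Value as the values column with no aggregation. The step names are my own.

```powerquery
let
    // Assumption: the source is an Excel table named "Table1" with columns Value and Sample
    Source = Excel.CurrentWorkbook(){[Name = "Table1"]}[Content],
    // Build the future column headers: "Sample 1", "Sample 2", ...
    AddedHeader = Table.AddColumn(Source, "ColumnHeader", each "Sample " & Number.ToText([Sample]), type text),
    RemovedSample = Table.RemoveColumns(AddedHeader, {"Sample"}),
    // Assumption: Group By used "All Rows", so GroupIndex holds one nested table per sample
    Grouped = Table.Group(RemovedSample, {"ColumnHeader"}, {{"GroupIndex", each _, type table}}),
    // Number the rows inside each nested table 1, 2, 3, ...
    AddedIndex = Table.AddColumn(Grouped, "DataWithGroupIndex", each Table.AddIndexColumn([GroupIndex], "Index", 1, 1)),
    RemovedGroup = Table.RemoveColumns(AddedIndex, {"GroupIndex"}),
    // Assumption: only Value and Index are expanded (ColumnHeader already exists on the outer table)
    Expanded = Table.ExpandTableColumn(RemovedGroup, "DataWithGroupIndex", {"Value", "Index"}),
    // Pivot without aggregation: each (Index, ColumnHeader) pair holds exactly one Value
    Pivoted = Table.Pivot(Expanded, List.Distinct(Expanded[ColumnHeader]), "ColumnHeader", "Value"),
    Result = Table.RemoveColumns(Pivoted, {"Index"})
in
    Result
```

The per-sample Index is what makes the pivot work without aggregation: without it, several values would compete for the same row of each Sample column.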
Jul 18 2020 10:28 AM
@GibbE155, since you already have the data in Excel, a pivot table may be a quick solution. Select the data and insert a pivot table. Add Value to the Values section and Sample to the Columns section. See attached file.
Jul 18 2020 12:50 PM
With formulas, that could be
=$C$2 & " " & TRANSPOSE(UNIQUE(C3:C7))
for headers and
=FILTER($B$3:$B$7,$C$3:$C$7=INDEX(UNIQUE($C$3:$C$7),COLUMN()-COLUMN($E$2)+1))
for the column.
With a named range, the same formulas could be
=INDEX(Range,1,2) & " " & TRANSPOSE(UNIQUE(INDEX(Range,2,2):INDEX(Range,ROWS(Range),2)))
and
=FILTER(INDEX(Range,2,1):INDEX(Range,ROWS(Range),1),
INDEX(Range,2,2):INDEX(Range,ROWS(Range),2)=
INDEX(UNIQUE(INDEX(Range,2,2):INDEX(Range,ROWS(Range),2)),COLUMN()-COLUMN($H$2)+1)
)
Formulas could be written for pre-dynamic-array Excel as well.
With Power Query, another variant could be
let
    // Read the named range "Range" from the current workbook
    Source = Excel.CurrentWorkbook(){[Name="Range"]}[Content],
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    // Collect each sample's values into a list (one list per sample)
    #"Grouped Rows" = Table.Group(#"Promoted Headers", {"Sample"}, {{"Count", each _[Value ]}}),
    // Turn each list into a column headed "Sample N"
    Custom1 = Table.FromColumns(#"Grouped Rows"[Count],
        List.Transform(#"Grouped Rows"[Sample], each "Sample " & Text.From(_)))
in
    Custom1
Jul 19 2020 09:31 AM
Hi @TheAntony, this only appears to give me a value which is the sum of all the values in the new columns, rather than listing each value individually. Would you know of a way around this?
Many thanks,
Jul 19 2020 09:33 AM
Hi @OwenPrice,
Thanks, this appears to be the best solution. How do you return the workbook to a normal worksheet after this? The Power Query Editor doesn't appear to allow independent sorting of columns.
Many thanks
Jul 19 2020 09:54 AM
Jul 19 2020 09:58 AM
Added sorting to my variant:
let
    // Read the named range "Range" from the current workbook
    Source = Excel.CurrentWorkbook(){[Name="Range"]}[Content],
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    // Collect each sample's values into a list, sorted ascending
    #"Grouped Rows" = Table.Group(#"Promoted Headers", {"Sample"},
        {{"Count", each List.Sort(_[Value ])}}),
    // Turn each sorted list into a column headed "Sample N"
    Custom1 = Table.FromColumns(#"Grouped Rows"[Count],
        List.Transform(#"Grouped Rows"[Sample], each "Sample " & Text.From(_)))
in
    Custom1
Jul 19 2020 10:50 AM - edited Jul 19 2020 10:53 AM
You can insert a step to sort the data before it's transformed to ensure it's sorted in the output.
Just sort the value column either ascending or descending in this position in the query:
The result is now:
I've attached the workbook containing the query that includes the sort step described.
In answer to your question, make sure you've used Home>Close & Load to put the results into the workbook. Then select any cell and use Table Design>Tools>Convert to Range.
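The screenshot showing where the sort step sits in the query is not reproduced here. As an alternative that does not depend on step position (an assumption on my part, not necessarily what the attached workbook does), each nested table could be sorted inside the DataWithGroupIndex custom column formula, just before its rows are numbered:

```powerquery
// Variant of the DataWithGroupIndex custom column formula:
// sort each nested table by Value, then number its rows 1, 2, 3, ...
Table.AddIndexColumn(
    Table.Sort([GroupIndex], {{"Value", Order.Ascending}}),
    "Index", 1, 1
)
```

Swap Order.Ascending for Order.Descending to reverse the order. Because the index is assigned after sorting, each Sample column should come out sorted independently of the others.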
Nov 05 2020 10:27 AM
Nov 05 2020 11:40 AM
Yes, it's better to provide a sample file, and better to start a new conversation with this question.
Apr 17 2024 11:50 PM
@OwenPrice Hey, thanks a lot. This helped me resolve a Power Query issue I was facing: I had pivoted the 'Year' column, which repeated across several row observations, and got an error about the data type being "List", which makes sense in that situation. Your approach got me past it.
Thanks and Regards.