Forum Discussion
Reorganised a long format table to a wide data table
Hi,
I'm struggling to reorganise a data table from long format to a wide format. my data is organised as follows:
Value | Sample |
3456434 | 1 |
56645677 | 1 |
46356 | 1 |
24556 | 2 |
235478 | 2 |
This is how the data is organised but for 7 samples but hundreds of thousands of values per sample.
I'd like to organised the table like this:
Sample 1 | Sample 2 | Sample 3 | Sample 4 | Sample 5 | Sample 6 |
223456 | 466433 | 776543 | 77543 | 8653 | 7564 |
676543 | 76432 | 34567 | 33456 | 344567 | 3345 |
Can anyone advise me how to best achieve this in Excel?
- OwenPriceIron Contributor
Since you said "hundreds of thousands", I recommend you don't use formulas for this, but instead use Power Query.
Select any cell in your data and use Data>Get & Transform Data>From Table/Range
This will open the Power Query Editor.
Now, use Add Column>General>Custom Column and use this formula:
"Sample " & Number.ToText([Sample])
Call the new column ColumnHeader (not essential, you can call it whatever you want).
Now right-click the Sample column and select Remove.
Next, select the ColumnHeader column you created above and use Home>Transform>Group By and configure the dialog like this:
Now use Add Column>General>Custom Column with this formula:
Table.AddIndexColumn([GroupIndex], "Index", 1, 1)
I called this new custom column DataWithGroupIndex.
Now I have three columns:
Right-click GroupIndex and Remove that column, then click the double-arrow in the top right hand corner of the DataWithGroupIndex column to expand the data that currently says "Table".
In the expand field dialog, I've configured it like this:
I know this seems like a lot of steps, but once done, this process will be repeatable and you won't be sat waiting for 7 hundreds of thousands of formulas to recalculate.
The point of the steps leading up to here was to get an index column that repeats when the column header changes, which will be important for the next step.
Now select the ColumnHeader column and use Transform>Any Column>Pivot Column and configure it like this:
After clicking OK, you'll see that the data are properly top-loaded into each Sample column.
You can right-click the Index and Remove it, then use Home>Close & Load to put the results back into the workbook.
If you want, you can just open the attached workbook, select any cell in the green table, go to the Query Tab, select Edit, then on the Home Tab of the Power Query Editor, click Advanced Editor to see the code for the whole query, which you should be able to put into your own workbook with some minimal editing if you're comfortable with that.
- OwenPriceIron ContributorDo you want to sort each column with the largest value at the top?
- Greenfire555Copper ContributorHi! Thanks for taking the time to write out this great explanation. I have a similar data issue, however my values are text fields (and I have additional columns of data). Should it still be possible to use this method? When I tried to replicate I still ended up with two columns of data as I started... Thank you so much for your help!
And if you are able to assist further, I can send more information about the table. Thanks again!Yes, it's better to provide sample file and it's better to start new conversation with this question.
- Varun_NairCopper Contributor
OwenPrice Hey thanks a lot, this helped me resolve my Power Query issue that I was facing because I had Pivoted the 'Year' column which repeated for several row observations and gave the error of data type being "List" which is obvious. But this helped overcome the issue.
Thanks and Regards.
If with formulas
that could be
=$C$2 & " " & TRANSPOSE(UNIQUE(C3:C7))
for headers and
=FILTER($B$3:$B$7,$C$3:$C$7=INDEX(UNIQUE($C$3:$C$7),COLUMN()-COLUMN($E$2)+1))
for the column.
If use named range same formulas could be
=INDEX(Range,1,2) & " " & TRANSPOSE(UNIQUE(INDEX(Range,2,2):INDEX(Range,ROWS(Range),2)))
and
=FILTER(INDEX(Range,2,1):INDEX(Range,ROWS(Range),1), INDEX(Range,2,2):INDEX(Range,ROWS(Range),2)= INDEX(UNIQUE(INDEX(Range,2,2):INDEX(Range,ROWS(Range),2)),COLUMN()-COLUMN($H$2)+1) )
Formulas could be generated for pre-DA Excel as well.
If with Power Query another variant could be
let Source = Excel.CurrentWorkbook(){[Name="Range"]}[Content], #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]), #"Grouped Rows" = Table.Group(#"Promoted Headers", {"Sample"}, {{"Count", each _[Value ]}}), Custom1 = Table.FromColumns(#"Grouped Rows"[Count], List.Transform(#"Grouped Rows"[Sample], each "Sample " & Text.From(_))) in Custom1