Transform data to fix duplicate rows into columns

Copper Contributor

I have a spreadsheet that is an output from a separate system, which does not allow me to configure the output in any way. I need to modify the output in Excel or a related tool (macro, BI, etc.)

 

The problem with the data is that every new value in later columns, results in a whole new row with a number of duplicates for the earlier column values. Specifically:JeffWow_0-1692974511438.jpeg

 

Every time a new C or D record is created, the report adds whole new rows duplicating A1 and B1 and B2 to match.

 

I want to take C1/2/3 and D1/2/3 and split them out into separate columns of their own, so that all of the above info boils down to ONE row:

 JeffWow_1-1692974511438.jpeg

 

There are already hundreds of populated rows (and dozens of columns following the above pattern), and I will be outputting a new, updated spreadsheet from the external system every two weeks which will result in updates to the original data plus additional new rows of data. Manually fixing this will not be an option.

 

Is there any way to do some sort of quick, automatic operations to transform this data? Thanks!

4 Replies

@JeffWow 

I think we have to cycle through all columns (from column A to the last filled column) and extract the unique values and return them in one row.

Sub transform_data()

Dim i, j, k, l, m As Long
Dim var As Variant

Range("J:J").Clear

j = Cells(1, Columns.Count).End(xlToLeft).Column
k = Range("A" & Rows.Count).End(xlUp).Row
m = 1

For i = 1 To j
For l = 1 To k

var = Application.Match(Cells(l, i).Value, Columns(10), 0)

If IsError(var) Then
Cells(m, 10).Value = Cells(l, i).Value
m = m + 1

Else

End If

Next l
Next i

End Sub

This macro returns the intended result in column J in my understanding. For better visibility the result is returned within a column but this can be changed to a row.

 

In this example i've created records E and F and added new rows duplicating the values A1, B1, B2, C1, C2, C3, D1, D2 and D3.

transform data.png

 

Do you work with Office 365 or Excel for the web? If so a solution with LAMBDA is possible.

 

@JeffWow 

=DROP(TOROW(REDUCE("",SEQUENCE(COLUMNS(A1:F1)),LAMBDA(x,y,VSTACK(x,UNIQUE(CHOOSECOLS(A1:F24,y)))))),,1)

 

You can try this formula if you work with Office 365 or Excel for the web.

lambda.png

 

@JeffWow 

I may need to see a larger sample, but for what you've provided, perhaps this:

 

=UNIQUE(TOROW(data,,1),1)

 

 

Patrick2788_0-1692985800310.png

 

Hi All,

 

Thanks for these ideas! However, they seem to only be sorting all data into one row. That's probably my fault for not providing a more robust example.

 

I've attached some dummy data here. This is a spreadsheet that I need to sort hundreds of rows, giving me usable columns.