SOLVED

New Contributor

# Combine data from multiple rows into a single row

My apologies for the long post... I've been wrestling with this for awhile and am at wit's end...

I have a table of data that contains the results of chemical analyses of water samples. There is a column with a single unique ID for each sample, and over 90 columns with the various chemical parameters (i.e., variables) and the associated data. The problem is that the data are scattered across multiple rows for each sample/analysis. Here's an example:

This is what I have:

 SampleID Row# Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10 4660905_1 0 -14.8 2574 4660905_1 1 6.73 30.1 <2 3291 4660905_1 2 123 122 4660905_1 3 123 122 4661103_2 0 110 2235 4661103_2 1 6.94 -214 27.6 114 <1 3213 4661103_2 2 114

I need to consolidate the samples into this:

 SampleID Row# Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10 4660905_1 6.73 -14.8 30.1 123 122 <2 2574 3291 4661103_1 6.94 -214 27.6 110 114 <1 2235 3213

Here are the complicating issues:

• There is no rhyme or reason to the distribution of data - it's not like, for example, row 1 always has Var2, 3, 4, 7,; row 2 always has Var1, 5, 6, etc.; so I can't assume that each sample will have a consistent distribution of data
• Some of the variables have the same data in multiple rows (e.g., Var6 and Var7 in the example above)
• Some of the values are not considered numbers by Excel (see Var8 above)
• The overall data table has thousands of samples; hundreds of them have multiple rows and the number of multiple rows ranges from 2-5 rows per sample. I can do this manually, but it will take at least an entire day to get through everything.

Does anyone know of a simple way to do this? Pivot tables won't work because of the non-numeric entries. I've looked into the Power Query tool, but can't figure out how to do this with so many variable columns. Any help would be greatly appreciated.

-Matt

5 Replies
best response confirmed by Dr. U (New Contributor)
Solution

# Re: Combine data from multiple rows into a single row

Hi Matt,

Attached is the sample with Power Query. That's done not optimal way to avoid coding and do all from UI, just to demonstrate the approach.

In brief, we unpivot other than SampleID columns, remove duplicates (as Var6, Var7), in second query Columns generate list of columns to be sure empty ones will be included (Var4, Var5), append that query to first one and pivot on attribute finally. Not sure if you need empty Row# column in final result, could be added.

# Re: Combine data from multiple rows into a single row

Sergei,

Thanks so much for your response. I think I understand what you did there, but I'm not sure if I can re-created it with my main spreadsheet of data. I'll give it a try, and I may ask for more guidance if you're willing to help.

- Matt

# Re: Combine data from multiple rows into a single row

Matt,

Sure, will be glad to help. The only do not promise immediate answers.

# Re: Combine data from multiple rows into a single row

Hi Sergei,
I have a PQ merge that is spitting our double rows (62 rather than 31) because the source files include all columns the same but some cells of data reflect "---".

Would this method be the most efficient to use on larger datasets? My file has several columns.

# Re: Combine data from multiple rows into a single row

Hi, - yes, it shall work