SOLVED

A simple merge function

I'd like to write a simple merge function along these lines:

First, find all identical (duplicate) entries in Column B.

Then, for all columns after Column B (in the data range), merge the values in each corresponding cell, comma-separated and without repeats. That is, if Rows X and Y have the same value in Column C, keep only one copy. If Rows X and Y have different values in Column D, keep both in one cell, separated by commas.

Here's a weak attempt at pseudocode for this:

Sub mergeAllDuplicates()
  Dim r As Long, dup As Long
  For r = 2 To 1000
    ' Scan from the bottom up so that deleting a row never shifts rows still to be checked
    For dup = 1000 To r + 1 Step -1
      If Cells(dup, 2).Value = Cells(r, 2).Value Then
        mergeDuplicate r, dup ' a merge function (below) which takes in the two row indices
      End If
    Next dup
  Next r
End Sub

Sub mergeDuplicate(r As Long, dup As Long)
  Dim col As Long
  For col = 3 To 14
    ' Keep only one copy when the two cells hold the same value
    If Cells(r, col).Value <> Cells(dup, col).Value Then
      Cells(r, col).Value = Cells(r, col).Value & ", " & Cells(dup, col).Value
    End If
  Next col
  Rows(dup).Delete
End Sub

I can imagine there are more sophisticated ways to do this without so much iteration (storing values or cell references in lists, defining certain functions and objects), but I decided this was a good, simple, pure way to start.

Could anyone comment on my function, perhaps by providing a more elegant, VBA-style way of doing it?

Thank you very much.

2 Replies
best response confirmed by jukhamil (Brass Contributor)
Solution

First, I'd ask whether you really want to change the data or just create a 'better' way to view it. I ask because people often treat Excel sheets as data collection, storage, and viewing all in one, when it is often better to keep data collection and storage separate from data viewing. That is why Excel has Pivot Tables, for example, and you might consider whether a Pivot Table view would be adequate for what you want. If not, some of the new Dynamic Array formulas could be used to generate the view you want.
That all said, if you want to change the data (i.e. use VBA), then I might suggest a slight variation on your pseudo-code:

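Sub mergeDups()
  ' Each variable needs its own "As Long"; "Dim a, b As Long" leaves a as Variant
  Dim i As Long, FoundRow As Long, C As Long, LastRow As Long, LastCol As Long
  With ActiveSheet
    ' Assumes the data starts at A1, so the UsedRange counts equal the last row and column
    LastRow = .UsedRange.Rows.Count
    LastCol = .UsedRange.Columns.Count
    ' Work from the bottom up so deleting a row does not disturb rows not yet visited
    For i = LastRow To 1 Step -1
      ' Find the first row with the same key as row i. The key is column 1 here;
      ' use 2 (and start C at 3) if the key is in column B as in the question
      For FoundRow = 1 To i
        If (.Cells(i, 1) = .Cells(FoundRow, 1)) Then Exit For
      Next FoundRow
      If (FoundRow < i) Then
        For C = 2 To LastCol
          If Len(.Cells(FoundRow, C).Value2) = 0 Then
            ' Nothing to append to, so copy the value and avoid a leading ", "
            .Cells(FoundRow, C).Value2 = .Cells(i, C).Value2
          ElseIf InStr(1, .Cells(FoundRow, C).Value2, .Cells(i, C).Value2) = 0 Then
            ' Append only if the value is not already contained in the target cell
            .Cells(FoundRow, C).Value2 = .Cells(FoundRow, C).Value2 & ", " & .Cells(i, C).Value2
          End If
        Next C
        .Cells(i, 1).EntireRow.Delete
      End If
    Next i
  End With
End Sub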
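
One thing to be aware of: the InStr check above is a substring test, so a value like "ab" counts as already present when the target cell holds "abc". If that matters for your data, here is a minimal sketch of a helper that compares whole comma-separated items instead. The name MergeUnique is hypothetical (not a built-in), and it assumes Windows Excel, since it relies on Scripting.Dictionary.

```vba
' Hypothetical helper: merge two comma-separated lists, keeping each item once.
' Items are compared whole (after trimming spaces), not as substrings.
Function MergeUnique(ByVal existing As String, ByVal newText As String) As String
    Dim seen As Object
    Dim part As Variant
    Dim item As String

    ' Late-bound Dictionary, so no library reference is needed (Windows only)
    Set seen = CreateObject("Scripting.Dictionary")

    ' Split both lists on commas and record each non-empty item the first time it appears
    For Each part In Split(existing & "," & newText, ",")
        item = Trim$(part)
        If Len(item) > 0 Then
            If Not seen.Exists(item) Then seen.Add item, Empty
        End If
    Next part

    ' The Dictionary keeps keys in insertion order, so the original order is preserved
    MergeUnique = Join(seen.Keys, ", ")
End Function
```

Inside the For C loop, the whole If block could then become a single assignment: .Cells(FoundRow, C).Value2 = MergeUnique(CStr(.Cells(FoundRow, C).Value2), CStr(.Cells(i, C).Value2)). This assumes the cells hold no error values, since CStr would fail on those.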

Thanks, I'll study your response and get back to you. This is what I was hoping for, some external perspective and new ideas. Thank you very much.