
jukhamil
Brass Contributor
Sep 24, 2021
Solved

A simple merge function

I'd like to write a simple merge function along these lines:

 

First, find all identical (duplicate) entries in Column B.

 

Then, for all columns after column B (in the data range):

 

Merge the values in each corresponding cell, comma separated, uniquely. That is, if Row X and Y have the same value in Column C, keep only one copy. If Row X and Y have a different value in Column D, keep both in one cell, but separated by commas.

 

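Here's a weak attempt at pseudocode for this:

Sub mergeRows()
  Dim row As Long, duplicate As Long
  For row = 2 To 1000
    ' Scan upward from the bottom so that deleting a row never shifts a row not yet checked
    For duplicate = 1000 To row + 1 Step -1
      If Cells(duplicate, 2).Value = Cells(row, 2).Value Then
        mergeDuplicate row, duplicate ' a merge function (below) which takes in the two row indices
      End If
    Next duplicate
  Next row
End Sub

Sub mergeDuplicate(row As Long, duplicate As Long)
  Dim column As Long
  For column = 3 To 14
    ' Append the duplicate's value only when it differs from the value already kept
    If Cells(row, column).Value <> Cells(duplicate, column).Value Then
      Cells(row, column).Value = Cells(row, column).Value & ", " & Cells(duplicate, column).Value
    End If
  Next column
  Rows(duplicate).Delete
End Sub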

 

I can imagine there are more sophisticated ways to do this without so much iteration (storing values or cell references in lists, defining certain functions and objects), but I decided this was a good, simple, pure way to start.

 

Could anyone comment on my function, perhaps by providing a more elegant, VBA-style way of doing it?

 

Thank you very much.

2 Replies

  • mtarler
    Silver Contributor

    First, I'd ask whether you really want to change the data or just create a 'better' way to view it. I ask because people often use Excel sheets for data collection, storage, and viewing all at once, when it is usually better to keep data collection and storage separate from data viewing. That is why Excel has Pivot Tables, for example. You might consider whether a Pivot Table view would be adequate for what you want. If not, some of the new Dynamic Array formulas could be used to generate the view you want.
    That said, if you do want to change the data (i.e. use VBA), then I might suggest a slight variation on your pseudo-code:

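    Sub mergeDups()
      ' Declare each variable As Long; "Dim a, b As Long" would leave a as a Variant
      Dim i As Long, FoundRow As Long, C As Long, LastRow As Long, LastCol As Long
      With ActiveSheet
        ' Assumes the data starts at cell A1, so the used-range counts equal the last row and column
        LastRow = .UsedRange.Rows.Count
        LastCol = .UsedRange.Columns.Count
        ' Walk bottom-up so that deleting a row does not disturb rows still to be visited
        For i = LastRow To 1 Step -1
          ' Find the first row whose key matches row i. The key is column A here;
          ' for a key in column B, compare column 2 and start C at 3 instead.
          For FoundRow = 1 To i
            If (.Cells(i, 1) = .Cells(FoundRow, 1)) Then Exit For
          Next FoundRow
          If (FoundRow < i) Then
            ' Row i is a duplicate: append each value not already present, then delete the row
            For C = 2 To LastCol
              If InStr(1, .Cells(FoundRow, C).Value2, .Cells(i, C).Value2) = 0 Then
                .Cells(FoundRow, C).Value2 = .Cells(FoundRow, C).Value2 & ", " & .Cells(i, C).Value2
              End If
            Next C
            .Cells(i, 1).EntireRow.Delete
          End If
        Next i
      End With
    End Sub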
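
    One caveat with the InStr test above: it is a substring search, so a value such as "Ann" would be treated as already present in a cell containing "Joanna", and appending to an empty cell would leave a leading ", ". A minimal sketch of an exact-match alternative follows. The helper name AppendUnique is hypothetical (it is not from the thread), and it assumes the merged cells only ever use ", " as the separator.

    ```vba
    ' Hypothetical helper (not part of the original reply): append newValue to a
    ' ", "-separated list only if no item in the list matches it exactly.
    Function AppendUnique(ByVal existing As String, ByVal newValue As String) As String
        Dim item As Variant

        ' Nothing to add: return the list unchanged
        If Len(newValue) = 0 Then
            AppendUnique = existing
            Exit Function
        End If

        ' Empty list: the new value becomes the whole list (no leading comma)
        If Len(existing) = 0 Then
            AppendUnique = newValue
            Exit Function
        End If

        ' Compare against each item exactly instead of searching for a substring
        For Each item In Split(existing, ", ")
            If item = newValue Then
                AppendUnique = existing
                Exit Function
            End If
        Next item

        AppendUnique = existing & ", " & newValue
    End Function
    ```

    Under those assumptions, the If InStr(...) block inside the For C loop could be replaced by a single assignment: .Cells(FoundRow, C).Value2 = AppendUnique(CStr(.Cells(FoundRow, C).Value2), CStr(.Cells(i, C).Value2)). Note that CStr raises a type-mismatch error on cells holding error values such as #N/A, so those would need separate handling.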
    • jukhamil
      Brass Contributor
      Thanks, I'll study your response and get back to you. This is what I was hoping for, some external perspective and new ideas. Thank you very much.
