Forum Discussion

Felicity123's avatar
Felicity123
Brass Contributor
Oct 07, 2021
Solved

Need a macro to remove duplicate header rows but leave selected rows at the top

I have a very large data output file (approx 30k rows) that contains repeated headers and blank rows.  Can anyone help with a macro that lets me select which rows to keep at the top and removes all duplicates of them?

There is an example attached, in which the first 4 rows are what I want to keep.  But I need the flexibility to be able to say just keep row 4.

 

In the actual file the data I need to keep contains duplicate data so I can't use a macro that simply deletes duplicates.  I did try to do one that selected blanks from another column and deleted the entire rows but it took the headers too.

 

I need to import the data from HTML files then activate the macro in the resulting excel file.  If there is a way to automate the entire process that would be ideal but it is the macro for the headers that is my main issue.  I am working in excel for microsoft 365 (v2108) and am very rusty when it comes to writing macros so appreciate any ideas.

Thanks! 

  • Thank you so much! This is gold!! I tested it using A1:C1 with E1:F1 as well and it works perfectly.
    This is also much, much quicker than using the 2 short macros I had been using and much more flexible being able to select the cells rather than have the value hardcoded into the macro.

    Regarding importing html files, do you have a solution to automate that? The main issue here is I need an input box where I can select the location of the file that will be converted because this will vary.
    (I realise that should probably be a separate post)

    Thanks so much for your help! I really appreciate it!

5 Replies

  •  

    Sub Macro2()
    '
    ' Macro2 Macro
    '
    ' Keyboard Shortcut: Ctrl+q
    '
        Range("G4").Select
        Selection.End(xlDown).Offset(1, 0).Select
    If Len(ActiveCell.Offset(0, -6)) = 314 Then
    Do While Len(ActiveCell.Offset(0, -6)) = 314
        Range(Selection, ActiveCell.Offset(3, 0)).EntireRow.Select
        Selection.Delete Shift:=xlUp
        Range("G4").Select
        Selection.End(xlDown).Offset(1, 0).Select
    Loop
    ElseIf Len(ActiveCell.Offset(0, -6)) <> 314 Then
        MsgBox "Finished"
    End If
    End Sub

    Felicity123 As per file sent by you. Try this on your file.

    • Felicity123's avatar
      Felicity123
      Brass Contributor
      Thank you for your reply. I see your logic but this does not delete any rows in my file - only result I am getting from it running it is the Msgbox.
    • mtarler's avatar
      mtarler
      Silver Contributor

      Felicity123  this seemed like an interesting problem/challenge but I didn't understand how/if the solution given would work for you, especially in the more generic case so I created another option for you to try.  In the attached the macro (hot key is ctrl-m) will delete any other row in the file that matches any region you select.  So if you select A1:C3 and D4:E4 (i.e. hold ctrl key when selecting additional areas) it will delete any row that has the same values in columns A:C as any of the rows 1-3 OR identical values in columns D&E as found in row 4.

      I recommend NOT selecting entire rows as it will cause it to go very slow!  only select the areas of interest.  Also (although I didn't test it) if you select A1:C1 and E1:F1 it will treat those areas separately so if the values are the same as columns A:C OR the columns in E:F are the same then that row will be deleted.

      Lastly, in the sample file you gave you will find that the 1 line of the header will not get deleted because the 2 are NOT identical (there is a difference in number of spaces I believe). 

      I hope this is useful.

      • Felicity123's avatar
        Felicity123
        Brass Contributor
        Thank you so much! This is gold!! I tested it using A1:C1 with E1:F1 as well and it works perfectly.
        This is also much, much quicker than using the 2 short macros I had been using and much more flexible being able to select the cells rather than have the value hardcoded into the macro.

        Regarding importing html files, do you have a solution to automate that? The main issue here is I need an input box where I can select the location of the file that will be converted because this will vary.
        (I realise that should probably be a separate post)

        Thanks so much for your help! I really appreciate it!

Resources