Duplicate Data

Copper Contributor

I have a spreadsheet with 4,000 lines. When I am extracting the data from the original source, its in a 1 to 1 relationship. How do I get it from a 1 to 1 relationship to 1 to many relationship in excel. Or is there a way to just remove the duplicates in column A without losing the data in the same row in column B & C. Now I could be saying all of the previous statement wrong. I created a sample spreadsheet with what the data looks like and how I want it to look. Columns A,B,C is the data in its current step. I want the data to look like columns J,K,L. Any ideas?

5 Replies

Hi Ryan,

 

You may add helper column D with formula

=IF(COUNTIF($A$1:A1,A1)>1,"",A1)

drag it down till end of the range.

Filter column D and select only blank cells. On filtered range select column A except header row and press Delete. Remove the filter and helper column if not needed.

 

Attached.

Great, this is exactly what I was looking for. Is it possible to do this is power BI before I load the formula into excel?

Ryan, do you mean to do that transformation with Power Query into separate table?

If you publish that workbook on Power BI services you may do what you want with it at any time.

I ran into some issues with the data. What changes would I need to make to the formula if items owned is not consistent among owners? I have corrected my sample data and uploaded it again.

Hi Ryan,

 

You may do transformation in Power Query (aka Get & Transform) like this

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    ChangeType = Table.TransformColumnTypes(Source,{
        {"Owner of Product", type text},
        {"Itmes Owned", type text},
        {"Purchase Date", type date}}),
    AddIndex = Table.AddIndexColumn(ChangeType, "Index", 0, 1),
    CleanRepeatedOwners = Table.AddColumn(AddIndex, "CleanOwner", each
        if [Index] = 0 then [Owner of Product]
        else
            if AddIndex{[Index]-1}[Owner of Product] = [Owner of Product] then null
            else [Owner of Product]),
    RemoveInitialOwnersColumn = Table.RemoveColumns(CleanRepeatedOwners,{"Owner of Product"}),
    RenameNewColumnAsOld = Table.RenameColumns(RemoveInitialOwnersColumn,{{"CleanOwner", "Owner of Product"}}),
    KeepColumns = Table.SelectColumns(RenameNewColumnAsOld,{"Owner of Product", "Itmes Owned", "Purchase Date" })
in
    KeepColumns

and load result back into the Excel sheet.

Attached.